NetApp did confirm that NDMP went from V3 to V4 from 6.1.x to 6.2.x. I tried forcing the Filer back to NDMP V3. It raced the processors up to 100% and stayed there. I had fewer backups running as well. I can't explain that behavior other than something in how DOT talks to NDMP must have changed. Have you tried this and what were your experiences?
Thank you for your information.
-----Original Message-----
From: Stephane Bentebba [mailto:stephane.bentebba@fps.fr]
Sent: Wednesday, February 12, 2003 7:35 AM
To: Jay Newton (Email)
Cc: 'toasters@mathworks.com'
Subject: Re: NDMP issues with 6.2.x of DataONTAP
Jay Newton (Email) wrote:
We are experiencing issues with NDMP backups. I'm hoping other people in this group have seen it as well and may have suggestions for us to try. Here's an explanation of what has happened.
On 12-13-2002 we upgraded from 6.1.3r2 to 6.2.1r2 to fix issues with Autosupports not being sent in the event of hardware failure. A couple of things went sour after the upgrade. Our SnapManager for Exchange performance went down more than 30%, NDMP backups would run slow intermittently, and general Filer performance got worse. NetApp discovered a memory leak in 6.2.1r2 in the NDMP daemon about the same time we upgraded to 6.2.1r2. We had to upgrade to 6.2.2d8 to fix that issue. However, a backup or two will invariably run slower than the rest. If the backup is restarted, it usually picks up speed and runs normally. By slow I mean 10 gigs in 10 hours on a DLT8000 drive. I used to get 20-30 gig per hour before we upgraded to 6.2.1r2. This can happen with each of the 4 DLT8000 drives attached to the Filer meaning that I can't pin the problem to a bad piece of hardware.
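To put the figures above side by side, here is a minimal sketch of the arithmetic. It assumes "gig" means GiB (1024 MiB); with decimal units the results shift by a few percent. The function and variable names are illustrative only.

```python
# Compare the slow backup rate with the pre-upgrade rate quoted above.
# Assumption: 1 "gig" = 1 GiB = 1024 MiB.
MIB_PER_GIB = 1024
SECONDS_PER_HOUR = 3600


def gib_per_hour(gib, hours):
    """Average throughput in GiB per hour."""
    return gib / hours


def mib_per_second(rate_gib_per_hour):
    """Convert a GiB/h rate to MiB/s."""
    return rate_gib_per_hour * MIB_PER_GIB / SECONDS_PER_HOUR


slow = gib_per_hour(10, 10)            # the slow case: 10 gigs in 10 hours
normal_low, normal_high = 20.0, 30.0   # pre-upgrade range in GiB/h

print(f"slow backup:  {slow:.1f} GiB/h = {mib_per_second(slow):.2f} MiB/s")
print(f"pre-upgrade:  {mib_per_second(normal_low):.2f} to "
      f"{mib_per_second(normal_high):.2f} MiB/s")
print(f"slowdown:     {normal_low / slow:.0f}x to {normal_high / slow:.0f}x")
```

In other words, the slow backups run at about 1 gig per hour, roughly 20 to 30 times slower than the rate seen before the upgrade.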
Today we have discovered that snapshots for backup are not deleting correctly. The NDMPD process was holding the snapshot hostage. A volume ran out of space due to this issue. We were able to kill the NDMPD sessions that were holding the snapshot open and the snapshot deleted normally.
Has anyone else experienced similar issues?
For those of you running SnapManager for Exchange, how big are your databases and how long does it take to verify them? Also, those who run multiple backups, what is your CPU utilization like when running 4 simultaneous backups and what is your tape throughput?
DataONTAP 6.2.2d8
Commvault Galaxy 3.7.1 SP4
ATL P2000 library with 4 DLT8000 drives attached to Filer on 2 SCSI HVD controllers (2 drives per controller)
Thanks!
Jay Newton
Systems Engineer
Chesapeake Energy Corporation
Natural Gas - Natural Advantages
Building 6112, Room 114
(405)848-8000 ext. 683
jnewton@chkenergy.com
I am not sure of what I say, but try to find out whether the NDMP max version of ONTAP switched from 3 to 4 between your different ONTAP releases. If so, try to force the NDMP max version back to 3 (in case version 4 is not fully supported by your backup software application) with this command ('ndmpd version' on its own gives you the current max version):

    ndmpd version 3

Then give it a try and decide. To set the max version back to 4, type:

    ndmpd version 4
From my point of view, it could explain your performance problem and, more certainly, your zombie NDMP sessions.
No, I never tried to limit the NDMP version as I advised you to. And I am sorry to tell you that I can't figure out what really could be done at this point. In fact, I agree with you that it's a change in DOT's NDMP implementation that causes all this garbage.

As a last resort, I could perhaps advise you to upgrade ONTAP again: 6.3 came out very soon after 6.2. You can also check that, although 6.3 is FCS, most Filers are running on it. I am not saying this version is more stable, but rather that we don't have enough experience with 6.2 to know all its possible bugs. Keep in mind that you might also have to downgrade your Filer version, especially if your Filer is not a powerful one: each time a new ONTAP is released, it seems to need more and more resources. You wrote that your CPU had gone up to 100%; that's what made me think of it.

You could perhaps open a case with NetApp on that matter and see what they think about it: if they advise you to upgrade, go ahead. If you stay on this version, you will probably have to save a log of all NDMP sessions (with 'ndmpd debug 50' or 'pktt start e0'). Are you sure you made only an ONTAP upgrade (no upgrade of Commvault Galaxy? No switch or cable rearrangement? Sometimes a faulty cable can mislead you when chasing performance problems).

Hope you can get through this problem.