Hey Paul.
Because I never said I was building a "quick disaster recovery" solution. :) The recovery system I'm building is more of an "at least we've got another copy" solution. We don't have the cash for a nearline, which leaves us in a hole. I'm looking to fill that hole temporarily by leveraging old 840s to at least keep a copy of the data on until we can one day cough up the cash for a proper nearline. I want to use NDMP predominantly because this is what it was intended to do. SnapMirror and SnapVault are undoubtedly the better solutions, but I'd like to try to make NDMP work rather than just give up on it as a slow, useless backup/recovery system. If NDMP would just run at the speeds the filers are capable of, I'd be doing OK. I'm leaving SnapMirror/SnapVault off the table for now.
benr.
-----Original Message-----
From: Paul Galjan [mailto:galjan@gmail.com]
Sent: Wed 1/5/2005 4:52 PM
To: Ben Rockwood
Cc: toasters@mathworks.com
Subject: Re: NDMP Tuning

Hi Ben,
I'll be the first to say that this doesn't answer your question, but why are you using NDMP for quick disaster recovery? I would think that SnapMirror or SnapVault would be much more accommodating to DR requirements... My guess would be that a block-level copy with VSM (volume SnapMirror) would be much more efficient...
I would ask your sales rep or SE for an eval SnapMirror license.
--paul
On Wed, 5 Jan 2005 16:26:32 -0800, Ben Rockwood <BRockwood@homestead-inc.com> wrote:
Happy New Year, Toasters.
Does anyone have experience with tuning NDMP? I'm not sure how much tuning is possible, but I'm trying to work out some serious slowness in NDMP Level 0s.
Plenty of people have had these issues before, but I'm not finding solutions on NOW or in the forums. Here is a time breakdown of an L0 I did:

    5 hr 32 min   total
    35 min        Pass I & II
    14 min        Pass III
    3 hr 38 min   Pass IV (Stage 1, Creation)
    1 hr 21 min   Pass IV (Stage 2, Copy)
    unknown       Pass V
These numbers are rough, based on timestamps during the ndmpcopy itself. The total transfer is about 58G from one 760 to another. It's the first stage of Pass IV that really bothers me: during this part of the pass there is very low CPU utilization and little I/O. I need to speed up the process. Since the destination is a recovery filer and not serving data, I don't care if its CPU gets slammed or its I/O is pushed through the roof; I just need it done quicker.
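For reference, I'm driving the copy with the filer's built-in ndmpcopy. The invocation is roughly along these lines (hostnames, paths, and the account are placeholders, and the flags are from memory, so check the ndmpcopy docs before trusting them):

    # full (level 0) copy from the source 760 to the recovery filer;
    # -sa/-da are the source and destination NDMP credentials (placeholders here)
    ndmpcopy -sa backup:secret -da backup:secret -l 0 src760:/vol/data dst760:/vol/data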
Is it throttling, or can I somehow speed it up? I'm using gig as the interconnect, but as I understand it, Pass IV Stage 1 is all about inode creation, whereas Stage 2 is the actual data transfer. The data transfer rate averages roughly 11MB/s between the two filers, which is also less than I'd like to see; the filer should be able to handle a transfer rate of 30MB/s pretty easily.
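Back-of-the-envelope on the copy stage, assuming the 58G figure is about right:

    # expected wall-clock time for 58 GB at the observed vs. hoped-for rates
    awk 'BEGIN { mb = 58 * 1024;
                 printf "at 11 MB/s: ~%.0f min\n", mb / 11 / 60;
                 printf "at 30 MB/s: ~%.0f min\n", mb / 30 / 60 }'
    # ~90 min vs. ~33 min -- so the 1 hr 21 min in Pass IV Stage 2 is about what
    # 11 MB/s buys you; Stage 1 is the part that doesn't add up.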
Any hints or tips from the experienced? This is effectively a test setup before implementing a recovery system on our production 940s, where we'll be moving nearly 7TB of data. Given my findings so far, it's going to be pretty nasty.
benr.
Cool then.
In that case I would look at rsync and/or robocopy (in a Windows-only environment). Not that rsync is a block-level protocol (it evaluates at the file level), but perhaps it would provide better performance with smaller backup windows?
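A minimal sketch of what I mean, assuming the volumes are reachable over NFS from some UNIX host (the mount points here are made up):

    # file-level incremental copy: only changed files move over the wire,
    # but rsync still has to walk and stat every file on both sides
    rsync -a --stats /mnt/src-filer/data/ /mnt/dst-filer/data/

In a Windows-only shop, robocopy against the CIFS shares (e.g. with /MIR) would be the rough equivalent.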
To put a finer point on it: NDMP is just a wrapper around the UNIX dump command. It's no better or worse than dump itself, and that's the reason I asked. The dump command (and NDMP by extension) is for backup, not DR. It is a clunky protocol in terms of straight replication, and that's why NetApp and others offer alternatives for replication...
In the end, though, we should get to your problem: how many inodes are we looking at, and what happens in Pass IV, Stage 1? The inode count would be my first suspect.
--paul
Paul Galjan wrote:
Cool then.
In that case I would look at rsync and/or robocopy (in a Windows-only environment). Not that rsync is a block-level protocol (it evaluates at the file level), but perhaps it would provide better performance with smaller backup windows?
Rsync is certainly a possibility. I'm afraid I'd have some problems, though, since in this environment the filers are used for CIFS only, which makes file-level interaction less than entertaining for an old UNIX zealot like me.
To put a finer point on it: NDMP is just a wrapper around the UNIX dump command. It's no better or worse than dump itself, and that's the reason I asked. The dump command (and NDMP by extension) is for backup, not DR. It is a clunky protocol in terms of straight replication, and that's why NetApp and others offer alternatives for replication...
In the end, though, we should get to your problem: how many inodes are we looking at, and what happens in Pass IV, Stage 1? The inode count would be my first suspect.
Right. I haven't looked at the code itself to see exactly what it's doing (I probably should at some point), but Stage 1 of Pass IV seems to be all about inode creation prior to copying in all the data. The source volume has 2.3 million inodes in use. That doesn't seem like an outrageous number, and this is a pretty small filer all things considered. How creating 2.3 million inodes can consume 3 hours is beyond my understanding. During that time the destination filer's CPU is nearly idle. The only explanation I can dream up is that the process of creating inodes is happening so quickly that the bulk of system time is spent in context switches rather than execution, and hence a false sense of idleness... but that's a pretty BS explanation, since even if that were the case it still wouldn't take 3 hours.
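Doing the rough math on the numbers above only makes it look worse:

    # 2.3 million inodes created over 3 hr 38 min (13,080 seconds)
    awk 'BEGIN { printf "~%.0f inodes/sec\n", 2300000 / 13080 }'
    # ~176 inodes/sec, on a destination filer that is otherwise sitting idle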
All the evidence I've seen thus far with NDMP suggests that I just need to turn up the flow. Is there an idle loop in the dump code? That's exactly what it feels like. Does anyone know if there is an ONTAP equivalent to truss?
benr.
Hey, you've always got cygwin ;-).
Seriously, though: 2.3M is a serious number of files. I used to have a 180G home directory partition for about 300 users, with only about half that number of inodes. Even with rsync, it took about 4 hours to move that guy over to the destination, even when less than 500 MB had changed.
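Rough math on that, just to show where the time goes (the file count is an estimate from the inode numbers):

    # ~1.15 million files walked in ~4 hours, even with under 500 MB changed
    awk 'BEGIN { printf "~%.0f files/sec\n", 1150000 / (4 * 3600) }'
    # ~80 files/sec: the per-file overhead dominates, not the amount of data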
It underscores the point that block-level replication will perform better than file-level replication, whether you end up using QSM or ndmpcopy. It really is worth it to bang on your rep to get SnapMirror for an amount you can afford. I'd almost guarantee it would save you orders of magnitude in replication time.
--paul