Here is the scenario:
I have two identically configured F630 filers. Filer A is active and contains about 150 gigs of data (all in one volume). Filer B is brand new and empty. I need to replicate the data from Filer A to Filer B with the minimal amount of downtime for Filer A. I could do a vol copy to move the data, but in the time that it would take to complete, much of the data would have changed and would be out of date. I was thinking of some sort of solution where I could do a vol copy (or something similar) of the data from Filer A to Filer B while Filer A was up and running. Once that was done, I could shut down Filer A and resync the changes that were made to Filer A during the copy to Filer B (thus giving me an exact copy of the data on each netapp). It would be similar to doing a level 0 dump while everything was active, and then a level 1 while they were shut down. The problem though is that I need them to be synched exactly (i.e. files/dirs that get deleted from Filer A during the initial copy would also get deleted from Filer B during the sync phase). Another option might be to do an rdist from Filer A to Filer B of the data while they are active, isolate them from the network once that is done, and re-run rdist to sync any changes that took place during the initial move. With 150 gigs of data though, this would probably take a ReallyLongTime[tm], thus not being a viable solution.
Has anyone tackled a situation like this before? What methods did you use and in what sort of time frame were you able to perform the replication?
Thanks in advance for any feedback.
-alf
--- Anthony Fiarito alf@cp.net Critical Path Operations
I dunno what special tools filers may have to accelerate this job, but if I were gonna tackle it, knowing only what I know, I'd set up the fastest server I could arrange with the fastest, dedicated interfaces I could, to each Netapp. I'd use rsync[1] to replicate the data initially. Then I'd re-run rsync. I'd keep re-running it until its run-time stopped dropping, and I'd expect that to be after only a few cycles. Finally to make it perfectly synch I'd do one last run with the source readonly, so nobody can make changes out from underneath. I've no idea how long that would take, but I'll betcha it wouldn't be a ReallyLongTime[tm]; rsync is not like rdist, it does not do a separate rpc-like handshake for each file, it streams, very effectively.
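The repeat-until-stable idea above could be sketched roughly as follows. This is only an illustration, not a tested migration script: the mount points /a/ and /b/ (both filers NFS-mounted on the copy host), the function names, and the pass limit are all assumptions made for the example.

```shell
#!/bin/sh
# Assumed mount points for the two filers on the copy host (hypothetical).
SRC=${SRC:-/a/}
DST=${DST:-/b/}
MAX_PASSES=${MAX_PASSES:-10}   # safety limit, an arbitrary choice

# One rsync pass: -a preserves permissions, owners and times;
# --delete removes files on the destination that are gone from the source.
sync_pass() {
    rsync -a --delete "$SRC" "$DST"
}

# Re-run a pass command (default: sync_pass) until a pass is no
# faster than the previous one, or MAX_PASSES is reached.
sync_until_stable() {
    sus_cmd=${1:-sync_pass}
    sus_prev=
    sus_n=0
    while [ "$sus_n" -lt "$MAX_PASSES" ]; do
        sus_n=$((sus_n + 1))
        sus_start=$(date +%s)
        "$sus_cmd" || return 1
        sus_elapsed=$(( $(date +%s) - sus_start ))
        echo "pass $sus_n took ${sus_elapsed}s"
        # Stop once the run time has stopped dropping.
        if [ -n "$sus_prev" ] && [ "$sus_elapsed" -ge "$sus_prev" ]; then
            break
        fi
        sus_prev=$sus_elapsed
    done
}
```

One would call sync_until_stable while Filer A is live, then make the source read-only and call sync_pass one last time to pick up the remaining changes and deletions.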
I wouldn't be surprised if the Fastest way to do this job were with a fast Sun Ultra or maybe a fast DEC Alpha, using etherchannel or maybe gigabit ether to each of the Netapps.
Even if the Netapp has special local software intended to help out the kind of work you are trying to do, I wouldn't be surprised if something like the above were faster; even though they may have tried to put in a special hack, the box is really utterly --- brilliantly --- tuned as a network server, so that's the well-trod route to travel. Lots o' bits been down that road.
-Bennett
On Thu, 14 Jan 1999, Bennett Todd wrote:
http://www.samba.org/rsync has some useful info too. Tres cool program!
When I need to do this kind of copying, I use "rdist".
Simply mount both servers on the same host, and have rdist copy to the localhost. (The transfer host must have root privileges on both filer filesystems.)
    mkdir /a /b
    mount fa:/ /a
    mount fb:/ /b
    rdist -R -c /a localhost:/b
Keep re-running rdist until you are ready to do the final sync.
Of course, this only moves the files. If you are dealing with access control lists, or non-unix filesystems, this won't capture everything. Another thing you may wish to consider is that the simple rdist command above will also transfer the /etc directory, which is probably NOT what you want. Consider creating a Distfile breaking things up into manageable chunks.
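As a rough illustration of the Distfile suggestion, the snippet below writes a Distfile that mirrors /a to /b on localhost while skipping the filer's /etc. The write_distfile helper name is made up for this example, and the Distfile syntax follows the classic BSD rdist ("install -R", "except"); newer rdist versions spell some options differently, so check the local man page before relying on it.

```shell
#!/bin/sh
# Write a Distfile to the path given as $1.
# Assumes the mount points /a (source filer) and /b (destination filer)
# from the commands above.
write_distfile() {
    cat > "$1" <<'EOF'
( /a ) -> ( localhost )
        install -R /b ;
        except /a/etc ;
EOF
}
```

It could then be used along the lines of "write_distfile /tmp/Distfile" followed by "rdist -f /tmp/Distfile". Splitting the source list into several smaller entries is one way to get the manageable chunks mentioned above.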
+--- In a previous state of mind, Anthony Fiarito alf@cp.net wrote:
|
| I have two identically configured F630 filers. Filer A is active and
| contains about 150 gigs of data (all in one volume). Filer B is brand
| new and empty. I need to replicate the data from Filer A to Filer B
| with the minimal amount of downtime for Filer A. I could do a vol
| [stufff deleted]
|
| Has anyone tackled a situation like this before? What methods did you
| use and in what sort of time frame were you able perform the
| replication?
I have done something similar (but with less data). How big of a delta is there per hour? Are you running snapshots?
ndmpcopy works well (to an extent) for doing this sort of thing, as it updates the dumpdates file on the filer and uses it for the incremental copies.
I have usually resorted to doing a cpio between filers. Using the gnu version of find, the "older/newer than" timestamp args work correctly, so doing incrementals becomes less painful. Of course, this does not deal with deletes.
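A minimal sketch of the find/cpio incremental described above, assuming both filers are mounted on one host. The function names and the timestamp-file convention are invented for illustration. As noted, this does not remove files on the destination that were deleted from the source.

```shell
#!/bin/sh
# Print files under directory $1 that were modified more recently than
# the timestamp file $2. Paths are printed relative to $1.
# $2 should be an absolute path, since we cd into $1 first.
changed_since() {
    ( cd "$1" && find . -type f -newer "$2" -print )
}

# Copy files changed since the last pass from $1 to $2 using cpio
# pass-through mode (-p), creating directories (-d) and keeping
# modification times (-m). $3 is an absolute path to the timestamp file.
incremental_copy() {
    ic_src=$1
    ic_dst=$2
    ic_stamp=$3
    # Mark the start of this pass before scanning, so files changed
    # during the copy are picked up again next time.
    touch "$ic_stamp.next"
    changed_since "$ic_src" "$ic_stamp" |
        ( cd "$ic_src" && cpio -pdm "$ic_dst" ) || return 1
    mv "$ic_stamp.next" "$ic_stamp"
}
```

For example, after the initial full copy one might run "touch /var/tmp/filer.stamp" once, and then "incremental_copy /a /b /var/tmp/filer.stamp" for each later pass.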
rsync is nice (http://samba.anu.edu.au) since it is fast, can use ssh (if you are doing this over open networks), deals with changes well, and does not send entire files (as rdist does).
I would talk to your NetApp sales rep about this, as they may be able to answer questions I cannot :)
Hope this helps.
Alex