funny you should ask.
we migrated 2TB from a pair of 630's onto a pair of clustered 760's, and 2 720's
initial setup:
630 "k" vol0 - 4 shelves of 9G -- tools1 vol1 - 2 shelves of 18G -- proj1, proj2, proj3, proj4 vol2 - 2 shelves of 18G -- archive, proj5, proj6, RCS1
630 "b" vol0 - 2 shelves of 18G -- dept1 homes, proj7, proj8, /usr/local vol1 - 2 shelves of 18G -- dept2 homes, legacytools, libraries, RCS2
final setup:
760 "p" vol0 - 2 shelves of 18G -- dept1 homes vol1 - 2 shelves of 18G -- dept2 homes vol2 - 2 shelves of 18G -- libraries
760 "s" vol0 - 2 shelves of 36G -- tools2 (new) vol1 - 2 shelves of 18G -- tools1 vol2 - 2 shelves of 18G -- /usr/local, RCS1, RCS2, archive, legacytools
720 "d" vol0 - 2 shelves of 18G -- proj7,proj5 vol1 - 2 shelves of 18G -- proj2, proj3
720 "i" vol0 - 2 shelves of 18G -- proj8, proj6 vol1 - 2 shelves of 18G -- proj1, proj4
in order to do this move, we had to shut down the company. all unix machines would have to be rebooted to get rid of the stale NFS mounts because we moved their /usr/local and their tools. time is money, so i was once offered a time frame of 4 hours in which i could do this move. i like it when people make me happy and full of laughter.
we negotiated to 24 hours, from friday 7pm to saturday 7pm. At 20G/hour, 2T should take about 100 hours. hmmmm.
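just to spell out the arithmetic that made 24 hours look scary (a rough sketch; the 20G/hour figure is the one above, the rest is back-of-the-envelope):

    expr 2000 / 20      # 2000G at 20G/hour = 100 hours for a single stream
    expr 100 / 24       # so you need 4-5 streams running flat out for the whole
                        # window, or (better) most of the data pre-copied before M-day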
i used snapmirror extensively for a lot of this. but i couldn't use it for all of the moves. for example, the 720 "i" is getting things put to both vol0 and vol1. snapmirror has to copy vol to vol, so in the above scheme to use snapmirror on the 720 "i" both vol0 and vol1 would be offline and my machine would have no volume available for the root vol.
every target filer would require at least one of its volumes to be online. that meant that snapmirror was useless for at least one volume of every filer. as snapmirror requires the target volume to be at least the same size as the source volume, i could not borrow a disk from each volume to make a tiny root volume only for later destruction.
sigh.
the tools i used were cpio, snapmirror, ndmpcopy and rsync.
cpio:
as moving a project only meant interrupting a slice of the company, i used cpio to move as many projects as i was permitted to interrupt. i got two.
"b" vol0 proj7 --> "d" vol0 "b" vol0 proj8 --> "i" vol0
typically i do cpio with two scripts. the first script [1] touches a timestamp file and then copies everything, which can take most of a day depending on the size of the project. the second script [2] grabs what has changed since the start of the first copy and is usually under an hour. this lets me get the downtime for a project down to a negotiable amount.
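as a rough illustration of the two passes (the paths here are invented for the example; the real scripts are [1] and [2] at the bottom):

    # pass 1: project still live, run from an admin host with both filers mounted
    cd /net/b/vol/vol0
    touch startfile
    find proj7 -depth -xdev | fgrep -v .snapshot | cpio -pdm /net/d/vol/vol0

    # pass 2: during the negotiated downtime, catch whatever changed since pass 1
    find proj7 -newer startfile -depth -xdev | fgrep -v .snapshot | cpio -pdm /net/d/vol/vol0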
snapmirror:
snapmirror is a vol to vol copy "at the block level". it is a front end to the vol copy command.
the library people needed space and space NOW, so i gave them their vol2 on "p" and marked it as root. this let me snapmirror to "p" vol0 and "p" vol1.
it couldn't "just put the library people on vol0" because vol0 was comprised of 36G drives. because 2 shelves of 18G are smaller than 4 shelves of 9G, we used 2 shelves of 36G for the tools1 qtree. it didn't matter that the usage on tools1 was smaller than what 2 shelves of 18G will hold, it would not fit no matter how hard i pushed. and i pushed hard.
as i had already cpio'd proj7 and proj8 to "d" vol0 and "i" vol0 respectively, i couldn't snapmirror onto those volumes; i would use ndmpcopy to get proj5 and proj6 onto "d" vol0 and "i" vol0.
as "s" vol0, was a new qtree for tools, i set that up and could then snapmirror to "s" vol1 and "s" vol2.
"k" vol1 (proj1, proj2, proj3, proj4) was effectively being split into "i" vol0 and "d" vol0 so i snapmirrored "k" vol1 onto both.
some volumes were "mostly" going to one target but snapmirror would drag along some unwanted qtrees. for example, "b" vol0 contained /usr/local and homes but only homes was to end up on "p" vol0. the 8G of /usr/local came along for the ride. i deleted these tagalong qtrees using a parallel removeit script i wrote for this [3]. there is no fast way to delete a qtree on a netapp. pity that.
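a typical cleanup, using the removeit script [3] below, looked something like this (the mount path is invented for the example; always do the -test pass first):

    # dry run: show what would be removed from the tagalong qtree
    ./removeit -test /net/p/vol/vol0/local

    # the real thing: parallel rm -rf of the qtree's subdirectories
    ./removeit /net/p/vol/vol0/local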
snapmirror will also happily take ALL of your filer - cpu and Kb/s throughput. i learned the hard way to throttle snapmirror to 3000Kb/s *total* from a 630. ie: if i ran two snapmirrors, each was throttled to 1500Kb/s.
if two snapshot calculations happened to coincide, the cpu would peg at 100% and we got NFS timeouts. because of that, i would turn on snapmirror in the morning for a few iterations, each morning leading up to M-day. you can either arrive at work 3 hours before the rest of the company or you can do it via cron on the admin host.
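i drove the throttle from /etc/snapmirror.conf on the destination filer. the sketch below is from memory (and from later ONTAP releases), so treat the kbs option and the schedule fields as assumptions and check the docs for your release:

    # /etc/snapmirror.conf on "p": two transfers off the 630 "b", each capped
    # at 1500Kb/s so "b" never sees more than 3000Kb/s total.
    # fields: source  destination  options  minute hour day-of-month day-of-week
    b:vol0  p:vol0  kbs=1500  0 6,7,8 * *
    b:vol1  p:vol1  kbs=1500  0 6,7,8 * *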
the snapmirrors i did ended up as:

"b" vol0 --> "p" vol0
"b" vol1 --> "p" vol1
"k" vol0 --> "s" vol1
"k" vol2 --> "s" vol2
"k" vol1 --> "d" vol1
"k" vol1 --> "i" vol1
ndmp (or what they didn't tell you behind the school):
only run 4 in parallel.
there is a limit of 6 parallel ndmpd copies on a filer. it's written into the code that way. i learnt the hard way that when the limit is hit, ndmpd just dies horribly.
i learnt the harder way that there are certain situations that will make the limit 4. i only copy 4 in parallel using the below scripts [4][5].
secondly, there is a bug (25649) where ndmp copies will peg the CPU at 100%, but not actually move data. this was fixed in 5.3.6R1P1, but was lost in 5.3.6R2 -- all my target filers ran 5.3.6R1P1 for the migration. the F630s remained at 5.3.4 as we were afraid to move higher due to a risk of invoking the "18G spinup issue" when we upgraded the firmware.
i used level 0 ndmpcopy to move the smaller projects that could be moved once i finished the snapmirrors and put the volumes back online. for example, /usr/local came along with dept1 homes to "p" vol0, but i wanted it on "s" vol2. that target was also a snapmirror target, so when i turned off the snapmirror on "s" vol2, i could ndmpcopy the 8G of /usr/local to the correct place.
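the cleanup copy looked roughly like this (same host-side ndmpcopy invocation as in script [4]; the qtree names here are guesses for the example):

    # /usr/local rode along to "p" vol0 with the homes snapmirror; once "s" vol2
    # was out from under snapmirror, push it to its real home with a level 0 copy
    ./ndmpcopy p:/vol/vol0/local s:/vol/vol2/local -level 0 -v \
        -sa 'root:PASS' -da 'root:PASS'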
i too have had grief when doing level 1 ndmpcopies. not always, but about 50% of the time.
rsync (version 2.4.3 protocol version 24):
"p" vol2 was to go to be a new work area for the library people. as they were already using it, i couldn't snapmirror the old libraries from "b" vol1 to "p" vol2.
as i have had grief when doing level 1 [2,3,4...] ndmpcopies, i tried rsync.
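the sync itself was nothing fancy, something along these lines (run from the E450 with both filers NFS mounted; the automount paths are invented):

    # repeatable sync of the old libraries onto the new work area;
    # --delete stops the target accumulating files already removed at the source
    rsync -a --delete /net/b/vol/vol1/libraries/ /net/p/vol/vol2/libraries/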
the directory walk in rsync is the killer. unfortunately what i was moving was less of a qtree and more of a qbush, so the walk took about 5 hours on a sun E450 (4cpu, 4Gmem) which does a lot of backups at night, but not much during the day.
rsync looked good but in the end failed (more later).
the move (M-day):
6:00pm turn on snapmirror: "s", "p", "i", "d"
7:00pm mark "b" and "k" readonly
7:15pm turn off snapmirror: "s", "p", "i", "d"
remove "d" vol1 proj1 and proj4 remove "i" vol1 proj2 and proj3
start final rsync of libraries from "b" vol1 to "p" vol2
ndmpdcopy proj5 to "i" vol0 ndmpdcopy proj6 to "d" vol0 ndmpdcopy /usr/local to "s" vol2 ndmpdcopy legacytools to "s" vol2 ndmpdcopy RCS1 to "s" vol2
10:00am rewrite automount tables change backup s/w edit exports on new filers edit quotas new filers
1:00pm install new automount tables
1:10pm reboot NIS servers
1:20pm reboot computer room servers
1:30pm reboot company
2:00pm start removing things that "came along" with snapmirror
2:01pm have beer
what went wrong:
rsync, which had been taking about 5 hours to sync the files, decided to go away and not come back during the final run. darn. after 13 hours, it wasn't finished. the next morning i aborted and did a level 0 ndmpcopy.
my parallel removeit [3] script contained a bug. this was bad. very very very bad. i deleted "stuff" from "s" vol2 and since it was a set of parallel "rm -rf" i had no idea what was gone and what was not gone.
using snaprestore, we got the volume back to the nightly.1 snapshot which was taken just as the ndmpcopies had been finishing up. i then restarted the three ndmpcopies to that volume. at about 12:30 everything looked good. we were 1/2 hour ahead of schedule. in fact i had already started deleting things on "s" vol2. it was at that point i noticed problems with the quotas on that volume.
when i started quotas, the 760 complained about duplicate entries. there were none. really. then i looked at the output of quota report.
# rsh s quota report
  Type     ID    Volume    Tree      ...  Quota Specifier
  -----  ------  --------  --------  ...  -----------------
  tree      1    vol2      local     ...  /vol/vol2/local
  tree      2    vol2      legacy    ...  /vol/vol2/legacy
  tree      2    vol2      RCS1      ...  /vol/vol2/legacy
somehow RCS1 had been associated with the same Quota Specifier as another qtree. all told, there were 5 qtrees all jumbled together like that.
furthermore, the quotas reported for the "real" qtrees in that location were a sum of several of the 5 jumbled qtrees. however, a "du" reported the correct amount. it was not a good situation.
we never figured out if this was a side effect of snaprestore, the parallel removes, ndmpcopy, having the snapshot to which i restored contain a "half-done" ndmp transfer, the 900% [6] usage of /vol/vol2/.snapshot, the phase of the moon, or something else.
the only solution was to completely remove the corrupt qtrees and recopy them. renaming the qtree would not fix it. it would still have that same bad association.
tree 2 vol2 RCS1.bent ... /vol/vol2/legacy
i had to wait for the remove to finish before i could start the copy again.
in order to perhaps speed things up, i copied (for example) RCS1 from the 760 "s" to the 760 "p", hoping that after the delete was done the copy back between the 760s would be faster than copying again from the 630.
i have never seen so many errors in my life from ndmp. crosslinked inodes is the best i can do to describe it. let's just leave it at that. we gave up on the shortcut and just copied from the 630.
my goodness, my guinness:
we got done within the 24 hours. planning and a selection of tools was the only way to get this done.
coincidentally, the same weekend i moved 4 shelves of 9G to 4 shelves of 18G. i did level 0, 1, 2 ndmp copies on the Monday, Wednesday, Thursday, planning to do a level 3 on Friday evening as the final move.
the level 2 ndmpcopy (Wednesday) failed miserably on two of the volumes and I had to restart from level 0. do not rely on your ndmpcopy working all the time.
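for the curious, that side move was just script [4] run at increasing levels over the week, roughly like this (that was the plan, anyway; filer and qtree names invented for the example):

    ./moveproject old /vol/vol0/projX new /vol/vol0/projX 0   # monday: full copy
    ./moveproject old /vol/vol0/projX new /vol/vol0/projX 1   # incremental since the level 0
    ./moveproject old /vol/vol0/projX new /vol/vol0/projX 2   # incremental since the level 1
    ./moveproject old /vol/vol0/projX new /vol/vol0/projX 3   # final pass in the downtime window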
scripts and notes:
any and all scripts supplied are to be used at the risk of the user. these are provided "as is", expressly with no warranty and no guarantee. in particular, the removeit script (number 3) has been used by myself to cause damage. it has been fixed since then, but still, be careful.
[1]
---cut--here---8<--------8<---cut--here---8<--------8<---cut--here---8<-------
#!/bin/sh
# initial cpio copy sample script
#
touch startfile
find proj -depth -xdev | fgrep -v .snapshot | cpio -pdm newplace
#
# EOF
#
---cut--here---8<--------8<---cut--here---8<--------8<---cut--here---8<-------
[2]
---cut--here---8<--------8<---cut--here---8<--------8<---cut--here---8<-------
#!/bin/sh
# script to copy changes since the first copy
#
find proj -newer startfile -depth -xdev | fgrep -v .snapshot | cpio -pdm newplace
#
# EOF
#
---cut--here---8<--------8<---cut--here---8<--------8<---cut--here---8<-------
[3]
---cut--here---8<--------8<---cut--here---8<--------8<---cut--here---8<-------
#!/bin/sh
# script to remove the subdirectories of a qtree in parallel
#
# RUN THIS AT YOUR RISK
#
# IF YOU USE THIS SCRIPT AND DELETE *ANYTHING* THAT YOU DO NOT WANT
# TO DELETE YOU ARE ON YOUR OWN.  PMC-SIERRA WILL NOT TAKE ANY
# RESPONSIBILITY FOR ANY DAMAGE CAUSED BY THIS SCRIPT.  NONE OF
# PMC-SIERRA'S EMPLOYEES WILL TAKE ANY RESPONSIBILITY FOR ANY DAMAGE
# CAUSED BY THIS SCRIPT.
#
# YOU HAVE BEEN WARNED
#
ECHO=""
if [ ".-test" = ".$1" ] then ECHO=echo shift fi
if [ $# -ne 1 ] then echo usage: $0 [-test] fullpath exit 1 fi
case "$1" in /*) : ;;
*) echo usage: $0 [-test] fullpath exit 1 ;; esac
set -e cd /tmp cd $1 set +e
if [ -z "$ECHO" ] then echo working on `pwd` echo "continue (y/n)?" read junk
case "$junk" in Y|y|Yes|yEs|yeS|yES|YeS|YEs|YES|yes) : ;; *) echo aborting exit 1 ;; esac else echo echo test only on `pwd` echo actions would be as follows: echo fi
for i in * do if [ -d $i ] then echo recursive remove on $i $ECHO rm -rf $i& else $ECHO rm -f $i fi done
if [ -z "$ECHO" ] then echo waiting for any background processes to finish wait echo done $1 fi
cd .. $ECHO rmdir $1 2>/dev/null || (echo ;echo please check for remaining files)
#
# EOF
#
---cut--here---8<--------8<---cut--here---8<--------8<---cut--here---8<-------
[4]
I do ndmpcopies in parallel, 4 at a time. to coordinate this, i script it. the actual work call to ndmpcopy is done in the script "moveproject"
as ndmpcopy in a script requires the root password to be put on the command line, i change the root passwd of the filers to something else. the script expects a directory "logs" to exist.
i touch start and finish files, as an indicator of elapsed time and perhaps to use with "find -newer" and cpio if ndmpd fails in a later level. i really don't trust ndmpcopy.
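the fallback, if a later-level ndmpcopy dies, would be the same find/cpio trick as scripts [1] and [2], keyed off the start file (paths here are invented for the example):

    # catch up by hand from the level 0 start time instead of re-running ndmpcopy
    cd /net/neutron/vol/vol0
    find proj1 -newer /admin/migration/logs/proj1.0/start -depth -xdev | \
        fgrep -v .snapshot | cpio -pdm /net/proton/vol/vol0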
---cut--here---8<--------8<---cut--here---8<--------8<---cut--here---8<-------
#!/bin/sh
goodhost=m
if [ $# -ne 4 -a $# -ne 5 ]
then
    echo "usage: $0 src_filer src_project target_filer target_project [level]"
    exit
fi

if [ $# -eq 5 ]
then
    level=$5
else
    level=0
fi

if [ `hostname` != "$goodhost" ]
then
    echo i must be run on $goodhost
    exit
fi

# make sure ndmpd is running on both filers
remsh $1 ndmpd on
remsh $3 ndmpd on

# create the target qtree if it does not already exist
remsh $3 qtree create $4 2> /dev/null

name=`basename $4`
echo $name, logged to `pwd`/logs/$name.$level/log
mkdir logs/$name.$level

exec > logs/$name.$level/log 2>&1

touch logs/$name.$level/start
./ndmpcopy $1:$2 $3:$4 -level $level -v -sa 'root:PASS' -da 'root:PASS'
touch logs/$name.$level/finish
#
# EOF
#
---cut--here---8<--------8<---cut--here---8<--------8<---cut--here---8<-------
[5]
this is an example script that copies several qtrees from one filer to another using the moveproject[4] script. this example makes the following 6 moves:
neutron:/vol/vol0/tools         proton:/vol/vol0/tools
neutron:/vol/vol0/proj/proj1    proton:/vol/vol0/proj1
neutron:/vol/vol0/packages      proton:/vol/vol0/packages
neutron:/vol/vol0/proj/proj2    proton:/vol/vol0/proj2
neutron:/vol/vol0/usr           proton:/vol/vol0/usr
neutron:/vol/vol0/home          proton:/vol/vol0/home
as i only ever run 4 ndmpd copies in parallel, i want to get this done as efficiently as possible. first i picked the 3 largest qtrees: home, tools and proj1. i start them running in parallel as background processes.
then i group the remaining 3 and run them one after another, that is, "usr" will not start until "proj2" is done and "proj2" will not start until "packages" is done. *but* since i use brackets to group them as a subshell and run that subshell in the background, i end up with 4 ndmp copies running at any point in time.
the final "wait" makes the script hang about until all the background processes are done. i'm sure you get the idea.
---cut--here---8<--------8<---cut--here---8<--------8<---cut--here---8<-------
#!/bin/sh
# start 3 in parallel
./moveproject neutron /vol/vol0/tools proton /vol/vol0/tools 0 &
./moveproject neutron /vol/vol0/proj/proj1 proton /vol/vol0/proj1 0 &
./moveproject neutron /vol/vol0/home proton /vol/vol0/home 0 &
# run remaining one after another, in parallel with first 3
(
    ./moveproject neutron /vol/vol0/packages proton /vol/vol0/packages 0
    ./moveproject neutron /vol/vol0/proj/proj2 proton /vol/vol0/proj2 0
    ./moveproject neutron /vol/vol0/usr proton /vol/vol0/usr 0
) &
wait
---cut--here---8<--------8<---cut--here---8<--------8<---cut--here---8<-------
[6] yep. 900% usage on the .snapshot filesystem (snap reserve set to 10%). filer didn't seem to get upset, but maybe it did.
--
email: lance_bailey@pmc-sierra.com    box: Lance R. Bailey, unix Administrator
vox:   +1 604 415 6646                     PMC-Sierra, Inc
fax:   +1 604 415 6151                     105-8555 Baxter Place
http://www.lydia.org/~zaphod               Burnaby BC, V5A 4V7

186,000 mps: It's not only a good idea, it's the law.  -- Frank Wu