Hello all,
We have an F720 filer where the data from one volume needs to be copied to another volume in the same filer. From previous discussions on this mailing list, it appears that the NDMPcopy 'infinite incremental' feature is the solution that provides the shortest possible downtime. What I would like to ask more experienced NetApp admins is whether there are known caveats in issuing something like:
jndmpcopy filer:/vol/vol1 filer:/vol/vol2 -sa root:<passwd> -da root:<passwd> -level i
The Data ONTAP version is 6.01R1, and there is ~43 GB of data that needs to be moved. Any feedback on positive or negative experiences with this procedure is highly appreciated.
Robert
I've not had the best experience with jndmpcopy, but don't let me scare you away from it. You might want to check with NetApp that they "support" it - they definitely support the compiled version (plain old ndmpcopy).
Here are some caveats to using "level i" (more in a previous post: http://teaparty.mathworks.com:1999/toasters/8490.html):
o "Level i" copies will *not* work with level 0-9 copies of the same source and destination. That is, you cannot do a "-level 0" ndmpcopy to move data initially, then use "-level i" to make updates. For that matter, you can't use any other method to do the initial copy such as "vol copy", dump/restore, rsync, etc. If you want to use "-level i" at all, you must use it to do the initial copy in addition to all updates (until you want to stop using ndmpcopy for updates).
o Don't modify the destination until you're done ndmpcopy'ing to it. Once the destination is modified, another "level i" will likely *not* be able to update that destination.
o Be aware of "level i" entries in /etc/dumpdates! Don't delete a "level i" entry for a source if you still want to be able to do further updates to the same destination. Conversely, if you're done using this method to move a directory, and want to use it again on the same source (or a source which eventually has the same name) to a different (new, blank) destination, you'll have to remove the "level i" entry out of /etc/dumpdates for the source directory. Also, don't try to perform "level i" copies of the same source to multiple destinations.
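Regarding that last point, here's roughly how I'd inspect and clean up the dumpdates entries (just a sketch; it assumes the filer's root volume is mounted on an admin host at /mnt/filer-root, which is a made-up mount point; you can also look at the file from the console with rdfile):

    filer> rdfile /etc/dumpdates
    admin$ grep '/vol/vol1' /mnt/filer-root/etc/dumpdates
    admin$ vi /mnt/filer-root/etc/dumpdates     (delete the stale "level i" line for the source, then save)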
Also be aware that if you are copying the entire vol1 to vol2, snapmirror is definitely going to be faster than incremental ndmpcopy. Not sure if you have a snapmirror license, but assuming the deltas on the source volume aren't enormous, the final snapmirror sync can often take only a few minutes.
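If you do have the license, the whole dance on a single filer is roughly the following (a sketch from memory; check the snapmirror documentation for your ONTAP release, and note the destination volume must be restricted and at least as large as the source):

    filer> options snapmirror.enable on
    filer> vol restrict vol2
    filer> snapmirror initialize -S filer:vol1 filer:vol2
    filer> snapmirror status
    (repeat updates as often as you like to keep vol2 nearly current)
    filer> snapmirror update -S filer:vol1 filer:vol2
    (during the downtime window: stop writes to vol1, do one last update, then split)
    filer> snapmirror update -S filer:vol1 filer:vol2
    filer> snapmirror break vol2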
Further, if you upgrade the filer to ONTAP 6.1 or above, you can use the snapmirror migrate command, which will auto-magically make the "final sync", move all open NFS file handles from the source to the destination, and bring the destination on-line. I've not seen it in action, but the idea is a fairly seamless move for NFS clients, even with open file handles. To be totally seamless, you'd probably need to quickly propagate new mount maps/tabfiles to the clients, but it might be worth checking out! (NOTE: this feature does not apply to CIFS - sorry!)
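As I understand it from the 6.1 documentation, the invocation is something like the line below (I haven't run it myself, so treat the syntax as approximate and check the man page; it assumes an existing snapmirror relationship between the two volumes):

    filer> snapmirror migrate vol1 vol2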
-- Jeff
--
----------------------------------------------------------------------------
Jeff Krueger, NetApp CA               E-Mail: jeff@qualcomm.com
Senior Engineer                       Phone:  858-651-6709
NetApp Filers / UNIX Infrastructure   Fax:    858-651-6627
QUALCOMM, Inc. IT Engineering         Web:    www.qualcomm.com
. . . mailing list, it appears that NDMPcopy 'infinite incremental' feature is the solution which provides the shortest possible downtime. What I would like to ask more experienced NetApp admins is whether there are known caveats in issuing something like:
jndmpcopy filer:/vol/vol1 filer:/vol/vol2 -sa root:<passwd> -da root:<passwd> -level i . . .
We had a glitch using this procedure in an upgrade last weekend (copying from an old filer running 5.3.6R2 to a new one running 6.1R1).
We did the level-0 Thursday night, with a level-1 during a Saturday a.m. downtime. The level-0's all claimed to complete without error, but a few of the incrementals based on those level-0's failed to restore, complaining that the full restore had not completed and thus an incremental restore could not proceed.
We worked around the problem by doing a manual level-1 dump to some alternate disk storage, and then manually restoring from those level-1 images. Note that this glitch didn't affect all of the qtrees we dumped, but we didn't have time to go back and do full level-0's of the ones that failed.
The ramifications of the workaround were that the manual restore of the incremental wasn't able to record the state of any deletions (or renames) which had occurred since the corresponding level-0's, so any files that were deleted on the old filer during Friday remained on the new filer. As compensation, I ran "rsync -n -a --delete" to compare the old filer against the new one after we were all done, and produced a list of files which appeared on the new filer but not on the old (note the "-n", which has rsync report what it would do without actually doing anything).
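For reference, the comparison was along these lines (hypothetical mount points, with both filers' volumes NFS-mounted on a fast client; the "-n" dry-run flag means nothing is actually copied or deleted):

    client$ mount oldfiler:/vol/vol1 /mnt/old
    client$ mount newfiler:/vol/vol1 /mnt/new
    client$ rsync -n -a --delete /mnt/old/ /mnt/new/ > rsync-dryrun.out
    client$ grep '^deleting ' rsync-dryrun.out | sed 's/^deleting //' > extra-on-new.txt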
I've yet to do some post-processing on the list of those files to see which ones were actually deleted during the time period of the level-1. And after the dust settles here, I'll post a bug-report to NetApp about the jndmpcopy/restore problem. On inspection of the logs of the original level-0 runs, I see that the ones that turned up problems at level-1 time all had the "DUMP IS DONE" message, but not a corresponding "RESTORE IS DONE".
Now, I had noticed this idiosyncrasy ahead of time, but I did a couple of checks using rsync and verified that the restores had actually completed just fine. The problem filesystems even had what looked like reasonable "restore_symboltable" files left for the level-1 restores to use. I thought the jndmpcopy process had just shut down before getting all the log messages from the restoring (destination) filer. My guess now is that for some reason the restore process didn't get some magic checkpoint value written into the restore_symboltable file, so the incremental restore didn't trust it.
Anyway, I'd say go ahead but make sure you see both "DUMP IS DONE" _and_ "RESTORE IS DONE" on all of your copies, or do them over before proceeding to the next incremental.
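A crude way to enforce that check is to capture each copy's output and grep it before kicking off the next pass, e.g. (sketch only; the log file name is made up):

    client$ jndmpcopy filer:/vol/vol1 filer:/vol/vol2 -sa root:<passwd> -da root:<passwd> -level i 2>&1 | tee ndmpcopy-vol1.log
    client$ grep 'DUMP IS DONE' ndmpcopy-vol1.log
    client$ grep 'RESTORE IS DONE' ndmpcopy-vol1.log
    (both greps should find a match before you start the next incremental)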
And the final moral is that any time you move a bunch of data, things can go wrong, so have a fall-back plan in case all doesn't go perfectly. A tool like rsync on a fast client or two can help a lot.
Regards,