Re: Moving data from one volume to another

7 Jun 2001


      ...
. . .
mailing list, it appears that NDMPcopy 'infinite incremental' feature is the
solution which provides the shortest possible downtime. What I would like to
ask more experienced NetApp admins is whether there are known caveats in
issuing something like:
jndmpcopy filer:/vol/vol1 filer:/vol/vol2 -sa root:<passwd> -da root:<passwd> -level i
. . .
We had a glitch using this procedure in an upgrade last weekend (copying
from an old filer running 5.3.6R2 to a new one running 6.1R1).
We did the level-0 Thursday night, with a level-1 during a Saturday a.m.
downtime.  The level-0's all claimed to complete without error, but a few
of the incrementals based on those level-0's failed to restore, complaining
that the full-restore had not completed & thus an incremental restore
could not proceed.
We worked around the problem by doing a manual level-1 dump to some
alternate disk storage, and then a manual restore from those level-1
images.  Note that this glitch didn't affect all of
the qtrees we dumped, but we didn't have time to go back and do full
level-0's of the ones that failed.
The ramifications of the workaround were that the manual restore of the
incremental wasn't able to record the state of any deletions (or renames)
which had occurred since the corresponding level-0's, so any files that
got deleted on the old filer during Friday remained on the new filer.
To compensate, I ran "rsync -n -a --delete" to compare the old filer
against the new one after we were all done, and produced a list of files
which appeared on the new filer but not on the old (note the "-n", which
has rsync tell what it would do, but not actually do anything).
I've yet to do some post-processing on the list of those files to see
which ones were actually deleted during the time period of the level-1.
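For anyone wanting to build the same list, here is a rough sketch of the
filtering step.  It assumes rsync is also given "-v" so that the dry run
prints a "deleting <path>" line for each file it would remove from the
destination, and that both filers are mounted on one client.  The mount
points and the function name are made up for illustration.

```shell
# Hypothetical helper: read rsync dry-run output on stdin and print only
# the paths rsync says it would delete, i.e. files that exist on the
# destination (new filer) but not on the source (old filer).
extra_on_new() {
    sed -n 's/^deleting //p'
}

# Example use (mount points are assumptions, not from the original run):
#   rsync -n -a -v --delete /mnt/oldfiler/vol1/ /mnt/newfiler/vol1/ \
#       | extra_on_new > extra-files.txt
```

The resulting list still has to be checked by hand, since a file can be
missing from the old filer for reasons other than a deletion during the
level-1 window.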
And after the dust settles here, I'll post a bug-report to NetApp about
the jndmpcopy/restore problem.  On inspection of the logs of the original
level-0 runs, I see that the ones that turned up problems at level-1
time all had the "DUMP IS DONE" message, but not a corresponding
"RESTORE IS DONE".
Now I had noticed this idiosyncrasy ahead of time, but I did a couple
of checks using rsync and verified that the restores had actually completed
just fine.  The problem filesystems even had what looked like reasonable
"restore_symboltable" files left for the level-1 restores to use.  I thought
the jndmpcopy process had just shut down before getting all the log messages
from the restoring (destination) filer.  My guess now is that what actually 
happened was that for some reason the restore process didn't get some magic
checkpoint value written into the restore_symboltable file, so the incremental
restore didn't trust it.
Anyway, I'd say go ahead but make sure you see both "DUMP IS DONE" _and_
"RESTORE IS DONE" on all of your copies, or do them over before proceeding
to the next incremental.
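If the output of each jndmpcopy run is saved to a file, that check is easy
to script.  This is only a sketch: it assumes one log file per copy and
relies on nothing beyond the two message strings quoted above.  The
function name and the log layout are hypothetical.

```shell
# Hypothetical helper: succeed only if a saved jndmpcopy log contains
# both completion messages.  A log with "DUMP IS DONE" but no
# "RESTORE IS DONE" is treated as a failed copy.
copy_log_complete() {
    grep -q "DUMP IS DONE" "$1" && grep -q "RESTORE IS DONE" "$1"
}

# Example use (file name is an assumption):
#   copy_log_complete vol1-level0.log || echo "redo level-0 for vol1"
```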
And the final moral is that any time you move a bunch of data, things
can go wrong, so have a fall-back plan in case all doesn't go perfectly.
A tool like rsync on a fast client or two can help a lot.
Regards,
-- 
Marion Hakanson hakanson@cse.ogi.edu
CSE Computing Facilities
