To test the scsi reset / filer reboot theory:
I started an NDMP backup to a SCSI-attached tape library. Once data was being written to tape, I disconnected the SCSI cable. The backup aborted, and the SCSI bus did reset. The filer (F760, ONTAP 5.3.5) did not reboot. NFS and CIFS operations were not impacted.
Thanks, Bill Roth
-----Original Message-----
From: Steve Kappel [mailto:steve.kappel@raistlin.min.ov.com]
Sent: Thursday, March 16, 2000 8:04 AM
To: bryer@sfu.ca
Cc: toasters@mathworks.com
Subject: Re: NetApp backup recommendations
NetBackup (Veritas) can only dump to tape drives connected directly to an NDMP server (i.e. only to a locally-attached NetApp drive, or to a remote NetApp with attached tape). It cannot do NDMP backups to a "normal" NetBackup media server. I've been told that NetBackup can only restore NDMP dumps in place, too -- not to an alternate directory, for example (I haven't verified that with the vendor).
Hmm, this could be deadly for us. Legato recommends direct-attached tape drives, which is what I was leaning toward anyway, and it makes sense for performance.
Directly attached drives are more efficient. Even with plenty of network bandwidth, backing up over the network still puts a significant CPU load on the boxes.
If the filers are physically within reach, a large library can have some of its drives split off onto separate SCSI buses, with one bus attached to each filer. "Normal" NetBackup media servers can also have some of the drives. Robotic control can be anywhere in such a config. This is really the best of all worlds (IMHO), as you get direct-attach performance on all boxes while still sharing the library.
If they are not physically within reach then 3-way backup to another filer can be used. Large sites may dedicate a NetApp as a "tape server". I believe there are library vendors that are looking at supporting NDMP directly in their libraries.
But we got to discussing this issue, and the idea of the NetApp spontaneously rebooting due to a SCSI bus error (from the tape drive) came up. For our new NetApp we cannot afford to have this happen. It's not so critical for our existing filer. Our existing filer runs OnTap 5.2.1 and has rebooted once (in approx 7 months) due to a SCSI bus error (a tape got stuck in the jukebox and the filer rebooted to try to clear the error). Although with newer revs of OnTap and the firmware, maybe this isn't an issue any more?
I have never seen this with the NetApps we have in NetBackup development. I see plenty of bus resets but never a reboot.
__________________________________________________________________________ Steve Kappel steve.kappel@veritas.com VERITAS Software steve.kappel@iname.com (Personal)
This was about 4 months ago now. Let's see if I can remember the sequence of events accurately ...
F740 running OnTap 5.2.1, DLT4700 stacker direct SCSI attached, using BudTool 4.6. At 6:00 we had a backup scheduled to run. The tape was requested to load but got jammed when the drive started to latch and spool it. BudTool continued to retry for approx 4 hours, while the filer was reporting SCSI bus errors. At approx 10:00 the filer rebooted (on its own) to try to clear the bus error. The error wasn't cleared. Around 14:00 we had a replacement stacker and swapped them hot. (Then we got to dismantle the stacker to try to get the tape out.)
At the time I can remember we were thinking 'Cool. The NetApp rebooted, resulting in 90s (or so) downtime and no one even noticed the interruption in service.' Not one complaint to our help desk. Would have been a different story if we were still using the Auspex.
So the circumstances are different than your test case, but OnTap 5.3.5 might be better at handling these sorts of errors.
For our new filer we want to avoid any reboots of this sort due to tape drive SCSI errors (for the existing filer it's less important). For performance reasons I would prefer to do direct SCSI attach of the tape drives, but on the way in today I was thinking we might be able to hang all the tape drives off our existing filer and back up the new filer that way (once I figure out the security implications).
Guys,
Is there an 'ndmpd' that we could run on a UNIX host? Then you wouldn't have to pick a filer to be your main backup server. (Does that make sense?)
Ed
--
-----Original Message-----
From: Steve Kappel [mailto:steve.kappel@raistlin.min.ov.com]
Sent: Thursday, March 16, 2000 8:04 AM
To: bryer@sfu.ca
Cc: toasters@mathworks.com
Subject: Re: NetApp backup recommendations
[...]
If they are not physically within reach then 3-way backup to another filer can be used. Large sites may dedicate a NetApp as a "tape server". I believe there are library vendors that are looking at supporting NDMP directly in their libraries.
[...]
On Thu, 16 Mar 2000, Edward Henigin wrote:
Is there an 'ndmpd' that we could run on a UNIX host? Then you wouldn't have to pick a filer to be your main backup server. (Does that make sense?)
Check ftp.ndmp.org... the NDMP SDK contains reference source for a UNIX ndmpd, although I cannot vouch for its usefulness as a tape media target in a backup situation. I've only used the ndmpcopy client.