> > On Mon, 20 Apr 1998, Stephen Manley wrote:
> > > Depends. What version of ONTAP are you running right now that you see this behavior?
> > Sorry, good question. 4.3.1D5, and possibly also with 4.3.3.
> The basic data I know: Data ONTAP 5.0 runs significantly faster (50%?) than 4.3. Furthermore, it should not crank at 100% CPU for minutes/hours at a time. We've gotten to the point here that people don't really notice an effect when we run dumps anymore.
> Also, if I could ask a survey question of people (if this is not an acceptable forum for this, just flame me ;) How do people out there use dump?
> 1) rsh to local drive.
Currently this is our only backup possibility. It's working fairly well for now, but as I've mentioned previously we're looking to seriously load our filers (before getting some more 8) and the backup times are looking daft! Especially as we do full backups every night, and will continue to do so until such time as we have a sterling installation of something like Veritas such that incrementals become manageable and reliable. (A sketch of the command we run appears after this list.)
> 2) rsh to remote drive.
Why? Streaming across the main network is a bad idea (tm), and CIFS put paid to the utility of the technique anyhow.
> 3) NDMP to local drive.
This is what I want to move toward: tape drives attached to the filer (i.e. where the data lives) and control across the network to a Veritas/other control box, where the indexes get kept.
> 4) NDMP to remote drive.
Doh! Only when we've got our dedicated Gigabit Ethernet backup network in place, which current estimates place around the next ice age.
> 5) I don't.
> Inquiring engineering minds are curious to know. :)
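For what it's worth, option 1 here boils down to something like the following, run from an admin host (hypothetical filer, device and path names; exact dump options and tape device naming vary by drive and ONTAP release):

    # level-0 dump of /home to a no-rewind drive attached to the filer itself
    # ("toaster", nrst0a and /home are made-up names)
    rsh toaster dump 0ufb nrst0a 63 /home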
Forgive my sarcasm in the above - you wouldn't believe the angst the word 'backup' causes around here at times. Plus I think I got out of the cynical side of the bed this morning.
Sm
On 22 Apr 1998, Mark D Simmons wrote:
-> Also, if I could ask a survey question of people (if this is not an acceptable
-> forum for this, just flame me ;) How do people out there use dump?
->
-> 1) rsh to local drive.
-
-Currently this is our only backup possibility. It's working fairly well for now,
-but as I've mentioned previously we're looking to seriously load our filers
-(before getting some more 8) and the backup times are looking daft! Especially
-as we do full backups every night, and will continue to do so until such time
-as we have a sterling installation of something like Veritas such that
-incrementals become manageable and reliable.
-
-> 2) rsh to remote drive.
-
-Why? Streaming across the main network is a bad idea (tm), and CIFS put paid to
-the utility of the technique anyhow.
Because if you have 8 filers, most (potentially all) with > 20GB of compressed data, putting an autochanger on each filer is prohibitively expensive. Going the other route and putting, say, a DLT7000 on each filer is also nuts. Cost for one of those around here (Toronto, Ontario, Canada) is something like $12500, plus the cost of a tape monkey to run around and change tapes daily.
BTW, my English parser keeps croaking over that CIFS statement... Are you referring to the loss of CIFS-specific file information by doing, say, tars over an NFS mount? If so, why? Doing a netapp-hosted dump pointing at a remote box with rmt and a tape drive preserves all CIFS-specific information, afaik.
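For concreteness, that sort of filer-hosted dump to a remote drive looks roughly like this (made-up host and device names; the tape host just needs rshd and a stock rmt binary):

    # the filer runs the dump and streams it to /dev/nrst0 on "tapehost" via rmt
    rsh toaster dump 0ufb tapehost:/dev/nrst0 63 /home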
-> 3) NDMP to local drive.
-
-This is what I want to move toward: tape drives attached to the filer (i.e.
-where the data lives) and control across the network to a Veritas/other control
-box, where the indexes get kept.
-
-> 4) NDMP to remote drive.
-
-Doh! Only when we've got our dedicated Gigabit Ethernet backup network in place,
-which current estimates place around the next ice age.
Why 1000Mbit Ethernet? I'm not sure what filer and tape hardware you have, but do you really think you need 128MByte/second of throughput??!?
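Rough numbers, for scale:

    1000 Mbit/s / 8  ~= 125 MByte/s   (Gigabit Ethernet; ~128 if you count 1024 Mbit)
     100 Mbit/s / 8  ~= 12.5 MByte/s  (100baseT, each direction at full duplex)
    measured dump rates below: ~0.7-0.8 MByte/s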
In my tests of local vs. remote dumps, from f210s and f230s to a DLT4000 (hosted on a 170MHz SPARCstation 5 when remote), I saw about a 15-20% increase in speed using a locally attached tape. Backup time for a 13GB, 1.2-million-inode dump was about 5.5h with a remote drive over 100t full duplex, versus about 4.5h with the same DLT4000 local to the NetApp. These were real-world, real-data tests.
That's about 688kByte/s up to 822kByte/s.
Restores were something like 6.5h for remote and 5h for local, though I can't remember exactly right now.
Running two jobs to two DLTs on the remote box earned exactly the same 688kByte/second, showing it to be a limitation of the f210s I was testing. A brief test of an f230 reduced both times by about 30 minutes.
Once I replaced the slow, cranky, shoeshining DLT with an Exabyte Mammoth, which has a much lower penalty for underruns, the backup time dropped to match the locally attached DLT. Now if NetApp will come around and start supporting Mammoths local to the f2xx series (last I heard, only the f630 was supported in this configuration with ONTAP 5.0), I can get local-attach times too.
I don't deny that having tape drives attached to each filer is robust and fast. Our plan here is to do frequent remote backups to five Exabyte Mammoth drives in three 20-tape autochangers, spread across three tape hosts and three separate server clusters on their own 100t switched networks. Backups are as fast as we can make them, and since we are using NetApp's dump, if we are in a crunch (and once NetApp begins supporting the Mammoth locally) we can do nice fast restores local to the NetApp. We have a spare external single-Mammoth unit reserved for just that emergency.
With this configuration, we hope to implement a scheme where we only have to change the tapes in the libraries once every two weeks. (We only received the tape units over the last 2 or 3 weeks and orchestrating all this with shell scripts is a pain).
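To give a flavour of what those scripts end up looking like, here is a minimal sketch (all host, device and path names are made up, and it assumes mtx for changer control plus rshd/rmt on the tape host):

    #!/bin/sh
    # Minimal sketch: level-0 dump of each filer to the next slot of a remote
    # autochanger.  Hypothetical host/device/path names throughout.
    TAPEHOST=tapehost            # box that owns the changer and the drive
    CHANGER=/dev/sg0             # changer device on the tape host (assumed)
    DRIVE=/dev/nrst0             # no-rewind tape device on the tape host (assumed)

    slot=1
    for filer in toaster1 toaster2 toaster3; do
        rsh $TAPEHOST mtx -f $CHANGER load $slot || exit 1
        # filer-hosted dump, streamed to the remote drive via rmt
        rsh $filer dump 0ufb ${TAPEHOST}:${DRIVE} 63 /home || echo "dump of $filer failed"
        rsh $TAPEHOST mt -f $DRIVE offline        # rewind and eject
        rsh $TAPEHOST mtx -f $CHANGER unload $slot
        slot=`expr $slot + 1`
    done

In real life you also want logging, retries, and a check that the right tape is actually loaded, which is where the pain comes in.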
The long and the short of all this is: 100t full duplex has way, way more bandwidth available than I think my backups will ever hope to consume. The only penalty comes from latency. Doing remote backups lets us assemble all the archiving equipment in a single room closer to the people who actually administer the backups (who reside a floor away and almost the width of the building from the server clusters), which improves the manageability and consistency of the backups.
-------------------------------------------------------------
Dave Cole (DC1110)                       | dacole@netcom.ca
Systems Administrator                    | dacole@vex.net
Netcom Canada                            | www.vex.net/~dacole/
905 King Street West, Toronto, M6K 3G9   | phone - 416.341.5801
Toronto, Ontario, Canada, Earth, Sol     | fax - 416.341.5725