Ben> I'm looking into various methods for tape backup of our filers.
Actually, you're looking for ways to restore your filers from a backup in a short amount of time. Now, the question is how do you get that data into a format which can be restored quickly? And at no (or little) extra budget I bet. :]
Ben> Due to the large number of files a file level backup would be
Ben> significantly slower than a block-level backup method, however
Ben> I'm not sure if one exists.
We currently do backups over NFS; we have a couple of file systems with 10-15 million files, mostly small ones. And yes, they are poorly laid out and configured, but I can't change them either.
Ben> I was hoping someone could clarify.
Me too! It's a perennial issue, and in some ways not one that the major vendors respond to adequately.
Ben> NDMP utilizes dump, which is file-based so that doesn't solve the
Ben> performance issues, and from what I can tell SnapVault is only
Ben> for disk-to-disk backup, not tape.
There is a tool I just found out about which lets you snapmirror/snapvault to a file, which can then be written to tape. It would be nicer if it could write to tape from the start, but that's just a detail.
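Just to make the two-step idea concrete, here's a rough sketch: land the replication stream in a staging file, then push that file out to a tape device. This is only an illustration; the "filer_dump_stream" command, the staging path, and the tape device are all placeholders, not the actual tool.

    #!/usr/bin/env python
    # Sketch only: capture the snapmirror/snapvault stream to a staging
    # file, then stream that file to tape. The command name, staging
    # path, and tape device below are assumptions for illustration.
    import shutil
    import subprocess

    STAGING_FILE = "/backup/stage/vol0.smstream"   # assumed staging area
    TAPE_DEVICE  = "/dev/nst0"                     # assumed no-rewind tape drive

    # Step 1: capture the replication stream to a file (placeholder command).
    subprocess.check_call(["filer_dump_stream", "filer1:/vol/vol0", STAGING_FILE])

    # Step 2: push the staging file to tape in large blocks.
    with open(STAGING_FILE, "rb") as src, open(TAPE_DEVICE, "wb") as tape:
        shutil.copyfileobj(src, tape, length=1024 * 1024)  # 1 MB blocks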
Anyway, the question I have for you is how do you restore your data? Are you looking for quick restores of random files? Are you looking for quick restores of large chunks of data?
Do you have data split across volumes/qtrees, or is it more monolithic? I've found great performance gains by doing my backups against multiple qtrees at a time. This does require that you try to get the users to split up their data a bit more, and help out with the management if at all possible.
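To show what I mean by running several qtrees at once, here's a minimal sketch. It assumes the qtrees are NFS-mounted under /mnt/filer1 and that a plain tar stream per qtree to a staging disk is acceptable; the paths and stream count are made up.

    #!/usr/bin/env python
    # Minimal sketch: back up several NFS-mounted qtrees in parallel.
    # Mount points, staging directory, and stream count are assumptions.
    import os
    import subprocess
    from concurrent.futures import ThreadPoolExecutor

    QTREES  = ["/mnt/filer1/proj_a", "/mnt/filer1/proj_b", "/mnt/filer1/proj_c"]
    STAGING = "/backup/stage"
    STREAMS = 3  # how many qtrees to walk at once

    def backup_qtree(path):
        name = os.path.basename(path)
        archive = os.path.join(STAGING, name + ".tar.gz")
        # One tar stream per qtree; each walks its own piece of the namespace.
        subprocess.check_call(["tar", "czf", archive, "-C", path, "."])
        return archive

    with ThreadPoolExecutor(max_workers=STREAMS) as pool:
        for archive in pool.map(backup_qtree, QTREES):
            print("finished", archive)

The win comes from walking several pieces of the namespace at once instead of one giant serial tree walk.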
Ben> Even if I direct connect a tape robot to the filer I'd still be
Ben> using dump and the throughput isn't being limited by the network
Ben> capacity, so that doesn't help either. I'm hoping there is a
Ben> block-level method I'm unaware of. NDMP in my tests has been dog
Ben> slow and standard backups via NFS mount don't sound appealing.
Can you give more details about your problem space? Filer, disks, network, file system size and number of files, etc?
Also try doing NDMP backups in parallel, either at the volume or qtree level. Legato (EMC) Networker can now do NDMP streams, with indexes of files, to disk volumes. This means you can run multiple parallel streams at the same time. Once they are done, you could then clone/stage them to tape either sequentially, or in parallel.
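Here's a sketch of the staging idea itself, not of Networker: run the streams to a disk pool in parallel, then copy the finished save sets to tape one after another. The "run_ndmp_stream" command, the volume list, and the device paths are placeholders I made up for the example.

    #!/usr/bin/env python
    # Sketch of disk staging: run backup streams to disk in parallel,
    # then clone the finished save sets to tape sequentially.
    # "run_ndmp_stream" and the paths/device are placeholders, not
    # real Networker commands.
    import shutil
    import subprocess
    from concurrent.futures import ThreadPoolExecutor

    VOLUMES     = ["/vol/users1", "/vol/users2", "/vol/projects"]
    STAGING_DIR = "/backup/stage"
    TAPE_DEVICE = "/dev/nst0"

    def stream_to_disk(volume):
        out = STAGING_DIR + volume.replace("/", "_") + ".ndmp"
        subprocess.check_call(["run_ndmp_stream", volume, out])  # placeholder
        return out

    # Phase 1: parallel streams to the staging disk.
    with ThreadPoolExecutor(max_workers=len(VOLUMES)) as pool:
        savesets = list(pool.map(stream_to_disk, VOLUMES))

    # Phase 2: clone/stage the save sets to tape one at a time.
    with open(TAPE_DEVICE, "wb") as tape:
        for saveset in savesets:
            with open(saveset, "rb") as src:
                shutil.copyfileobj(src, tape, length=1024 * 1024)

The nice part is that the tape drive only ever sees big sequential writes from the staging disk, so it can stream at full speed.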
I'm starting to really think that disk-to-disk-to-tape is the way to go, but the problem is that people see all that extra disk space sitting around and they try to fill it up. :-) Or they ask why they can't just have a hundred gigs or so for this new project; we'll clean up when we're done, honest!
John
John Stoffel - Senior Staff Systems Administrator - System LSI Group
Toshiba America Electronic Components, Inc. - http://www.toshiba.com/taec
john.stoffel@taec.toshiba.com - 508-486-1087