Just to beat a dead horse here, I'd like to make some more comments on the multiplexing issue. I'm in favor of it. You can ignore the rest of this if you don't care about this issue and are tired of it.
John
Brian> Multiplexing data streams to tape always felt like a bad idea
Brian> to me, although I agree that it is probably the only feasible
Brian> way to keep a tape drive streaming if you have slow data
Brian> sources. It's just one more thing that could go wrong when you
Brian> most need your data. I think NetBackup has a cloning feature
Brian> that "de-multiplexes" the streams back into contiguous chunks
Brian> though. -- Brian Tao (BT300, taob@risc.org) "Though this be
Brian> madness, yet there is method in't"
The reason people use multiplexing is simple: a slow data stream from the client(s), feeding a tape drive that requires a certain rate of data flow to be efficient.
The options, with no commentary, are:
1. Don't stream. To hell with the drive, to hell with fast backups, just put it all in one chunk on the tape so I can get it back by myself if need be, provided the data format is something standard.
2. Stage the backups to a local disk first, since the disk doesn't care about data flow performance. Then spool from local disk to local tape and get faster speeds. Requires a lot of spare local disk. Can get data off tape if the format is something standard.
3. Back up multiple streams of data at once to a single tape drive, interleaving data as you go. Now you can't get your data back without the multiplexing software.
Veritas uses both options 1 and 3. If you want fast backups, you have to multiplex (option 3), otherwise throughput drops through the floor because you're using option 1. You have to pin the sales rep down a bit to get them to admit that options 1 & 3 do not go together in NetBackup.
Amanda uses option 2, which helps, but means you need to spend money on disk you can't use for anything else, and which sits around idle a lot of the time.
Legato uses option 3, which has worked for me for quite a few years now. There can be problems, and you can get screwed, but a bad tape will make a hash of any of these three options.
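To make option 3 concrete, here is a toy sketch of interleaving in Python. It is not the on-tape format of Legato, NetBackup, or any other product; the names `multiplex` and `demultiplex`, the client names, and the block contents are all made up for illustration. Each block is tagged with the stream it came from, which is exactly the bookkeeping you lose if you try to read the tape without the multiplexing software.

```python
from itertools import zip_longest


def multiplex(streams):
    """Interleave blocks from several streams round-robin.

    `streams` maps a (hypothetical) stream id to its list of blocks.
    Every block written to the "tape" is tagged with its stream id.
    """
    tape = []
    for row in zip_longest(*streams.values()):
        for stream_id, block in zip(streams.keys(), row):
            if block is not None:  # this stream has already run dry
                tape.append((stream_id, block))
    return tape


def demultiplex(tape, stream_id):
    """Pull one stream's blocks back out, in their original order."""
    return [block for sid, block in tape if sid == stream_id]


# Three slow clients of different sizes (invented data).
streams = {
    "clientA": [b"A1", b"A2", b"A3"],
    "clientB": [b"B1"],
    "clientC": [b"C1", b"C2"],
}

tape = multiplex(streams)
restored = demultiplex(tape, "clientA")
```

Note that restoring `clientA` means walking past every other client's blocks on the way, which is the restore-time cost people worry about.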
Now the big question is, how many people have had to actually grovel through the raw backup tapes created by the software above (or other software) to do a restore? Please don't include anything over three years old in this, since the explosion of data storage over the past three years has been phenomenal, and grovelling through a 1GB tape is nothing when compared to searching through a 50GB DLT 7000 backup.
My feeling is that it's just not done anymore. Mostly because if you depend on having your data around that badly, you will use other methods to ensure its safekeeping, such as:
- some form of RAID to protect against loss of hardware
- some form of snapshot backups to protect against accidental deletion. In this case, speed of restore is crucial for one or a few files, generally not a full filesystem.
- cloning and moving offsite of backups.
Another consideration is the backup window. If you only have 8 hours to do a backup, and you can't meet that deadline without using multiplexing, what are you going to do?
1. Buy more tape drives and/or libraries. Lots of bucks.
2. Interleave the data and let the software worry about pulling what you need off tape later.
Generally, option 2 here is cheaper and works quite well.
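A back-of-the-envelope version of the backup window argument, in Python. Every number here is an assumption picked for illustration (100 GB to back up, clients that each deliver 1 MB/s, a drive that streams at 5 MB/s), and `backup_hours` is a made-up helper, not anything from a real product.

```python
def backup_hours(total_gb, client_mb_s, drive_mb_s, streams):
    """Hours needed to write total_gb to one drive.

    The effective rate is the combined rate of the interleaved client
    streams, capped by what the drive can accept.
    """
    rate_mb_s = min(client_mb_s * streams, drive_mb_s)
    return total_gb * 1024 / rate_mb_s / 3600


WINDOW_HOURS = 8  # the backup window from the text

# Assumed figures: 100 GB total, 1 MB/s per client, 5 MB/s drive.
one_stream = backup_hours(100, 1.0, 5.0, streams=1)    # about 28.4 hours
five_streams = backup_hours(100, 1.0, 5.0, streams=5)  # about 5.7 hours
eight_streams = backup_hours(100, 1.0, 5.0, streams=8)  # drive-limited, same as five

fits_without = one_stream <= WINDOW_HOURS
fits_with = five_streams <= WINDOW_HOURS
```

This is optimistic for the single-stream case, since it ignores the time a starved drive wastes stopping and repositioning. Under these assumptions the job only fits the window once the streams are interleaved, and adding streams beyond what the drive can take buys nothing.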
Yet another consideration is how do you back up large databases efficiently? With regular filesystems which hold user directories, mail spools, etc, it doesn't matter as much if some of the data is backed up at 1am, and some at 2am, even if users are making changes. The interrelation of the files among themselves isn't that tight.
But with databases the equation changes: you need to lock down the entire database from change to make sure you get a consistent, clean backup. And since you want to minimize this window, multiplexing can be one way to maximize throughput from the system to tape. Or more tape drives, etc.
Online Indexing of saved files:
The next area of backup options is indexing. This is good if you want to browse your savesets and only restore the files you need, instead of the full saveset. Both Legato and NetBackup offer this, Amanda doesn't (well... it might be working now, but I haven't checked recently). The big advantage of indexing is that you can find what you want quickly and only restore what you need. No more need for a big disk to restore the full saveset to, then pull out what you need, then delete the excess.
Legato got a bad rep in their version 4.x software because their indexing scheme was less than robust. I never had a problem with getting data off good tapes, even if I had to scan the entire tape to rebuild the online indexes. And of course the *size* of these indexes is a pain as well. I wish they had just taken some known good backend DB and used that instead. But with Legato 5.x, it's gotten a lot more stable.
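For a rough idea of what an online index does, here is a minimal Python sketch. It is not how Legato or NetBackup store their indexes; the record layout (backup date, tape label, saveset id), the helper names `record` and `locate`, and all the sample values are invented for illustration.

```python
from collections import defaultdict

# Hypothetical index: file path -> list of (backup_date, tape_label, saveset_id).
# Dates are ISO strings so that plain string comparison sorts them correctly.
index = defaultdict(list)


def record(path, backup_date, tape_label, saveset_id):
    """Note that `path` was saved in the given saveset on the given tape."""
    index[path].append((backup_date, tape_label, saveset_id))


def locate(path, as_of):
    """Newest copy of `path` backed up on or before `as_of`, or None."""
    candidates = [entry for entry in index.get(path, []) if entry[0] <= as_of]
    return max(candidates, default=None)


# Invented sample entries.
record("/home/john/notes.txt", "2000-01-03", "TAPE01", 17)
record("/home/john/notes.txt", "2000-01-10", "TAPE04", 3)

hit = locate("/home/john/notes.txt", "2000-01-05")
miss = locate("/home/john/notes.txt", "1999-12-01")
```

With one entry per file per backup, it is easy to see why the size of the index becomes a pain, and why losing it means scanning tapes to rebuild it.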
Disaster recovery:
If you have a disaster and need to restore a lot of data, you have to think of how bad a disaster it is.
- the entire site went up in flames. Unless you have the hardware and procedures in place, it's not going to make much difference whether you need to rebuild the backup server first to restore the data, or if you have to wander through the tapes by hand looking for the data you need. In either case, it's not going to be a simple thing.
- If a major server goes down, but the backup server is fine, what's the problem? You're going to have to restore a lot of data and that takes time. Multiplexing might slow down the restore if the data is spread across the tape in lots of chunks, but I don't think having each saveset in a single section of the tape would make that big a difference.
Again, if it's that important that you stay up, why aren't you using RAID in the first place?
- The Legato disaster recovery procedure works just fine. You get 15-day enabler codes right out of the box, so you don't have to worry about licenses in the short term. This gives you the time to get back up before having to deal with the backup system licensing.
Conclusion:
1. I've written way too much here. :]
2. I like multiplexing.
3. Next dead horse beater, step right up!