This is my plan, after having debated the merits of distributed tape libraries on each filer vs. a centralized tape library with network backup. I've posted separately to both the toasters and bigbackup mailing lists (even though I figure most people on the second list are also on the first).
- backup clients are 12 filers (mostly F740s), each with multiple 100 Mbps Ethernet interfaces
- backup servers are two Sun E420Rs with enough CPU, memory, U2SCSI and Gigabit interfaces to keep things humming
- each filer has one or two 100 Mbps interfaces plugged into a switch, with the backup servers on Gigabit (probably something like a Catalyst 3524XL: 24 10/100 ports plus 2 Gigabit)
- each backup server will have four U2SCSI channels or two FC-AL loops, initially with half a terabyte of local disk and an Exabyte X80 library with 4 Mammoth2 drives (expandable to 8)
- stage 1 backup: filesystems on all the filers will be replicated to the tape servers' local drives (probably rsync over NFS)
- stage 2 backup: local filesystems are streamed to tape
This seems to work around most of the "problems" associated with backing up directly to tape, with a few extra side benefits thrown in. I can only realistically expect a peak of 8 to 10 MB/sec from our filers (for some of them, there are only "busy" hours and "really busy" hours). That's not enough to keep the tape drives streaming and happy. To do that, I'd have to multiplex backup streams to a single tape, and I always thought that was a bad idea.
Hard drives, of course, have no "streaming" issues. They'll take the data however fast or slow the Netapps can send it. Once the Netapp filesystems have been replicated to local disk, you blast them out to tape. With compression turned on, I figure I'll need about 20 MB/sec per tape drive to keep them chugging along. Less shoeshining, less wear and tear on the media, longer tape drive MTBF.
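For instance, stage 2 could be as simple as something along these lines on the backup server (the staging directory name and the Solaris tape device are just placeholders, and the blocking would need tuning):

# hypothetical stage 2: stream a staged filer copy from local disk to tape
# /backup/filer01 is a made-up staging directory; /dev/rmt/0cbn is the usual
# Solaris compressed, no-rewind tape device name
cd /backup/filer01
tar cbf 256 - . | dd of=/dev/rmt/0cbn obs=128k

Local disk can feed that pipeline far faster than 20 MB/sec, so the drive should stay streaming.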
Since all the filer filesystems are consolidated on local storage, you can slice-n-dice your backup sets to fit whatever drive/tape/time constraints you may have. This also gives you a nearline copy of all your data. Combined with the Netapp's snapshots, I should never ever have to go to tape to retrieve a current generation copy of a file that was accidentally deleted or corrupted. Disaster recovery of a downed filesystem can also come off local disk instead of tape.
If you use commercial tape backup software, you don't have to worry about buying and maintaining licenses for all the Netapps: all the software sees is one server backing up its own drives to a tape stacker. This may result in savings greater than the cost of the local drive storage.
I haven't had an opportunity to really test how fast rsync works over NFS with the particular hardware setup described above, so that's the weak link. If the results from trial runs on a non-dedicated Ultra 2 can be scaled up to a quad-CPU E420R, I don't think there will be a problem. Multiple rsyncs can be fired up concurrently to keep the filers busy. For the amount of data we have (300 GB at present), I expect the tape drives will only be busy for about an hour doing a weekly full backup, and only a few minutes each day for differentials.
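Roughly, stage 1 would look something like this on the backup server (the filer names, mount points and destination paths are placeholders, not the final layout):

# hypothetical stage 1: mirror each filer's volumes onto local disk over NFS,
# several filers in parallel
for filer in filer01 filer02 filer03
do
    mkdir -p /mnt/$filer /backup/$filer
    mount -F nfs -o ro $filer:/vol/vol0 /mnt/$filer
    rsync -a --delete /mnt/$filer/ /backup/$filer/ &
done
wait

The idea is that each rsync gets its own 100 Mbps pipe from its filer, and the backup server's Gigabit interface and SCSI channels should have headroom to run several of them at once.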
Anyone else doing it like this?
----- Original Message -----
From: Brian Tao <taob@risc.org>
To: toasters@mathworks.com
Sent: Tuesday, February 15, 2000 10:50 PM
Subject: Centralized backup of multiple filers
This is my plan, after having debated the merits of distributed tape libraries on each filer vs. a centralized tape library with network backup.
Have you considered Netapp's recently announced ability to share backup devices between filers through an FC SAN using Legato and a Vixel switch? But I guess this doesn't address your problem, which seems to be streaming speed to the tape to minimize backup times. Personally I think you're spending a lot of money just to make backups faster, and I'm not even sure how much time that saves, since you have to spend time to do the rsync over the net first, and then do the local backup from the UNIX box.
Bruce
Do any of the available backup solutions support a "dump to disk" mode like Amanda? Amanda uses the disk on the local backup machine as the cache: after the backup to disk completes, Amanda dumps that file to tape. This takes care of the streaming issues. This would, of course, be a problem with large NetApp filesystems. But imagine if the packages (Veritas, Legato, Workstation Solutions, etc.) took the incoming data stream and wrote it to disk as configurable chunks (100 MB, 1 GB, etc.). They could then flush those chunks to tape.
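Something along these lines, purely as a sketch of the idea (the filer name, staging path and tape device are made up):

# hypothetical chunked dump-to-disk: stage the incoming dump as 1 GB chunks
# on local disk, then flush the chunks to tape afterwards (a real package
# would overlap the staging and the flushing)
cd /stage/netapp1
rsh netapp1 dump 0f - /vol/vol0 | split -b 1024m - chunk.
for f in chunk.*
do
    dd if=$f of=/dev/rmt/0cbn obs=128k && rm $f
done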
barry
We're doing something similar to this. We are backing up to an ADSM server with a HSM filesystem. This filesystem is stored on a tape robot with a disk cache. As files are created on disk they are migrated to the tape robot and replaced on the disk with a "stub" file. To the user it just looks like a very large (but slow) filesystem. If you try to access a file that is migrated to tape, your process suspends while the robot stages the file back to disk.
So we are running our dumps from the ADSM server more or less like this:
cd /dumps/netapp/`date +%Y.%m.%d`
rsh netapp dump 0uf - /vol/vol0 | split -b 1024m - dump.
split breaks the dump into 1G files named dump.aa dump.ab ... and as the disk cache fills, the HSM system migrates these files to the tape robot. We picked 1G because it seems like a safe and manageable file size.
To restore, you just do this:
cat dump.?? | rsh netapp restore -r -f -
Admittedly, it's clunky to restore individual files, but we use snapshots for that. These dumps are primarily for disaster recovery.
We do full dumps every month, level 3 dumps every week, and level 5 dumps every day. To recover space on the tape robot, we simply rm old dumps from the filesystem. A nightly cron script on the ADSM server handles everything.
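In outline, the nightly script does something like this (just a sketch rather than the actual script; the 90-day retention and the calendar checks are approximations of the schedule described above):

#!/bin/sh
# hypothetical nightly dump driver run from cron on the ADSM server
DUMPDIR=/dumps/netapp/`date +%Y.%m.%d`

# level 0 on the 1st of the month, level 3 on Sundays, level 5 otherwise
if [ "`date +%d`" = "01" ]; then
    LEVEL=0
elif [ "`date +%a`" = "Sun" ]; then
    LEVEL=3
else
    LEVEL=5
fi

mkdir -p $DUMPDIR && cd $DUMPDIR || exit 1

# stream the dump over rsh and break it into 1 GB pieces so the HSM
# filesystem can migrate manageable files to the tape robot
rsh netapp dump ${LEVEL}uf - /vol/vol0 | split -b 1024m - dump.

# recover space by removing dump directories older than about 90 days
find /dumps/netapp/* -type d -prune -mtime +90 -exec rm -rf {} \;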
Steve Losen scl@virginia.edu phone: 804-924-0640
University of Virginia ITC Unix Support
ADSM does that and has for years.
I don't know why NetApp and IBM haven't gotten together (or how much they have discussed it) to deliver a native ADSM client that can run inside a NetApp filer. For all that some people complain about ADSM, it just works. It makes excellent use of a tape pool and doesn't require directly-attached tape.
A native client that runs under ONTAP would avoid the overhead of NFS.
Barry Lustig wrote:
Do any of the available backup solutions support a "dump to disk" mode like Amanda?
On Wed, 16 Feb 2000, Bruce Sterling Woodcock wrote:
Have you considered Netapp's recently announced ability to share backup devices between filers through an FC SAN using Legato and a Vixel switch?
Yes, but folks on my team have had suboptimal experiences with Legato software and Vixel hardware in the past, so I'm not too excited about it.
Personally I think you're spending a lot of money just to make backups faster, and I'm not even sure how much time that saves, since you have to spend time to do the rsync over the net first, and then do the local backup from the UNIX box.
My thinking is that with slow clients, throughput to a disk device will be faster than to a tape device, because of tape start/stop overhead. Therefore, the backup window (as far as the Netapps are concerned) is shortened. Data from local disk can then be sent to tape outside of the normal backup window. You do end up with two useful copies of your data, so the time spent moving data around isn't wasted.
----- Original Message -----
From: Brian Tao <taob@risc.org>
To: Bruce Sterling Woodcock <sirbruce@ix.netcom.com>
Cc: toasters@mathworks.com
Sent: Wednesday, February 16, 2000 5:07 PM
Subject: Re: Centralized backup of multiple filers
Yes, but folks on my team have had suboptimal experiences with Legato software and Vixel hardware in the past, so I'm not too excited about it.
That's too bad. I'm hoping it catches on. Maybe Netapp needs to support more than just Vixel?
My thinking is that with slow clients, throughput to a disk device will be faster than to a tape device, because of tape start/stop overhead. Therefore, the backup window (as far as the Netapps are concerned) is shortened.
I would agree that individual filers could be available sooner, sure. Personally I don't think the load hit is so great that your backup window has to be very small, but you say you run them pretty hot, so YMMV.
Data from local disk can then be sent to tape outside of the normal backup window. You do end up with two useful copies of your data, so the time spent moving data around isn't wasted.
Right; my point, though, was that once you add this time into the equation, the overall time until the data is "safe" on tape may not be much less than backing up from the filer directly.
Bruce