On 15 Feb 2000, Darrell Fuhriman wrote:
I can only realistically expect a peak of 8 to 10 MB/sec from our filers (for some of them, there are only "busy" hours and "really busy" hours).
Is that from experience or theory?
This is from experience, from a production F740 to an idle Ultra2 with Fast Ethernet connecting the two. An idle F740 to an idle Ultra2 gets me about 15 MB/s of aggregate throughput if I have two 100 Mbps connections. I can't seem to make the filer go any faster than that (one shelf of 18GB drives). It's the same whether I'm running a dump over rsh to /dev/null, or copying large files over NFS.
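For what it's worth, the tests were along these lines (filer name, volume, and file names are placeholders, and the exact dump syntax is from memory):

    # dump a volume over rsh and throw the stream away, timing the run by hand
    time rsh filer1 dump 0f - /vol/vol0 > /dev/null

    # NFS variant: read a large file off a mounted volume and divide
    # its size by the elapsed time
    time dd if=/mnt/filer1/bigfile of=/dev/null bs=1024k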
Does M2 have the streaming issues of DLT? You'll certainly never push that much data into a DLT7000 or AIT. ~10 MB/sec with bursts to 12 is the best you'll get. I think M2 is probably too new to know how it really behaves in the real world.
No idea yet, but I asked about that in a new thread. You are right about the M2 track record (in that it doesn't really have one yet). Our current Mammoth drives perform to spec though, so I'm crossing my fingers that Exabyte is equally accurate with their Mammoth2 specs.
If you're doing that, it would seem to me it would make more sense to use a second (perhaps in a cluster) toaster and mirror the volumes using snapmirror.
I considered that, but the additional cost of a filer big enough to handle our projected storage requirements plus SnapMirror licenses for all the Netapps was prohibitive. Attaching a library directly to a filer also limits the choice of tape hardware I can use, and I'm pretty much stuck with DOT's tape dump format.
It's not all that fast (somewhere I have some numbers I ran). If you're syncing lots of smaller files, you'll burn through memory like mad, too.
The tests I did run were against filers with millions of little files (mail store with Maildir-style mailboxes) and a rather deep directory structure. I've found that the Sun Ultra 2 running rsync ran out of juice long before the filer did. The worst case was rsyncing a fresh filesystem, which would have been better accomplished with a straight dump|restore anyway.
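Concretely, the incremental passes were plain rsync between mounted filesystems, roughly like the first command below (paths are placeholders); the fresh-filesystem case is where a single sequential dump|restore would have done better, assuming the restore end can read the filer's dump stream:

    # incremental pass: source volume NFS-mounted, destination local
    rsync -a --delete /mnt/filer1/mail/ /spool/filer1/mail/

    # initial seeding of an empty destination: one sequential pass
    # instead of rsync walking millions of small files
    # (restore is ufsrestore on Solaris; assumes it can read the filer's dump)
    rsh filer1 dump 0f - /vol/mail | ( cd /spool/filer1/mail && restore rf - )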
I thought they were already quite busy. If they're so busy that you don't feel you can attach a local tape drive, I'm not really understanding how they're unbusy enough to hammer them with a bunch of rsyncs.
I don't want to have a local tape drive on each Netapp regardless. I prefer to have a smaller number of bigger libraries that have a certain flexibility in the number of filers they can accommodate. I did not find that rsync "hammered" the filer more than doing a dump over the network. The overall throughput of dump is between 0% and ~20% faster than rsync, once you're actually into the phase where data is being copied. However, the throughput I'm seeing in either case is not enough to keep an M2 streaming (let alone eight of them). Given that I'd rather not interleave multiple backup streams to one drive, my alternative is to spool to disk first, and then back that up in contiguous chunks to tape.
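In sketch form, the two stages would be something like this (spool path, tape device, and blocking factor are placeholders, and the stage-two archiver is still an open question):

    # stage 1: replicate each filer volume onto local spool disk
    rsync -a --delete /mnt/filer1/vol0/ /spool/filer1/vol0/

    # stage 2: later, stream the spool area to the M2 in one contiguous
    # run, with a large blocking factor to help keep the drive streaming
    tar cbf 256 /dev/rmt/0n /spool/filer1/vol0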
Frankly, I think you've made the solution much more complicated than it needs to be. Backups should be done as simply as possible -- you've added lots of opportunities for things to break that really don't need to be there.
I don't see it that way... I've simply inserted a large buffer between the Netapps and the tape drives. As long as the filesystem replication stage doesn't collide with the tape backup stage, I should be in the clear. The ability to quickly recover files and filesystems from fast, random-access media rather than slogging through sequential-access tapes really appeals to me too.
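Keeping the two stages apart is mostly a scheduling problem; a crude lock wrapper run from cron would do, something like this (lock path and stage script names are made up):

    #!/bin/sh
    # run a backup stage only if no other stage is currently active
    LOCK=/var/run/backup.lock

    if [ -f "$LOCK" ]; then
        echo "another backup stage is still running, skipping" >&2
        exit 1
    fi

    touch "$LOCK"
    trap 'rm -f "$LOCK"' 0
    "$@"    # e.g. spool-from-filers or spool-to-tape

(There's an obvious race between the test and the touch, but for a couple of cron jobs a day it's close enough.)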
Anyone else doing it like this?
I suspect not.
Well, I suppose someone has to go first... ;-)