I'm just a NetApp engineer, so please take what I've got
to say on this subject with the appropriate amount of
salt.
The problem with comparing storage metaphors by analytic
methods is that storage systems are so complex. The disk
drive hardware and firmware, fabric protocols, networking
stacks, OS firmware, disk drivers, block access protocols,
file access protocols and file sharing protocols all interact
in subtle and tricky ways. Discussions about backplane or
fabric speed, I/O bandwidths, seek times and so on never
seem able to take these interactions into account.
Analytic discussions about the systems therefore fail in
many of the same ways that benchmarks fail. They retain
their allure for many people in spite of this.
There's just no substitute, though, for measuring actual end-to-end
performance in real-world situations. That's not a particularly fun
realization, because setting up those real-world situations and taking
the measurements is expensive and a lot of work.
It's worth it, though. Almost everyone I know who's gone to the
trouble agrees that the results are interesting enough--and different
enough from what even sophisticated analysis predicts--to justify the
effort.
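
For what it's worth, getting started doesn't have to be elaborate.
Here's a minimal sketch of what I mean by an end-to-end measurement:
time a big sequential read through the entire stack and see what
actually comes out the other end. (The block size is arbitrary and the
file is whatever you point it at -- this is an illustration, not a
test plan.)

/* e2e_read.c -- crude end-to-end sequential read timing.
 * usage: e2e_read <large-file>
 * Run it against a file on local disk and against a same-sized file
 * on an NFS mount, and compare.  Illustrative sketch only.
 */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/time.h>

int main(int argc, char **argv)
{
    char buf[65536];
    struct timeval t0, t1;
    double bytes = 0.0, secs;
    ssize_t n;
    int fd;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <large-file>\n", argv[0]);
        return 1;
    }
    if ((fd = open(argv[1], O_RDONLY)) < 0) {
        perror("open");
        return 1;
    }

    gettimeofday(&t0, NULL);
    while ((n = read(fd, buf, sizeof buf)) > 0)  /* exercises the whole path */
        bytes += n;
    gettimeofday(&t1, NULL);
    close(fd);

    secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
    printf("%.1f MB in %.2f sec = %.1f MB/sec\n",
           bytes / 1e6, secs, bytes / (1e6 * secs));
    return 0;
}
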
Alan
===============================================================
Alan G. Yoder, Ph.D. agy@netapp.com
Network Appliance, Inc.
Sunnyvale, CA 408-822-6919
===============================================================
> -----Original Message-----
> From: Brian Tao [mailto:taob@risc.org]
> Sent: Thursday, August 10, 2000 11:33 AM
> To: Pesce, Nicholas (FUSA)
> Cc: keith@netapp.com; toasters@mathworks.com
> Subject: RE: Filer storage for databases, seriously? (Was: Re: NetApp
> questions)
>
>
> On Wed, 9 Aug 2000, Pesce, Nicholas (FUSA) wrote:
> >
> > I'm sorry, Keith, but I've heard this argument before. NFS versus
> > direct-attach storage? I'm going to have to vote for a good
> > direct-attach solution. Why?
> >
> > NFS and CIFS have huge overhead.
>
> Sure, as someone else mentioned, all other things being equal,
> direct-attach storage is faster than network-attached storage. The
> logic (which I've argued before as well) is simple: a NAS box talks
> to its drives as DAS. Thus, the DAS must necessarily be "faster"
> (yeah, a vague term). For example, setting caching aside, it is not
> possible for a filer to pump 50MB/s of data to an NFS client if it can
> only read 30MB/s off its own drives.
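>
> A back-of-the-envelope way to see that bound: uncached, sustained
> throughput can't exceed the slowest stage in the chain, roughly
>
>     end-to-end MB/s  <=  min(disk MB/s, wire MB/s, client MB/s)
>
> so with the made-up numbers above, 30MB/s off the spindles caps you
> at 30MB/s on the wire, no matter how fast the Ethernet or how lean
> the protocol.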
>
> However, all things are not equal, at least not in benchmarks that
> are possible in the real world. Yes, NFS and CIFS add overhead compared
> to SCSI-over-FibreChannel or what have you. However, that is offset
> by an optimized OS (Data ONTAP), an efficient filesystem (WAFL),
> read and write caching, an optimized TCP/IP stack, etc. If you
> could port all of that to run against DAS, then you might have a fair
> comparison.
>
> > I think I would like to see a test where the disk sizes and numbers
> > were similar; I sincerely doubt the Netapp would do as well.
>
> Depends on the application, of course, but I've been surprised
> many times in the past when I thought for sure the Netapp would not be
> able to keep up. I have a 4x450-MHz E420R with a VxVM RAID-0 device,
> spread over sixteen 50GB 7200-rpm drives on two U2SCSI buses. The server
> also has a Gigabit Ethernet connection to an F740 with one shelf of
> 36GB 10000 rpm drives (5 data, 1 parity, 1 spare). The local
> filesystem is vxfs, mounted with delaylog and the largest allowable
> log area.
>
> I ran a few filesystem replication and backup/restore tests (this
> is our central tape server). The local filesystem handily beat the
> Netapp doing large sequential reads and writes (120MB/sec vs.
> 22MB/sec)... no surprise there. File deletions were a little closer
> (~2500 unlinks/sec on vxfs, ~2000 unlinks/sec on the Netapp). In all
> other tests, the Netapp was as fast or faster (sometimes by a large
> margin) than the local filesystem. The Netapp seems to especially shine
> when you have multiple processes reading and writing to all points on
> the filesystem; vxfs does not appear to handle dozens or hundreds of
> concurrent access requests as gracefully.
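>
> (A loop along these lines is one way to measure that kind of deletion
> rate -- the directory and file count below are made-up values, not
> our actual test; point it at local vxfs and at an NFS mount in turn
> and compare the numbers.)
>
> /* unlink_bench.c -- rough unlinks-per-second test (illustrative only).
>  * TESTDIR and COUNT are placeholders, not the actual test rig.
>  */
> #include <stdio.h>
> #include <fcntl.h>
> #include <unistd.h>
> #include <sys/time.h>
>
> #define TESTDIR "/mnt/test"
> #define COUNT   10000
>
> int main(void)
> {
>     char path[256];
>     struct timeval t0, t1;
>     double secs;
>     int i, fd;
>
>     /* setup (not timed): create COUNT empty files */
>     for (i = 0; i < COUNT; i++) {
>         sprintf(path, "%s/f%d", TESTDIR, i);
>         if ((fd = open(path, O_CREAT | O_WRONLY, 0644)) >= 0)
>             close(fd);
>     }
>
>     /* timed: one synchronous unlink() per file */
>     gettimeofday(&t0, NULL);
>     for (i = 0; i < COUNT; i++) {
>         sprintf(path, "%s/f%d", TESTDIR, i);
>         unlink(path);
>     }
>     gettimeofday(&t1, NULL);
>
>     secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
>     printf("%d unlinks in %.2f sec = %.0f unlinks/sec\n",
>            COUNT, secs, COUNT / secs);
>     return 0;
> }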
>
> I re-ran some of the same tests with a Veritas RAID-5 volume (to
> be fair to the Netapp), but I stopped after the first couple. There
> is no contest at that point. Veritas software RAID-5 is dog-slow (I
> think I saw bursts of 8MB/sec sequential writes). Turn on a Veritas
> snapshot, and writes to the snapped filesystem go even further into
> the toilet. The performance degradation is cumulative with the number
> of snapshots. There is no such penalty on the Netapp.
>
> One caveat I should mention, since it bit us in the past: file
> locking performance. We have one application that, when running on
> the same type of hardware as above (E420R with those drives), spews
> forth 150,000 syscalls per second, according to Solaris' "vmstat".
> 80% of those calls are fcntl() locks/unlocks to various database files
> on disk. Poor programming practice aside, this application runs very
> slowly over NFS; NFS file locking simply cannot match the in-kernel
> locking you get with a local filesystem. Besides that one exceptional
> application, we run Netapps for everything else (including Oracle).
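>
> (To see the locking effect for yourself, a loop like the sketch below
> is enough -- the path and iteration count are made-up values, not our
> application. On a local filesystem each F_SETLK/F_UNLCK pair is a
> cheap in-kernel operation; over NFS each one goes out to the network
> lock manager as an RPC round trip, and that's where the time goes.)
>
> /* lock_loop.c -- hammer fcntl() byte-range locks in a tight loop.
>  * LOCKFILE and ITERS are placeholders; run it against local vxfs and
>  * against an NFS mount of the filer and compare the rates.
>  */
> #include <stdio.h>
> #include <fcntl.h>
> #include <unistd.h>
> #include <sys/time.h>
>
> #define LOCKFILE "/tmp/lockfile"
> #define ITERS    100000
>
> int main(void)
> {
>     struct flock fl;
>     struct timeval t0, t1;
>     double secs;
>     int i, fd;
>
>     if ((fd = open(LOCKFILE, O_RDWR | O_CREAT, 0644)) < 0) {
>         perror("open");
>         return 1;
>     }
>
>     fl.l_whence = SEEK_SET;   /* lock just the first byte */
>     fl.l_start  = 0;
>     fl.l_len    = 1;
>
>     gettimeofday(&t0, NULL);
>     for (i = 0; i < ITERS; i++) {
>         fl.l_type = F_WRLCK;
>         fcntl(fd, F_SETLKW, &fl);   /* local: in-kernel; NFS: lockd RPC */
>         fl.l_type = F_UNLCK;
>         fcntl(fd, F_SETLK, &fl);
>     }
>     gettimeofday(&t1, NULL);
>     close(fd);
>
>     secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
>     printf("%d lock/unlock pairs in %.2f sec = %.0f pairs/sec\n",
>            ITERS, secs, ITERS / secs);
>     return 0;
> }
>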
> --
> Brian Tao (BT300, taob@risc.org)
> "Though this be madness, yet there is method in't"
>