RE: Filer storage for databases, seriously? (Was: Re: NetApp questions)

10 Aug 2000

      You're right about one thing... all things are not equal.  You 
raised some good points along the way... see below for my 
annotations.
...
-----Original Message-----
From: Brian Tao [mailto:taob@risc.org]
Sent: Thursday, August 10, 2000 2:33 PM
To: Pesce, Nicholas (FUSA)
Cc: keith@netapp.com; toasters@mathworks.com
Subject: RE: Filer storage for databases, seriously? (Was: Re: NetApp questions)
On Wed, 9 Aug 2000, Pesce, Nicholas (FUSA) wrote:
...
I'm sorry Keith.  But I've heard this argument before.  NFS versus
Direct attach storage? I'm going to have to vote for a good direct
attach solution. Why?
NFS and CIFS have huge overhead.
Sure, as someone else mentioned, all other things being equal,
direct-attach storage is faster than network-attached storage.  The
logic (which I've argued before as well) is simple: a NAS box talks
to its drives as DAS.  Thus, the DAS must necessarily be "faster"
(yeah, a vague term).  For example, setting caching aside, it is not
possible for a filer to pump 50MB/s of data to an NFS client if it can
only read 30MB/s off its own drives.
One thing to remember with regard to caching... when you request a block
on disk, your local OS (with direct attach storage) will read that block
and read ahead several more blocks.  Those read-ahead blocks will be 
transferred via your fibre channel, even if they aren't used by your 
application.  The filer on the other hand will do the same read-ahead,
but it will only transmit the block that was requested.  If the application
asks for the next block, then both have that block in memory.  Now,
in a database environment where blocks are very randomly placed within
a filesystem, this read-ahead via direct attach storage is going to kill the
efficiency of the bandwidth to your storage, and that's your OS's fault.  
Whereas with a filer, you can turn off read-ahead within the filer to minimize
the work it does.
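
To put rough numbers on the read-ahead argument above, here is a minimal sketch. It is a toy model, not a measurement: the function name, the read-ahead depth of 7 blocks, and the hit rates are all assumptions chosen for illustration, and it ignores caching entirely.

```python
def useful_fraction(readahead_blocks: int, hit_rate: float) -> float:
    """Fraction of blocks moved over the storage link that the application uses.

    Each request moves the 1 requested block plus `readahead_blocks`
    prefetched blocks. `hit_rate` is the share of prefetched blocks the
    application later asks for (near 0.0 for random database I/O,
    near 1.0 for a sequential scan). Both values are hypothetical inputs.
    """
    moved = 1 + readahead_blocks
    used = 1 + readahead_blocks * hit_rate
    return used / moved


# Hypothetical cases: 7 blocks of read-ahead per request.
print(useful_fraction(7, 0.0))  # random access, no prefetched block is used
print(useful_fraction(7, 1.0))  # sequential access, every prefetched block is used
print(useful_fraction(0, 0.0))  # read-ahead off, only requested blocks move
```

Under these assumptions, purely random access with 7 blocks of read-ahead leaves only one eighth of the transferred blocks useful, which is the inefficiency described for direct attach storage.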
...
However, all things are not equal, at least in benchmarks that are
possible in the real world.  Yes, NFS and CIFS add overhead compared
to SCSI-over-FibreChannel or what have you.  However, that is offset
by an optimized OS (Data ONTAP), by an efficient filesystem (WAFL), by
read and write caching, by an optimized TCP/IP stack, etc.  If you
could port all that and run it on DAS, then you might have a fair
comparison.
...
I think I would like to see a test where the disk sizes and number
were similar.  I sincerely doubt the Netapp would do as well.
Depends on the application, of course, but I've been surprised
many times in the past when I thought for sure the Netapp would not be
able to keep up.  I have a 4x450-MHz E420R with a VxVM RAID-0 device,
spread over 16 50GB 7200 rpm drives on two U2SCSI buses.  The server
also has a Gigabit Ethernet connection to an F740 with one shelf of
36GB 10000 rpm drives (5 data, 1 parity, 1 spare).  The local
filesystem is vxfs, mounted with delaylog and the largest allowable
log area.
I ran a few filesystem replication and backup/restore tests (this
is our central tape server).  The local filesystem handily beat the
Netapp doing large sequential reads and writes (120MB/sec vs.
22MB/sec)... no surprise there.  File deletions were a little closer
(~2500 unlinks/sec on vxfs, ~2000 unlinks/sec on the Netapp).  In all
other tests, the Netapp was as fast or faster (sometimes by a large
margin) than the local filesystem.  The Netapp seems to especially shine
when you have multiple processes reading and writing to all points on
the filesystem.  vxfs does not appear to handle it as gracefully with
dozens or hundreds of concurrent access requests.
This is an apples to oranges test.
Sure, streaming to/from RAID0 will always kick ass.  However, who 
really runs RAID0 these days???  (I'm sure there's about <.1% of
applications where RAID0 is suitable, because the data is not critical)
RAID 0+1 might be a slightly better comparison since the filer doesn't have
a RAID0 mode.
...
I re-ran some of the same tests with a Veritas RAID-5 volume (to
be fair to the Netapp), but I stopped after the first couple.  There
is no contest at that point.  Veritas software RAID-5 is dog-slow (I
think I saw bursts of 8MB/sec sequential writes).  Turn on a Veritas
snapshot, and writes to the snapped filesystem go even further into
the toilet.  The performance degradation is cumulative with the number
of snapshots.  There is no such penalty on the Netapp.
Ok, this is apples to apples.
...
One caveat I should mention, since it bit us in the past:  file
locking performance.  We have one application that, when running on
the same type of hardware as above (E420R with those drives), spews
forth 150,000 syscalls per second, according to Solaris' "vmstat".
80% of those calls are fcntl() locks/unlocks to various database files
on disk.  Poor programming practice aside, this application runs very
slowly over NFS.  It simply cannot match in-kernel file locking when
you're dealing with a local filesystem.  Besides that one exceptional
application, we run Netapps for everything else (including Oracle).
Ok, given the environment you just described, you could enable an undocumented
feature within the Solaris mount_nfs command, 'llock'.  This tells the
NFS client not to use NLM for file locking and to handle locks locally
instead.  This will essentially give you in-kernel locking.  The caveat
here is that you can't share the filesystem with other clients, but you 
can't do that with today's direct attach storage either.  I usually recommend
using the llock option in a Solaris/Oracle/filer environment.
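
For reference, the lock/unlock pattern being discussed looks roughly like the sketch below. This is not the application's code; the function name and the byte range it locks are hypothetical. It only shows that each guarded update costs two fcntl() calls, which a local filesystem resolves in the kernel and which an NFS client without local locking has to forward to the server.

```python
import fcntl
import os


def locked_update(path: str, offset: int, payload: bytes) -> None:
    """Write `payload` at `offset` while holding an exclusive byte-range lock.

    Hypothetical helper: one fcntl() call to take the lock and one to
    release it, wrapped around a single positional write.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        # Lock only the byte range being rewritten (blocks until granted).
        fcntl.lockf(fd, fcntl.LOCK_EX, len(payload), offset, os.SEEK_SET)
        try:
            os.pwrite(fd, payload, offset)
        finally:
            # Release the same byte range.
            fcntl.lockf(fd, fcntl.LOCK_UN, len(payload), offset, os.SEEK_SET)
    finally:
        os.close(fd)
```

An application issuing many thousands of these per second pays a network round trip per lock call when the locks go through NLM, which is why handling them locally on the client makes such a difference.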
Aaron
