Your interconnect is going through a switch? I really don't think that's supported. I don't even think it's Gbit Ethernet. The new Cluster Interconnect cards are fiber, but I don't think they are GigE and they are always hooked up directly, just like the ServerNet cables were.
Are you sure you are running your interconnects through a switch?
--
Adam Fox
NetApp Professional Services, NC
adamfox@netapp.com
-----Original Message-----
From: Allen Belletti [mailto:abelletti@dmotorworks.com]
Sent: Tuesday, July 31, 2001 7:39 PM
To: toasters@mathworks.com
Subject: F760 cluster write performance
Hello,
We've been running a pair of clustered F760's since about February. Among other things, we are using them for Oracle database storage, with the hosts being Suns running Solaris 7 or 8. The interconnect is via Gigabit Ethernet through a Cisco 3508 switch. We're doing NFS service only -- no CIFS. The volumes in question are between 60% and 80% full, and consist of between 10 and 14 drives, either 18 GB or 36 GB depending on the volume. The OS version is 6.1R1. NFS mounts are using UDP, version 3, rsize=32768, wsize=32768 (though NetApp has suggested reducing these to 8 KB).
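(For reference, those mount options correspond to a Solaris mount line of roughly the following form; the filer name, export path, and mount point here are just placeholders, not our actual configuration:)

    mount -F nfs -o vers=3,proto=udp,rsize=32768,wsize=32768 filer1:/vol/oravol /oradata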
Recently, we have been running up against the limit of the F760's write performance, or so it seems. At best, a single (GigE-connected) host is able to do 7-8 MB/s of sustained sequential write traffic. These same hosts are able to read at greater than 30 MB/s, and sometimes as high as 50 MB/s if the data is all in cache on the NetApp.
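(A sequential-write test of this sort can be approximated with a simple dd run over the NFS mount; the path and transfer size below are placeholders, not the exact test we ran. Dividing bytes written by the elapsed time gives the MB/s figure:)

    time dd if=/dev/zero of=/oradata/ddtest bs=32k count=32768    # ~1 GB written in 32 KB blocks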
What I'd really like to know is what kind of sustained write rates other folks are seeing in configurations similar to this one. At the very least, if you have any kind of F760 cluster, even without Gigabit Ethernet, are you able to do more than 7-8 MB/s of sustained writes?
Also, when the writes are occurring, the filer CPU load is generally very high, anywhere from 80% to 100%. Disk utilization is more reasonable, around 50% if the filer is not otherwise busy.
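(These are the sorts of figures you can watch in real time from the filer console with sysstat at a one-second interval; sysstat shows the CPU percentage and disk throughput each interval, though the per-disk utilization number may need extra sysstat options or statit depending on the Data ONTAP release:)

    filer> sysstat 1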
My first thought (and NetApp's as well) was that this is some kind of network problem, perhaps related to GigE flow control. However, if that were the case I would expect the filer CPU load to be lower.
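(For anyone chasing the same theory: a quick first check is the interface error counters on both ends, for example with the commands below, though the flow-control behaviour itself would likely have to be read from the switch's own port statistics:)

    sunhost% netstat -i
    filer> ifstat -a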
If anyone has seen (or fixed!) anything like this, I would appreciate any suggestions or advice.
Thanks in advance,

Allen Belletti
System Administrator
Digital Motorworks
Phone: 512-692-1024  Fax: 512-349-9366
abelletti@digitalmotorworks.com
www.digitalmotorworks.com
-----Original Message-----
From: owner-toasters@mathworks.com [mailto:owner-toasters@mathworks.com] On Behalf Of Andrew Smith
Sent: Monday, April 23, 2001 12:58 PM
To: toasters@mathworks.com
Subject: dump to remote device via rmt
Hello,
I've been dumping my filer to a remote HP DDS-4 drive on a RedHat Linux machine. It's been working great. But now I have more data on my filer than one DDS-4 tape will hold for a level-0 dump.
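(For reference, the dump is started from the filer console with a command of roughly the form below; the Linux host name is a placeholder and the exact option letters are illustrative, not copied from my actual command:)

    filer> dump 0uf linuxbox:/dev/nst0 /vol/vol0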
Are there any issues with spanning more than one tape for a dump via rmt? Here is the output of my dump run:
DUMP: Dumping tape file 1 on /dev/nst0
DUMP: creating "/vol/vol0/../snapshot_for_backup.3" snapshot.
DUMP: Using Full Volume Dump
DUMP: Date of this level 0 dump: Mon Apr 23 10:40:26 2001.
DUMP: Date of last level 0 dump: the epoch.
DUMP: Dumping /vol/vol0/ to dumper
DUMP: mapping (Pass I)[regular files]
DUMP: mapping (Pass II)[directories]
DUMP: estimated 24452615 KB.
DUMP: dumping (Pass III) [directories]
DUMP: dumping (Pass IV) [regular files]
DUMP: Mon Apr 23 10:46:54 2001 : We have written 560012 KB.
[... lines removed ...]
DUMP: Mon Apr 23 12:37:04 2001 : We have written 21097444 KB.
DUMP: Remote write failed: RMT bad response from client
DUMP: DUMP IS ABORTED
DUMP: Deleting "/vol/vol0/../snapshot_for_backup.3" snapshot.
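(For scale: the dump died after writing 21,097,444 KB, roughly 20.1 GB, which is right around the nominal 20 GB native capacity of a DDS-4 tape, while the estimate for the whole volume was 24,452,615 KB, about 23.3 GB.)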
Is there a way I can have dump prompt me to change volumes? I haven't been able to find much information on the subject.
Thanks!
-Andrew Smith
DCANet
Ah, I'm sorry -- I used "interconnect" to refer to the Gigabit Ethernet link between the filers and our clients. Our Cluster Interconnect is the older ServerNet style, and works just fine (complete with 4 meters of cable coiled up atop one filer).
Allen Belletti
System Administrator
Digital Motorworks
Phone: 512-692-1024  Fax: 512-349-9366
abelletti@digitalmotorworks.com
www.digitalmotorworks.com