I'm not sure this is related, but if you do an nfsstat -m on the client, does it show the rsize to be 8192 or 32768? When our E450's read data from our NetApps, we don't see anywhere near the same performance as when they read data from other E450's using NFSv3 over UDP. I've always attributed this to the client code forcing an rsize of 8192 when connecting to the filer, while allowing an rsize of 32768 when connecting to anything else (Sun, DEC, SGI). Has anyone else seen this?
The difference grew even more noticeable when we moved the traffic from a dedicated 100 Mb FX crossover to a 1000 Mb FX switched environment. Prior to the switch we saw numbers like yours for Sun-from-NetApp reads and ~11 MB/sec for Sun-from-Sun reads; after the switch, Sun-from-NetApp reads run ~18 MB/sec and Sun-from-Sun reads ~50 MB/sec.
The NetApp server is an F740 with 6 shelves, and the Suns are E450's with 4 x 400 MHz CPUs and an A5200 array in a 1+0 configuration.
-----Original Message-----
From: Val Bercovici (NetApp) [mailto:valb@netapp.com]
Sent: Thursday, April 01, 1999 9:41 AM
To: Brian Tao
Cc: toasters@mathworks.com; sirbruce@ix.netcom.com
Subject: RE: Slow sequential disk reads on F740
What sort of guidelines do people follow when deciding on the
minra setting? At one end of the spectrum, you have broadcast video streaming servers that deal exclusively with long, sequential reads. At the other end, you have something like an INN server storing articles in individual files. Are there any tools to help decide which setting is best, or do you just "eyeball" it and try both settings and see which one seems better?
My simple rule is minra=on for crazy apps like INN that have tons of small, totally random I/O's, and minra=off for almost anything else. Basically, if you think read-ahead caching will help you in any way, you want minra off.
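If you want to flip it and compare, it's a quick change on the filer console. A minimal sketch (vol0 is just an example volume name, and on some releases minra may only exist as a global option rather than a per-volume one, so check the docs for your release):

vol options vol0 minra on
vol options vol0 minra off

Set it one way, run your workload for a while, then set it the other way and compare - as far as I know the change takes effect without a reboot.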
A single drive should be able to sustain 8MB/sec or higher just on
its own... a stripe of five drives should hit at least 40MB/sec. sysstat on that filer does in fact report 38 to 43MB/sec disk reads during a "vol scrub".
Actually, the 9GB SCSI drives I believe you're using are only rated for a max (not sustained) external transfer rate of 5MB/sec, so the numbers you're seeing really make sense to me. FYI, the max external transfer (not sustained) rate for our 18GB FC-AL drives is a much nicer 12.5MB/sec so you would probably see better (although not necessarily double <g>) sequential performance with those....
-Val.
==============================================
Val Bercovici         Office: (613)724-8674
Systems Engineer      Pager:  (800)566-1751
Network Appliance     valb@netapp.com
Ottawa, Canada        FAST,SIMPLE,RELIABLE
==============================================
On Thu, 1 Apr 1999, Brian Meek wrote:
I'm not sure this is related, but if you do an nfsstat -m on the client, does it show the rsize to be 8192 or 32768?
Interesting... I was forcing rsize and wsize to 32768 on the mount command line just to be pedantic, but this is what I'm seeing (output edited for brevity):
# mount -o vers=3,proto=udp,rsize=32768,wsize=32768 e2.j:/ /j
# mount
/j on e2.j:/ vers=3/proto=udp/rsize=32768/wsize=32768/remote on Thu Apr 1 16:51:27 1999
# nfsstat -m
/j from e2.j:/
 Flags: vers=3,proto=udp,sec=sys,hard,intr,link,symlink,acl,rsize=8192,wsize=8192,retrans=5
When our E450's read data from our NetApps, we don't see anywhere near the same performance as when they read data from other E450's using NFSv3 over UDP. I've always attributed this to the client code forcing an rsize of 8192 when connecting to the filer, while allowing an rsize of 32768 when connecting to anything else (Sun, DEC, SGI). Has anyone else seen this?
It sounds like the NFS server dictates the parameters. I can connect with 32K rsize/wsize using UDP transport between Solaris servers, but only 8K between a Sun and a NetApp, and 16K between a Sun and a FreeBSD machine acting as the server. TCP transport allows me to use 32K block sizes with any server.
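For example, remounting the same export over TCP (same e2.j test box and /j mount point as above) does give me the full 32K:

# mount -o vers=3,proto=tcp,rsize=32768,wsize=32768 e2.j:/ /j
# nfsstat -m

and this time the Flags line in the nfsstat -m output shows rsize=32768,wsize=32768.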
On Thu, 1 Apr 1999, Brian Meek wrote:
I'm not sure this is related, but if you do an nfsstat -m on the client, does it show the rsize to be 8192 or 32768?
Brian Tao and I are exchanging e-mails on this subject, but let me comment on xfer sizes quickly.
1. Until 5.3.1 (the forthcoming release, I believe - all preliminary info subject to change :-) we limited UDP transfer sizes to 8KB by default.
We finally tracked down a couple of bugs in FDDI - one of ours that was being irritated by a vendor's FDDI NIC driver bug - that resulted in an interface hang in the face of 32KB UDP transfer sizes.
If you do not have FDDI cards installed in your filer, you can, on 4.X and later releases, crank the NFS/UDP transfer size to 32768 with the following command:
options nfs.udp.xfersize 32768
Once clients bind to 32KB transfer sizes with a mount, they will always want that transfer size - until they unmount. So if you tweak this line, ADD IT TO YOUR /etc/rc file.
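To be concrete, the line in /etc/rc is the same one you type at the console; a sketch of the relevant excerpt (the comment line is just mine, and your rc file will obviously have plenty of other stuff in it):

# bump NFS/UDP transfer size so clients can negotiate 32KB mounts
options nfs.udp.xfersize 32768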
Now, do not enable 32KB transfer sizes in the presence of FDDI/CDDI NICs until the 5.3.1 release or later, I suspect.
2. But, let me throw out some caveats.
Some client OS's can bind to large transfer sizes, but their old 100 Mbit/s Ethernet hardware is not up to handling read returns of 32KB. So tread lightly if you have some dusty machines sitting around that look like big honking clients but are actually seriously network challenged.
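A quick sanity check on a Solaris client before cranking the transfer size - something like this shows what the NIC actually negotiated (assuming an hme 100BaseT interface; other drivers use different ndd parameter names):

# ndd -get /dev/hme link_speed
# ndd -get /dev/hme link_mode

where a link_speed of 1 means 100 Mbit/s (0 means 10) and a link_mode of 1 means full duplex (0 means half).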
3. For 100BaseT Ethernet, I would suggest 5.0.1 or beyond, so that the transfer-size tweak is coupled with a set of Ethernet driver performance improvements. The changes primarily affect read performance.
BUT PLEASE!!!! Refer to your support site or contacts for the recommended release for your particular configuration and application!
A lot of you probably know more than me about R releases and such.
4. NFS/TCP is off by default. We see it yields a 10+% drop in aggregate throughput performance compared to NFS/UDP.
Please refer to the following pairs of SFS97 results for numbers on this effect:
http://www.specbench.org/osg/sfs97/results/sfs97-980805-00002.html http://www.specbench.org/osg/sfs97/results/sfs97-980805-00001.html
http://www.specbench.org/osg/sfs97/results/sfs97-981026-00026.html http://www.specbench.org/osg/sfs97/results/sfs97-981026-00025.html
Most vendors have not submitted TCP results.
On a switched, clean LAN, NFS/UDP should be okay to run.
On an unclean LAN (old switches unable to keep up with aggregate loads of more than a few 100BaseT connections, or faulty wiring), you will have performance problems with both NFS/UDP and NFS/TCP. I suggest you resolve any problems in your network instead of hoping that NFS/TCP will save you.
Now, on an F540, and more so on F630 or better filers, you should have no problem seeing 10-11+ MB/s sequential reads over NFS from a capable client (a Sun Ultra 1 is a good minimum) - with 32KB transfer sizes and 5.0.1 or later. If you are on a non-isolated net and have interference from other clients, you will see confusing results.
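A crude way to see where you land is a straight sequential read from the mount with dd (just a sketch - /mnt/bigfile stands for any file noticeably larger than the client's RAM, so you are not just reading the client's own cache):

# time dd if=/mnt/bigfile of=/dev/null bs=32k

Divide the file size by the elapsed time to get MB/s.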
It sounds like the NFS server dictates the parameters. I can
connect with 32K rsize/wsize using UDP transport between Solaris servers, but only 8K between a Sun and a NetApp, and 16K between a Sun and a FreeBSD machine acting as the server. TCP transport allows me to use 32K block sizes with any server.
Yeah, we never saw the FDDI hang with NFS/TCP and 32KB transfers so we left that on by default. It was a puzzler.
Questions?
Okay... a brief expansion on the previous e-mail.
1. While the 8192 UDP xfer size is still the default in the imminent 5.3 release, the FDDI driver changes to address the hang are in place...
So, if you couple that with point 2 below, you can safely increase the xfer size to 32768 per my previous e-mail (modulo the comments about potentially performance-challenged client NICs, where I would expressly worry about SGI Challenge or Challenge XL 100BaseT cards).
2. From my buddy, Devi (on the "new" Sun FDDI card):
In the filer, we did not handle one case where the device stopped transmitting while everything else was fine with it. Whenever we turned on the 32K xfer size over UDP and NPI cards were used in the clients, the FDDI device went into a state where it refused to transmit.
Clients with Crescendo cards talking to servers with NPI cards with old drivers failed similarly.
After a lot of experiments by the QA folks, we identified that NPI FDDI cards with drivers prior to "nf SBus FDDI Driver v2.05" caused this.
If our customers can make sure that they do not have NPI cards with drivers prior to "nf SBus FDDI Driver v2.05", they can turn on 32K NFS/UDP xfers.
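If you're not sure which driver revision a Solaris client is running, modinfo on the client should show it, since the version string above looks like a kernel module description (the grep pattern is just a guess at what to match on):

# modinfo | grep -i fddi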
have a beer, beepy