I've had several filers in production for years, but in all that time I never used NFS, just CIFS. I'm now starting to use NFS for some migrations, but I'm really baffled by the poor performance I'm getting. Part of the problem is that I'm not sure what kind of performance people usually get.
When I started pulling data off my 940s I was only getting around 4-5MB/s. I'm grabbing data by way of GNU tar on Solaris. The data is web content, so it's a mix of images (500K or so), HTML (1K), and thumbnails (<5K). No matter what I do, I can't boost performance above that 5MB/s mark. I've tried NFSv2 and NFSv3, TCP and UDP, 8K and 32K transfer sizes, Gigabit and FastEther clients, Solaris 9/UltraSPARC and Solaris 10/AMD64, and so on, with no effect.
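For concreteness, the mounts I've been cycling through look roughly like this; the filer name, export, and mount point are stand-ins, not my real paths:

mount -F nfs -o vers=3,proto=tcp,rsize=32768,wsize=32768 filer:/vol/web /mnt/web
mount -F nfs -o vers=2,proto=udp,rsize=8192,wsize=8192 filer:/vol/web /mnt/web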
Because the 940 is in production I decided to use a 760 for testing; although it's on FastEther, I figured I could do more testing there and check write performance too. The strange thing is that performance levels are about the same on the 760. I even used a large file (700MB) to test sequential access and see if performance improved, and it didn't. Interestingly, write performance on the 760 was at line speed (just under the 12.5MB/s limit of FastEther). So on the 760 I can write data twice as fast as I can read it.
In both cases I'm using either fast disk on the client (capable of 50MB/s or higher under random or sequential workloads) or reading straight to /dev/null to cut out the middleman. While I don't doubt that the Solaris NFS client is slower than I'd like, I can't believe that reads should be less than half the speed of writes.
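For what it's worth, the /dev/null read test is nothing fancier than a big sequential dd; the path here is a stand-in for wherever the 700MB test file lives:

time dd if=/mnt/web/bigfile of=/dev/null bs=1024k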
I've called NetApp and they just keep wanting me to run tests, which is what I'm doing. Searching NOW hasn't helped, and searching Google hasn't helped. Any ideas? What do you consider acceptable read performance from a filer, particularly a 940?
So far all I've learned from my tests is that NFSv2 over UDP at 8K is the fastest method, but it's still hideously slow.
Any ideas are greatly appreciated.
benr.
Just a short response that could be longer, but: we often see writes that are a lot faster than reads because of the cache. You can write to the NetApp at line speed because all the data goes into the cache and then syncs to disk at its own pace. However, reads of non-cached data *must* seek to the data.
As for general NFS performance, could you please post your output of 'options nfs'?
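From the filer console that's just 'options nfs'; the lines below are only an illustration of the sort of output to expect (option names and defaults vary by Data ONTAP release):

filer> options nfs
nfs.tcp.enable               on
nfs.udp.enable               on
nfs.udp.xfersize             32768
nfs.v2.enable                on
nfs.v3.enable                on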
cheers, Barry
Can you take a perfstat? Take a perfstat while the file is being written to the filer.
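The perfstat script from NOW runs from an admin host, something like this; the flags follow the usage the script prints, so double-check against your copy, and the filer name is a placeholder:

perfstat -f filer -t 2 -i 5 > perfstat.out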
And for a real quick test of read performance on the filer itself, try doing a dump to /dev/null from the filer's console. A quick check of ifstat can catch network interface problems (and netdiag as well). nfs_hist is a good tool for checking whether the filer is taking a long time to respond to NFS calls; if nfs_hist shows the filer responding quickly, then the network may be at fault (I suspect the network, but that's just a hunch right now). What release are you running? 6.5.x has a whole slew of problems that can really bog down NFS performance, especially if you use DNS names; a slow DNS server can cause many problems for NFS access in 6.5. Checking nfsstat -d will show you lots of detailed NFS statistics that may be of no use to you, but may at least keep you entertained while troubleshooting the problem.
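To spell those out, the checks look roughly like this from the filer console; the volume name is a placeholder, 'null' as the dump destination is what makes dump a pure read-speed test, and nfs_hist may need 'priv set advanced' first:

filer> dump 0f null /vol/webvol
filer> ifstat -a
filer> netdiag
filer> nfs_hist
filer> nfsstat -d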
Hope that helps, -Blake
Ben> I've had several filers in production for years but in all that
Ben> time never used NFS, just CIFS. I'm now starting to use NFS for
Ben> some migrations but I'm really baffled by the poor performance
Ben> I'm getting. Part of the problem is that I'm not sure what kind
Ben> of performance people usually get.

Ben> When I started pulling data off my 940's I was only getting
Ben> around 4-5MB/s. I'm grabbing data by way of GNU Tar on Solaris.
I just did some tests on my F940 here, talking to a Solaris 8 box over Gigabit. I had a directory called 'foo' which is 73MB in size, with 2200+ directories and 2300+ files. None of the files was particularly big, I think.
Here are my runs with tar and time:
# time tar cf - foo > /dev/null
0.32u 1.65s 0:03.09 63.7%
Pretty fast; 3 seconds on 73MB implies roughly 24MB/s of throughput. But wait... GNU tar can optimize writes to /dev/null and skip doing the real work...
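(As an aside: one common way to defeat that shortcut while still throwing the data away is to stick a cat in the pipeline, since tar then writes to a pipe rather than to /dev/null itself:

# time tar cf - foo | cat > /dev/null

In the runs below I just wrote a real file to /tmp instead.)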
# time tar cf - foo > /tmp/foo.tar
0.32u 3.27s 0:12.25 29.3%
OK, now it took 12 seconds, which works out to about 6MB/s. Not great, hmmm...
# time tar cf - foo > /tmp/foo.tar
0.24u 2.42s 0:03.85 69.0%
# time tar cf - foo > /tmp/foo.tar
0.31u 2.11s 0:03.18 76.1%
# time tar cf - foo > /tmp/foo.tar
0.26u 2.09s 0:03.27 71.8%
# time tar cf - foo > /tmp/foo1.tar
0.30u 2.19s 0:03.08 80.8%
Ah, but now look what happens. It's back to 3 seconds, or roughly 24MB/s of throughput over the wire. It looks like the NetApp cache has kicked in here. Which is good, since it shows that the network isn't the bottleneck here.
So I repeated my tests with another directory, bar, with 500MB total size, 790 directories, and 1000 files. So the files are obviously bigger in general.
# time tar cf - bar > /tmp/foo.tar
0.64u 17.61s 0:31.51 57.9%
# time tar cf - bar > /tmp/foo.tar
0.62u 14.24s 0:16.01 92.8%
# time tar cf - bar > /tmp/foo.tar
0.60u 14.80s 0:17.29 89.0%
This time I was seeing around 30MB/s for the cached runs, and around 15MB/s for the non-cached initial load. Not too bad.
So for your setup, I suspect that your network is either heavily loaded, or that you have a bad network cable, or that there's a switch in the middle which is swamped or just isn't up to snuff.
I'm also running 7.0.1 on the NetApp. The volume which held the data was mounted via TCP onto an E450 with 4GB of RAM and 4x 450MHz CPUs, which are certainly not the fastest.
Ben> The data is web content, so its a mix of images (500K or so),
Ben> HTML (1K), and Thumbnails (<5K). No matter what I seem to do I
Ben> can't boost performance above that 5MB/s mark. I've tried NFSv2
Ben> and NFSv3, TCP and UDP, 8K and 32K, Gigabit clients and FastEther
Ben> clients, Solaris9/UltraSPARC and Solaris10/AMD64, so on and so
Ben> forth, with no effect.
This all points me to suspecting that your network hardware or cabling has problems. What does 'netstat -ni' say on both ends? And can you set up a direct connection between the filer and a client to see if that runs better? Get the rest of the network out of the loop.
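On the Solaris side the columns to watch are the error and collision counters. The output below is made-up sample data just to show the shape; interface names and counts are placeholders:

# netstat -ni
Name  Mtu  Net/Dest     Address       Ipkts   Ierrs Opkts   Oerrs Collis Queue
lo0   8232 127.0.0.0    127.0.0.1     1234    0     1234    0     0      0
ce0   1500 10.0.0.0     10.0.0.5      99999   0     88888   0     0      0

Nonzero Ierrs/Oerrs/Collis counts that keep climbing usually mean a duplex mismatch or a bad cable.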
Good luck,
John

John Stoffel - Senior Staff Systems Administrator - System LSI Group
Toshiba America Electronic Components, Inc. - http://www.toshiba.com/taec
john.stoffel@taec.toshiba.com - 508-486-1087