I suggest you check:
1. The network cards and the switches/routers between the Linux boxes and the filers. Check for 100 Mbit full-duplex settings, and check nfsstat and netstat on both the filer and the Linux boxes for errors.
2. I don't have experience with NFS on Linux, but I know there were a lot of NFS changes in recent kernels, so even if it worked OK before, maybe you should just consider upgrading the Linux boxes - it is never bad in Linux anyway... :) You'll probably get faster NFS that way, since NFS was moved to kernel level, I think in 2.3 or something...
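For the nfsstat check on the Linux side, here is a rough sketch in Python. It assumes the client exposes its RPC counters in /proc/net/rpc/nfs with a line of the form "rpc <calls> <retrans> <authrefrsh>"; that is an assumption about your kernel, so verify it there first. The function names and the sample text are made up for illustration.

```python
def parse_rpc_line(text):
    """Return (calls, retrans) from the 'rpc' line of /proc/net/rpc/nfs-style text."""
    for line in text.splitlines():
        fields = line.split()
        # Assumed layout: rpc <calls> <retrans> <authrefrsh>
        if len(fields) >= 3 and fields[0] == "rpc":
            return int(fields[1]), int(fields[2])
    raise ValueError("no 'rpc' line found")


def retrans_percent(calls, retrans):
    """Retransmitted RPC calls as a percentage of all calls."""
    return 100.0 * retrans / calls if calls else 0.0


# Made-up sample, not real output from any of the machines in this thread.
SAMPLE = "net 0 0 0 0\nrpc 2000 50 0\nproc2 18 0 0 0 0\n"

calls, retrans = parse_rpc_line(SAMPLE)
print("calls=%d retrans=%d (%.1f%%)" % (calls, retrans, retrans_percent(calls, retrans)))

# On a real client you would read the file instead of SAMPLE:
#   with open("/proc/net/rpc/nfs") as f:
#       calls, retrans = parse_rpc_line(f.read())
```

Taking two readings a minute apart and comparing them tells you more than the absolute numbers. A retransmission count that keeps climbing usually points at packet loss or a server that is not answering in time.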
Eyal.
--- Henrique Pantarotto henrique@corp.terra.com.br wrote:
Hello toaster friends!
This week I experienced very strange NFS performance on our POP3 servers. We have 4 POP3 Linux clients mounting 2 NetApp boxes, like this:
/var/spool/mail1 - F760 (cluster)
/var/spool/mail2 - F760 (cluster)
/var/spool/pop - F720
Our Linux clients are dual Pentium III 650 MHz machines with 1 GB RAM running a Red Hat-like distribution with kernel 2.2.14-19smp.
This solution worked fine until this week, when we started experiencing high load averages on our Linux boxes and an accumulation of pop processes.
Copies from /var/spool/mail to /var/spool/pop are verrrrrrrrry slowwwwwwwwww... so pop processes get a status of "D" (Disk Waiting, right?), and chaos takes place.
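On the "D" status: that is uninterruptible sleep, and on an NFS client it usually means the process is blocked waiting on the server. A minimal sketch in Python for listing such processes, assuming the usual Linux /proc/<pid>/stat layout of "pid (command) state ..."; the function names are hypothetical.

```python
import os


def parse_stat(stat_text):
    """Return (command, state) from the contents of a /proc/<pid>/stat file."""
    # The command is wrapped in parentheses and may itself contain spaces
    # or ')', so cut at the LAST ')' instead of splitting on whitespace.
    open_paren = stat_text.index("(")
    close_paren = stat_text.rindex(")")
    fields_after = stat_text[close_paren + 1:].split()
    if not fields_after:
        raise ValueError("no state field")
    return stat_text[open_paren + 1:close_paren], fields_after[0]


def processes_in_state(wanted="D", proc_root="/proc"):
    """Return a sorted list of (pid, command) for processes in the given state."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                command, state = parse_stat(f.read())
        except (OSError, ValueError):
            continue  # process exited meanwhile, or unexpected format
        if state == wanted:
            found.append((int(entry), command))
    return sorted(found)

# Typical use on the POP3 box:
#   for pid, command in processes_in_state("D"):
#       print(pid, command)
```

If the list is full of pop processes and keeps growing, the clients are stuck waiting for NFS replies, which fits the rising load average.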
I don't know what it is. We're crazy here...
Has anyone ever had a hard time like this with Linux before?
Thanks a lot for any help!!!!!!!
PS: We've Brazilian NetApp guys giving us a hand here.
Regards,
Henrique Pantarotto
SysOp Site São Paulo
Terra Networks Brasil S/A
A Internet mais sua do que nunca
Tel: (11) 5505-5728 r.316/238
ICQ: 6934285
IT: henpa
henrique@corp.terra.com.br
===== Yours, Eyal Traitel eTraitel@yahoo.com, Home: 972-3-5290415 (Tel Aviv) *** eTraitel - it's the new eBuzzword ! ***
Eyal Traitel wrote:
- Network cards and switches/routers between the Linuxes and the
filers - check for 100mbit full duplex settings - check in the filer and the linux for nfsstat and netstat errors.
Grab hold of the mii-diag tool from Donald Becker's site and force all the boxes onto 100baseT-FD. Even when the cards claim to be running full duplex, we've found that forcing them to FD cleared up a load of problems.
- I don't have experience with NFS on linux, but I know that there
were a lot of nfs changes in the last kernels, so even if it worked OK before, maybe you should just consider to upgrade the linuxes - it is never bad in linux anyway... :) You'll probably get faster NFS that way, since nfs was moved to kernel level I think in 2.3 or something...
The NFS server moved into the kernel as an optional feature; however, the client hasn't changed much.
From what we've seen, and a few other people have spotted it too, the Linux NFS implementation is OK Linux->Linux, but Linux->anything_else performs less well. Also, with a single client process Linux goes reasonably fast, but as soon as you get lots of parallel NFS jobs things start to slow down. We've had some pretty bad problems with machines running happily for months; then the amount of NFS traffic hits a threshold and the machine gets stuck in a downward spiral.
For us the problem was so bad that we've had to move the majority of things onto local discs mirrored from central servers. Unless someone fixes the problem pretty soon, our strategy is going to move away from filers everywhere to just a couple of filers at the centre of the network.
Anyone else think it would be worth NetApp sponsoring someone to fix up the Linux NFS client to perform well under load? I'm guessing that the cost-benefit to NetApp would be pretty convincing.
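One way to see whether a given client shows the single-process versus parallel-jobs difference is to time the same write load at several concurrency levels. This is only a sketch in Python under stated assumptions: the helper names are hypothetical, the target directory is whatever NFS mount you want to test, and fsync is used so the data has to reach the server before the clock stops.

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor


def write_file(path, size_bytes, chunk=64 * 1024):
    """Write size_bytes of zeros to path, fsync it, and return the byte count."""
    buf = b"\0" * chunk
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(buf[:n])
            remaining -= n
        f.flush()
        os.fsync(f.fileno())  # make sure the data has left the client cache
    return size_bytes


def aggregate_mb_per_s(directory, workers, size_bytes):
    """Write one file per worker in parallel; return the combined MB/s."""
    paths = [os.path.join(directory, "bench_%d.tmp" % i) for i in range(workers)]
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        total = sum(pool.map(lambda p: write_file(p, size_bytes), paths))
    elapsed = max(time.perf_counter() - start, 1e-9)
    for p in paths:
        os.remove(p)  # clean up the scratch files
    return total / (1024 * 1024) / elapsed

# Example run against a hypothetical scratch directory on the filer:
#   for n in (1, 2, 4, 8, 16):
#       print(n, aggregate_mb_per_s("/var/spool/pop/benchtest", n, 16 * 1024 * 1024))
```

If the combined rate drops sharply as the worker count goes up, instead of levelling off, that matches the behaviour described above. Run it on a quiet box first so you have a baseline to compare against.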
Chris
On Sat, Aug 12, 2000 at 11:17:19AM -0000, Chris Good wrote:
going to move away from filers everywhere to just a couple of filers at the centre of the network. Anyone else think it would be worth NetApp sponsoring someone to fix up the Linux NFS client to perform well under load? I'm guessing that the cost-benefit to NetApp would be pretty convincing.
I don't really follow Linux developments, but I understood that SGI have been doing a lot of NFS improvement work. Have they only been doing server-related fixes?
The biggest problem with x86 "free" Unix/alikes had always been locking, so what is the current state of play with that in Linux, Free/Net/OpenBSD?
James.
Not to bash Linux too much here, but you should be willing to accept such problems with a public OS. I'm sure the Netapp people will tell you they can fix a problem overnight for you since the source is open... so, why haven't they? Since it seems they have trouble coming up with good NFS (a problem since the very beginning), maybe you should consider moving to Solaris or some other commercial OS.
Bruce
On Sat, Aug 12, 2000 at 01:54:53AM -0700, Eyal Traitel wrote:
I can suggest you to check:
- Network cards and switches/routers between the Linuxes and the filers - check for 100mbit full duplex settings - check in the filer and the linux for nfsstat and netstat errors.
Good plan. It sounds to me like either an NFS tuning issue or the dual-processor, dual-ethernet locking problems that caused the Mindcraft benchmarks to be so horrible. You might actually get better performance under 2.2 if you dump the extra processor and buy a couple more servers (relatively cheap if you're going with, say, VA/Linux).
- I don't have experience with NFS on linux, but I know that there were a lot of nfs changes in the last kernels, so even if it worked OK before, maybe you should just consider to upgrade the linuxes - it is never bad in linux anyway... :) You'll probably get faster NFS that way, since nfs was moved to kernel level I think in 2.3 or something...
Ok, there's actually a kernel (no pun intended) of good advice, here, but there's also a lot of danger in what you're saying. Here's the detail:
The 2.3 series is an UNSTABLE development series. It is intended for kernel developers and those who want to test out new code only. The 2.4-test series is the beta for the initial roll-out of 2.4, the next stable kernel series based on 2.3.
If what you're running into is SMP-related, you could very likely benefit from using 2.4, but IT IS ONLY IN BETA, so you would be crawling out on quite a limb here. At the very least, you would need to do some serious Q/A before deploying.
That said, your best first step here is to see if you have the same problem with one processor. If you don't, you know you're running into the SMP/ethernet locking problem and that your only options are single-processor or 2.4. Check the kernel mailing list and see if I'm wrong. There might be other ways to address this....
Hi,
1) If you force the Linux box to 100tx-fd, don't forget to do the same on the switch.
2) netstat -i doesn't report reasonable interface statistics. Use cat /proc/net/dev instead.
3) You !really! want to try the NFSv2+dhiggen patches found at nfs.sourceforge.net. There is also a ready-compiled RPM for 2.2.16+dhiggen+NFSV3+ext3 which works very well for me (I got 22 MB/s of continuous writes to an F760 over GBE from a big Linux box). Don't forget to upgrade nfs-utils as you go.
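For point 2, a small sketch in Python that pulls the error counters out of /proc/net/dev. It takes the column names from the file's own second header line instead of hard-coding positions, since the layout has differed between kernel versions. The function names and the sample numbers are invented for illustration.

```python
def parse_net_dev(text):
    """Parse /proc/net/dev-style text into {interface: {counter_name: value}}."""
    lines = text.strip("\n").splitlines()
    # Second header line looks like " face |<rx columns>|<tx columns>".
    _, rx_header, tx_header = lines[1].split("|")
    columns = ["rx_" + c for c in rx_header.split()]
    columns += ["tx_" + c for c in tx_header.split()]
    stats = {}
    for line in lines[2:]:
        if ":" not in line:
            continue
        # Split on ':' because the name can be fused to the first number,
        # as in "eth0:98765432".
        name, values = line.split(":", 1)
        stats[name.strip()] = dict(zip(columns, map(int, values.split())))
    return stats


def nonzero_counters(stats, keys=("rx_errs", "rx_frame", "tx_errs", "tx_colls", "tx_carrier")):
    """Return, per interface, the watched counters that are not zero."""
    report = {}
    for name, counters in stats.items():
        bad = {k: counters[k] for k in keys if counters.get(k, 0)}
        if bad:
            report[name] = bad
    return report


# Invented sample, not taken from any machine in this thread.
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:    1000      10    0    0    0     0          0         0     1000      10    0    0    0     0       0          0
  eth0:98765432  123456   42    0    0    37          0         0 87654321  120000    0    0    0   915       3          0
"""

print(nonzero_counters(parse_net_dev(SAMPLE)))
```

Collisions or frame errors on a link that is supposed to be full duplex are a common sign of a duplex mismatch between the card and the switch, which ties in with point 1.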
*** Good linux NFS client code exists, it is just not in the standard kernel yet. nfs.sourceforge.net is the site you want to check ***
Cheers ... Oliver
-----Original Message-----
From: owner-dl-toasters@netapp.com [mailto:owner-dl-toasters@netapp.com] On Behalf Of Eyal Traitel
Sent: Saturday, 12 August 2000 10:55
To: Henrique Pantarotto; toasters@mathworks.com
Subject: Re: Bad NFS performance with Linux.. please help