A few Red Hat 7.2 NFS clients that mount an F840 volume via automount error out with the messages below. I have looked into everything, including NIS timeouts, filer sysstat, and network performance, and it all looks fine. Interestingly, all the other non-Linux clients accessing the same automount are working perfectly.
Mar 9 06:35:12 linuxclient kernel: nfs: server filer OK
Mar 9 06:35:13 linuxclient last message repeated 52 times
Mar 9 06:35:31 linuxclient kernel: nfs: server filer not responding, still trying
Mar 9 06:35:41 linuxclient kernel: nfs: server filer OK
Mar 9 06:35:54 linuxclient kernel: nfs: server filer not responding, still trying
Mar 9 06:36:34 linuxclient kernel: nfs: server filer OK
Mar 9 06:36:46 linuxclient kernel: nfs: server filer not responding, still trying
Mar 9 06:38:36 linuxclient kernel: nfs: task 41929 can't get a request slot
Mar 9 06:38:41 linuxclient kernel: nfs: task 43328 can't get a request slot
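To put numbers on an excerpt like the one above, the messages can be tallied by type. What follows is a minimal Python sketch and not part of the original post: `PATTERNS` and `tally` are hypothetical names, and the regular expressions assume the kernel message wording shown in the excerpt. The "last message repeated 52 times" line is deliberately not expanded into extra counts.

```python
import re
from collections import Counter

# Syslog excerpt copied from the post above.
LOG = """\
Mar 9 06:35:12 linuxclient kernel: nfs: server filer OK
Mar 9 06:35:13 linuxclient last message repeated 52 times
Mar 9 06:35:31 linuxclient kernel: nfs: server filer not responding, still trying
Mar 9 06:35:41 linuxclient kernel: nfs: server filer OK
Mar 9 06:35:54 linuxclient kernel: nfs: server filer not responding, still trying
Mar 9 06:36:34 linuxclient kernel: nfs: server filer OK
Mar 9 06:36:46 linuxclient kernel: nfs: server filer not responding, still trying
Mar 9 06:38:36 linuxclient kernel: nfs: task 41929 can't get a request slot
Mar 9 06:38:41 linuxclient kernel: nfs: task 43328 can't get a request slot
"""

# One regex per NFS client message type (wording assumed from the excerpt).
PATTERNS = {
    "not_responding": re.compile(r"nfs: server \S+ not responding, still trying"),
    "ok": re.compile(r"nfs: server \S+ OK"),
    "no_request_slot": re.compile(r"nfs: task \d+ can't get a request slot"),
}


def tally(text):
    """Count how often each known NFS client message appears in text."""
    return Counter({label: len(rx.findall(text)) for label, rx in PATTERNS.items()})


if __name__ == "__main__":
    for label, count in tally(LOG).items():
        print(label, count)
```

For this excerpt the tally comes to three "not responding" events, three "OK" recoveries, and two request-slot failures within about three and a half minutes.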
Hey Premanshu,
There are a multitude of things that can cause this:
* Very busy "linuxclient"
* Lossy network (do you have a mixed 10/100/GigE network?)
* Look at your switch and make sure that there are no errors (hardware, FCS, etc.)
Fixes:
* Switch to TCP for NFS on the filer
* Move to RH 7.3 and kernel 2.4.20-ac2 if you can.
What switch are you using, and what is the relative position of the Linux clients on that switch (same switch, same bridge on the switch, etc.)?
On Mon, 10 Mar 2003, Premanshu Jain wrote:
Date: Mon, 10 Mar 2003 14:25:51 -0800 (PST)
From: Premanshu Jain Premanshu.Jain@synopsys.com
To: toasters@mathworks.com
Subject: Weired Linux error
/dev/null
devnull@adc.idt.com
Be sure you are mounting with the "tcp" flag. We have found that a Linux NFS v3 client using UDP has this problem, but it goes away if you use TCP. I think Linux defaults to UDP.
You may need to enable TCP on the filer with:
options nfs.tcp.enable on
Steve Losen scl@virginia.edu phone: 434-924-0640
University of Virginia ITC Unix Support
A couple of folks mentioned running NFS over TCP for Linux.
However, I've encountered data corruption (e.g., running a filer-hosted binary will core dump) with RH 7.1 and 7.2 with NFS over TCP. Is there some patch, kernel rev, or something else I need to apply?
I stopped pursuing this because RedHat and NetApp said this was a known issue. Sounds like some people have it working though.
Tom
-----Original Message-----
From: owner-toasters@mathworks.com [mailto:owner-toasters@mathworks.com] On Behalf Of Steve Losen
Sent: Monday, March 10, 2003 6:04 PM
To: toasters@mathworks.com
Subject: Re: Weired Linux error
Be sure you are mounting with the "tcp" flag. We have found that a Linux NFS v3 client using udp has this problem, but it goes away if you use tcp. I think Linux defaults to udp.
You may need to enable tcp on the filer with
options nfs.tcp.enable on
I am running RH 7.3 (kernel 2.4.18-5) using NFS v3 over TCP and have not seen any problems. I did see problems with UDP. Here is a typical /etc/fstab line:
filer:/vol/vol0 /vol0 nfs rw,hard,intr,tcp,bg 0 0
Steve Losen scl@virginia.edu phone: 434-924-0640
University of Virginia ITC Unix Support
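Whether a given client actually requests TCP can be read off its mount options, as in the /etc/fstab line quoted above. The following is a minimal Python sketch under stated assumptions: `nfs_transport` is a hypothetical helper, the input is assumed to use the usual whitespace-separated fstab fields, and the `proto=` spelling is handled alongside the bare `tcp`/`udp` flags in case a client uses that form.

```python
def nfs_transport(fstab_line):
    """Return the transport an NFS fstab entry asks for.

    Returns "tcp" or "udp" when the options name one, "unspecified" when
    the entry is NFS but leaves the choice to the client default, and
    None when the line is a comment, blank, or not an NFS entry.
    """
    line = fstab_line.strip()
    if not line or line.startswith("#"):
        return None
    fields = line.split()
    # fstab fields: device, mount point, fs type, options, dump, pass
    if len(fields) < 4 or not fields[2].startswith("nfs"):
        return None
    for opt in fields[3].split(","):
        if opt in ("tcp", "udp"):
            return opt
        if opt.startswith("proto="):
            return opt.split("=", 1)[1]
    return "unspecified"


if __name__ == "__main__":
    # The fstab line from the message above.
    print(nfs_transport("filer:/vol/vol0 /vol0 nfs rw,hard,intr,tcp,bg 0 0"))
```

An "unspecified" result is the case worth chasing on the affected clients, since the thread suggests the Linux default at the time was UDP.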
RH 7.3 + 2.4.20-ac2 is ROCK solid; tcp/udp doesn't matter.
I have a mixed 100/GigE network and default mount options via autofs-testing-v4 (from kernel.org).
/dev/null
devnull@adc.idt.com
Sounds like RPC is working fine for you; otherwise NIS would have problems. Two thoughts for you, from when I've seen this in the past:
1) Make sure you are using NFSv2. Linux NFSv3 is still fairly buggy.
2) Explicitly define your read and write sizes. You might want to specify the retransmit setting as well. For now, try adding the following client-side mount options and see if things get better; if not, tune the sizes a little. Also, don't use the intr option. Use the options like this:
mount -o rsize=8192,wsize=8192 server:/mount/point /localmount
If these don't work for you, then, sorry to say, it's a networking issue. Double-check the NIC settings on the Linux client to make sure you are at 100 Mbit full duplex, if that's what your filer is using. Use the Linux tool "mii-tool" to check: "mii-tool -v" shows the current state, and you can then use the tool again to force the setting.
If you exhaust these, let me know. But one of the above should fix the problem.
benr.
Duplex issue. RH 7.2 should have mii-tool installed, so check the duplex setting on your interfaces ("mii-tool eth0", for example), then verify what it is on your switches.
The problem with Linux distros is that the developers modify drivers that were written by Donald Becker. His drivers work flawlessly under NWay negotiation, but the Linux developers modify them, hardcode duplex, and break NWay negotiation; you then run into duplex issues and get crappy NFS performance. The cards and switch should autosense to 100baseTx-FD, but since some people in the Linux community like to modify without testing, they don't realize that forcing duplex/speed in the driver breaks NWay negotiation and leads to speed/duplex problems. So, verify the speed and duplex for all of your gear (systems and switches).
Also, send the output of ifconfig, and do a speed test with ftp to compare throughput with duplex misconfigured and with it working correctly; you should see a hell of a difference.
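The comparison of client and switch settings suggested above can be sketched in a few lines of Python. This is an illustration only: `parse_media` and `same_link_settings` are hypothetical helpers, the media names follow the "100baseTx-FD" style used in this thread, and the sample line is an assumed example of mii-tool output that should be checked against what your own systems print.

```python
import re

# Media names such as "100baseTx-FD" or "10baseT-HD":
# the leading number is the speed in Mbit/s, FD/HD is full/half duplex.
MEDIA_RE = re.compile(r"(\d+)base\w+-(FD|HD)")


def parse_media(text):
    """Return (speed_mbit, duplex) for the first media name in text, else None."""
    match = MEDIA_RE.search(text)
    if match is None:
        return None
    duplex = "full" if match.group(2) == "FD" else "half"
    return int(match.group(1)), duplex


def same_link_settings(client_text, switch_text):
    """True only when both sides report the same speed and duplex."""
    client = parse_media(client_text)
    switch = parse_media(switch_text)
    return client is not None and client == switch


if __name__ == "__main__":
    # Assumed example of a "mii-tool eth0" line; verify against real output.
    sample = "eth0: negotiated 100baseTx-FD, link ok"
    print(parse_media(sample))
    print(same_link_settings(sample, "100baseTx-HD"))
```

Treating an unparseable line as a mismatch is a deliberate choice here: a port that reports no recognizable media is worth a manual look anyway.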