On Fri, 23 Jan 2004 08:44:25 -0500 Steve Losen scl@sasha.acc.Virginia.EDU wrote:
Hi all,
We've found some odd failure messages on all Linux machines running 2.4.21 here. Those machines report "NFS: server filerX-vif1 not responding, timed out" again and again throughout the day.
All machines are either Dell 2650 or HP ProLiant DL360/DL380 servers, and all have a "Broadcom Corporation NetXtreme" Gigabit NIC. Machines of the same type running Linux kernel 2.4.18 seem to have no NFS problems.
Has anyone seen the same problems, and if so, is it a kernel problem? Is 2.4.23 more stable with NFS, or should I downgrade everything to 2.4.18? Is anyone from NetApp with Linux customers here? ;-)
I have seen that error on Linux when using NFS over UDP, including with NFS servers other than a NetApp. I switched to TCP for all my NFS mounts and haven't had any trouble since. Here is my /etc/fstab entry:
filer:/vol/vol0/dir /dir nfs rw,hard,intr,tcp,bg 0 0
I'm running 2.4.20-24.7 (RH 7.3).
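To check which NFS mounts on a host are still not using TCP, something like the following rough Python sketch might help. It assumes that an NFS entry without an explicit "tcp" or "proto=tcp" option is a candidate for UDP; the function name and the second sample line (/home) are made up for illustration and are not from any real config.

```python
def nfs_mounts_without_tcp(fstab_text):
    """Return (device, mountpoint) pairs for NFS entries with no TCP option."""
    flagged = []
    for line in fstab_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # fstab fields: device, mountpoint, fstype, options, dump, pass
        if len(fields) < 4 or fields[2] != "nfs":
            continue
        options = fields[3].split(",")
        # Assumption: no explicit TCP option means the mount may use UDP.
        if "tcp" not in options and "proto=tcp" not in options:
            flagged.append((fields[0], fields[1]))
    return flagged


# First line is the entry quoted above; the second is a hypothetical example.
sample = """\
filer:/vol/vol0/dir /dir nfs rw,hard,intr,tcp,bg 0 0
filer:/vol/vol1/home /home nfs rw,hard,intr,bg 0 0
"""
print(nfs_mounts_without_tcp(sample))
```

On a real host you would pass in the contents of /etc/fstab instead of the sample string.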
After investigating, we think it might be a problem with the Broadcom NIC drivers in 2.4.21. All hosts with 2.4.21 produce heavy TX errors and collisions on all switches. We will try new kernels on a few machines now; hopefully the NFS problems will disappear once the network problems are gone.
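For comparing hosts before and after a kernel change, the TX error and collision counters can also be read on the host side. Below is a minimal Python sketch that parses text in the /proc/net/dev layout (two header lines, then eight receive and eight transmit counters per interface). The function name and the sample numbers are invented for illustration, not taken from our machines.

```python
def tx_problems(proc_net_dev_text):
    """Map interface name -> TX error and collision counters."""
    results = {}
    # The first two lines of /proc/net/dev are column headers.
    for line in proc_net_dev_text.splitlines()[2:]:
        if ":" not in line:
            continue
        name, counters = line.split(":", 1)
        fields = counters.split()
        if len(fields) < 16:
            continue
        # Fields 0-7 are receive counters, 8-15 are transmit counters:
        # bytes packets errs drop fifo colls carrier compressed
        results[name.strip()] = {
            "tx_errs": int(fields[10]),
            "colls": int(fields[13]),
        }
    return results


# Made-up sample in the /proc/net/dev layout.
sample_dev = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
  eth0: 1000 10 0 0 0 0 0 0 2000 20 5 0 0 7 0 0
"""
print(tx_problems(sample_dev))
```

On a live machine, feed it the contents of /proc/net/dev and compare the numbers between a 2.4.18 and a 2.4.21 host.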
Greetings,