amd / netapp issue? - toasters

26 May 2005


Hi. I'm sending this to am-utils and toasters since
I'm not entirely sure where the problem is. These
systems (redhat73-1, redhat73-2, etc.) are in a
cluster. The errors below occur around the same time
on all the clients (which are frequently running
different jobs), but are sometimes spread out across
10-30 minutes.
My first guess is simply that the netapps are too busy
to answer the clients.
*If* that's the case, then what can I change on the
netapp or the client-end to mitigate this problem?
What other options are there?
From /var/log/amd on the clients:
redhat73-1: May 25 13:48:01 redhat73-1 amd[965]/error: get_nfs_version: failed to contact portmapper on host "netapp1": RPC: Timed out
redhat73-2: May 25 13:48:07 redhat73-2 amd[965]/error: get_nfs_version: failed to contact portmapper on host "netapp2": RPC: Timed out
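To check whether the timeouts really cluster in time across clients, the amd logs could be collected and bucketed by timestamp. The following is only a rough sketch: it assumes each entry is a single line in the actual log file, that the line format matches the two samples above, and that the year is supplied separately (the log lines do not include one). The names `parse_timeouts` and `bucket_by_window` are made up for illustration.

```python
import re
from collections import defaultdict
from datetime import datetime

# Pattern for the amd portmapper-timeout lines quoted above (assumed format).
AMD_TIMEOUT = re.compile(
    r'(?P<stamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) (?P<client>\S+) amd\[\d+\]/error: '
    r'get_nfs_version: failed to contact portmapper on host "(?P<filer>[^"]+)": RPC: Timed out'
)


def parse_timeouts(lines, year=2005):
    """Return (timestamp, client, filer) tuples for portmapper timeouts."""
    events = []
    for line in lines:
        m = AMD_TIMEOUT.search(line)
        if m is None:
            continue  # not a portmapper timeout line
        stamp = datetime.strptime(f"{year} {m['stamp']}", "%Y %b %d %H:%M:%S")
        events.append((stamp, m["client"], m["filer"]))
    return events


def bucket_by_window(events, minutes=10):
    """Group events into fixed windows keyed by window start.

    `minutes` should divide 60 evenly so windows line up with the hour.
    """
    buckets = defaultdict(list)
    for stamp, client, filer in events:
        start = stamp.replace(minute=stamp.minute - stamp.minute % minutes, second=0)
        buckets[start].append((client, filer))
    return dict(buckets)
```

Feeding the concatenated logs from all clients through `parse_timeouts` and then `bucket_by_window` would show which filers were unreachable from which clients in each window, which could then be compared against filer load at those times.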
Clients are dual Xeons running Red Hat 7.3
    am-utils-6.0.7-4
    kernel 2.4.20-28.7smp
    mount options=(rw,hard,intr,grpid,retrans=30,timeo=30,retry=10,dev=00000010,vers=3,proto=tcp)
    # Interesting note: We actually set retry=10000, but
    # the rh7.3 systems use 10 instead.
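For reference, the NFS `timeo` mount option is expressed in tenths of a second, so the values above are easier to reason about once decoded. This is a minimal sketch that only splits the option string and converts `timeo`; the helper name `parse_mount_options` is hypothetical, and the sketch makes no claim about how `retrans` interacts with hard mounts over TCP on this kernel, or about whether these mount options affect amd's own portmapper query.

```python
def parse_mount_options(text):
    """Split a comma-separated mount option string into a dict.

    Flags without a value (e.g. "hard") map to True.
    """
    opts = {}
    for item in text.strip("()").split(","):
        key, sep, value = item.partition("=")
        opts[key] = value if sep else True
    return opts


opts = parse_mount_options(
    "(rw,hard,intr,grpid,retrans=30,timeo=30,retry=10,dev=00000010,vers=3,proto=tcp)"
)

# timeo is given in tenths of a second, so timeo=30 is 3 seconds.
timeo_seconds = int(opts["timeo"]) / 10
```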
Netapps are 960/980 class systems
    ontap=6.5.2R1P13
    They are typically very busy (100%) when the errors occur
Network switches show no errors.
NICs on the clients and netapps show no errors.
Nothing in the messages file on the clients or on the
netapps.
Thanks in advance and I'll summarize if I get some
good input.
- Jay