Has anyone else experienced performance problems using UDP NFSv3 between Solaris 2.7 and an F760 (running 5.3.5R2)?
Using v3 I get the following errors intermittently:
Jul 27 20:41:07 bullwinkle unix: NFS server 10.20.10.10 not responding still trying
Jul 27 20:41:07 bullwinkle unix: NFS server 10.20.10.10 ok
Jul 27 20:42:06 bullwinkle unix: NFS server 10.20.10.10 not responding still trying
Jul 27 20:42:06 bullwinkle unix: NFS server 10.20.10.10 ok
Jul 27 20:43:30 bullwinkle unix: NFS server 10.20.10.10 not responding still trying
Jul 27 20:43:30 bullwinkle unix: NFS server 10.20.10.10 ok
Jul 27 20:45:40 bullwinkle unix: NFS server 10.20.10.10 not responding still trying
Jul 27 20:45:40 bullwinkle unix: NFS server 10.20.10.10 ok
When I switched the mount to v2 (UDP), these errors went away and performance seemed to increase. NetApp documentation recommends v3, but it doesn't seem to work very well in my environment.
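(As an aside: a quick way to confirm which version, protocol, and transfer sizes the Solaris client actually negotiated is nfsstat -m. The mount point below is a placeholder, and the exact output format varies by Solaris release.)

```shell
# Show negotiated NFS options for a mount point (Solaris client).
# /oracle is a hypothetical mount point; substitute your own.
nfsstat -m /oracle
# Look for a Flags: line along the lines of
#   vers=3,proto=udp,...,rsize=32768,wsize=32768
```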
Since this filer is used for a DB, this is causing huge problems. Should I just leave it at v2, or is it worth trying to determine why v3 is performing poorly?
-Brian
foo wrote:
[snip]
We have had similar problems to those Jeff mentioned: basically, 32K blocks over UDP are in general not a good solution. We forced 8K blocks and performance improved considerably. In a way, the poor performance was caused by servers that were too powerful: they could send more data than the clients and routers could handle.
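The arithmetic behind this makes sense: a 32K NFS-over-UDP request is a single UDP datagram that gets sliced into many IP fragments at a 1500-byte Ethernet MTU, and losing any one fragment discards (and forces a retransmit of) the whole datagram. A rough sketch, ignoring UDP/RPC header overhead:

```shell
#!/bin/sh
# Fragments per NFS read/write at a 1500-byte MTU (IPv4 header = 20 bytes).
# Rough count only: UDP/RPC header overhead is ignored.
mtu=1500
payload=$((mtu - 20))   # IP payload carried by each fragment
for io in 32768 8192; do
  frags=$(( (io + payload - 1) / payload ))   # ceiling division
  echo "${io}-byte transfer -> ${frags} IP fragments (lose one, resend all)"
done
```

With roughly 23 fragments per 32K request versus 6 per 8K request, even a small loss rate hurts the 32K mounts far more. The 8K limit is typically forced with client mount options such as rsize=8192,wsize=8192 (exact syntax depends on the client OS).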
/Michael
The network at the moment is like this:
[760]
  |
Gig-E
  |
[Foundry FastIron]
  |
100Mb/s Ethernet
  |
[Foundry FastIron]
  |
Gig-E
  |
[Sun e4500]
The reason for the 100Mb in between is that, for the moment, the 4500 and the filer are in different parts of the building (power issues prevent having them in the same place), and the two locations are connected via 100Mb.
That said, I spent a lot of time looking at the switches and network to see if they were causing the problems. The fast Ethernet interface has never gone above around 30Mb or so, and there are no errors of any kind on any of the interfaces (no collisions, no align/fcs/giant/short errors, nothing *at all*). Everything is full-duplex, flow control is off on everything (any thoughts about that?), and all of the cabling was tested before installation. In short, the network looks right.
Eventually the filer and DB will be connected directly to each other, but I have trouble believing that this will solve the current problems considering there is currently no congestion in the network.
The DB has all of the recent patches, but no optimization has been done.
What am I missing?
-Brian
foo wrote:
[snip]
The network at the moment is like this:
[760]                 <- flow control needs to be turned ON
  |
Gig-E
  |
[Foundry FastIron]    <- flow control needs to be turned ON on this port
  |
100Mb/s Ethernet
  |
[Foundry FastIron]    <- flow control needs to be turned ON
  |
Gig-E
  |
[Sun e4500]           <- flow control needs to be turned ON
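Assuming a Data ONTAP release and FastIron firmware that support these knobs (the command names below are typical for these platforms but may differ on this hardware and software vintage), enabling flow control would look roughly like:

```shell
# On the filer console (Data ONTAP; the interface name e0 is an assumption):
ifconfig e0 flowcontrol full

# On each FastIron, per interface (Foundry CLI; syntax may vary by firmware):
#   configure terminal
#   interface ethernet 1/1
#   flow-control
```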
My guess is that the bottleneck is the 100BT link sandwiched between Gig-E links that are 10x faster on both sides: the FastIron can't forward packets out the 100Mb/s port as fast as it receives them on the 1000Mb/s port from the filer/e4500.

Usually it's done the other way around (100BT -- 1000SX pipe -- 100BT); the topology above is not 'appropriate' for a server network!

I'd guess there are lots of drops/retransmits at both ends ('netstat -s').
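Concretely, the counters worth watching on the Solaris client side (standard Solaris tools; exact counter names vary by release, so treat these as a sketch):

```shell
# UDP-level drops on the Solaris client:
netstat -s | grep -i udp     # watch udpInOverflows in particular

# RPC-level retransmits/timeouts for NFS on the client:
nfsstat -rc                  # watch retrans, badxid, timeout
```

A climbing udpInOverflows count alongside the "not responding" messages would point at the client or the 100Mb hop dropping reply traffic, which fits the Gig-E-into-100BT funnel described above.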
Note: this is my first guess based on the info in this thread.