Has anyone who uses BudTool for backups gotten any details about the option to upgrade to NetWorker? I got some information from PDC, but other than the usual marketing hype about how wonderful everything is, the only substantive detail about upgrading and costs is this:
"Companies that buy BudTool can upgrade to the NDMP Connection and a similarly configured NetWorker Server within one year after general availability, at no additional cost."
OK, but what about existing customers? I looked on their web site and found about the same marketing fluff with no substance. Before many companies consider migrating, the very first question will always be "HOW MUCH??" That's my question too - we all try to avoid budget fights. ;p
----------- Jay Orr Systems Administrator Fujitsu Nexion Inc. St. Louis, MO
We have been having some performance issues lately, and I was looking through some documentation that I got when I went to the NetApp 202 class. It says that the udp window should be 32k unless you have an FDDI card or rampant packet loss. You set this with:
options nfs.udp.xfersize <value>
Has anyone had any experience setting this? Can we do it on a production filer without crashing? And can we set it back, again without crashing, if it does something horrid?
Paul Taylor Sr. Systems Engineer Connectria 215-841-5540
I'm not an expert on this, but quick inspection shows that our production engineering filers are using 32kB UDP transfers.
John
On Mon, 10 Jan 2000, Paul Taylor wrote:
We have been having some performance issues lately, and I was looking through some documentation that I got when I went to the NetApp 202 class. It says that the udp window should be 32k unless you have an FDDI card or rampant packet loss. You set this with:
options nfs.udp.xfersize <value>
Has anyone had any experience setting this? Can we do it on a production filer without crashing? And can we set it back, again without crashing, if it does something horrid?
Paul Taylor Sr. Systems Engineer Connectria 215-841-5540
On Mon, 10 Jan 2000, Paul Taylor wrote:
options nfs.udp.xfersize <value>
Has anyone had any experience setting this? Can we do it on a production filer without crashing? And can we set it back, again without crashing, if it does something horrid?
I've never had any problems setting this value on-the-fly. On Solaris, use "nfsstat -m" to verify that the client is in fact using 32K block sizes (you will have to remount).
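For anyone following along, a rough sketch of the whole sequence, assuming a hypothetical mount point /mnt/filer and an export of /vol/vol0 (the value is in bytes):

  filer> options nfs.udp.xfersize                 (prints the current setting)
  filer> options nfs.udp.xfersize 32768           (32 KB UDP transfers)

  client# umount /mnt/filer
  client# mount -F nfs -o vers=3,proto=udp,rsize=32768,wsize=32768 filer:/vol/vol0 /mnt/filer
  client# nfsstat -m /mnt/filer                   (confirm the rsize/wsize actually in effect)

Note that NFS v2 is capped at 8 KB transfers, so the larger size only comes into play for v3 mounts.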
Paul Taylor wrote:
We have been having some performance issues lately, and I was looking through some documentation that I got when I went to the NetApp 202 class. It says that the udp window should be 32k unless you have an FDDI card or rampant packet loss. You set this with:
options nfs.udp.xfersize <value>
Has anyone had any experience setting this? Can we do it on a production filer without crashing? And can we set it back, again without crashing, if it does something horrid?
I had no trouble enabling it on a production filer (F760 cluster, in this case). It helped only NFS v3. In fact, our news farm runs NFS v2 on UDP because of horrible performance running any form of NFS v3 between Solaris 2.6 and the filers. I have one machine tightened to 512-byte transfers because of poor performance at bigger transfer sizes.
I'd welcome discussion on this, if anyone wants to take it to a new thread.
tkaczma@gryf.net wrote:
On Tue, 11 Jan 2000, Michael S. Keller wrote:
I have one machine tightened to 512-byte transfers because of poor performance at bigger transfer sizes.
I think you really have to look into your network.
There's not much to check. It goes in one switch port and out another on the same switch. The interfaces show no errors. I do have the filer trunked (EtherChannel) and my news clients have hand-tuned MAC addresses to reduce contention, since the switch does "dumb" switching based on MAC addresses instead of loads.
Can you be more specific about your MAC address trick?
Eyal.
"Michael S. Keller" wrote:
tkaczma@gryf.net wrote:
On Tue, 11 Jan 2000, Michael S. Keller wrote:
I have one machine tightened to 512-byte transfers because of poor performance at bigger transfer sizes.
I think you really have to look into your network.
There's not much to check. It goes in one switch port and out another on the same switch. The interfaces show no errors. I do have the filer trunked (EtherChannel) and my news clients have hand-tuned MAC addresses to reduce contention, since the switch does "dumb" switching based on MAC addresses instead of loads.
On Wed, 12 Jan 2000, Eyal Traitel wrote:
Can you be more specific about your MAC address trick?
I think what he means is that he has all the addresses nicely distributed over the number of trunking interfaces he has on the filer. I assume that by "dumb" he means MAC hashing, a method of distributing load among Ethernet interfaces based on the Ethernet address. Usually the last x bits are used, where 2^x ~= the number of interfaces in the trunk. I haven't found (not that I looked very hard) documentation on what happens in this scenario if one of the trunk links dies. If someone can point me to that excerpt of the standard/documentation, I'd appreciate it.
Please read on - I had missed Michael's response, which is quoted below.
"Michael S. Keller" wrote:
There's not much to check. It goes in one switch port and out another on the same switch. The interfaces show no errors.
The interfaces on both sides show no errors? How about collisions? If one side shows collisions and the other doesn't, you have yourself a duplex mismatch. I'd check for the same on the client side.
I haven't found anyone yet who has convinced me that duplex negotiation works. I've seen it NOT work too many times, and when it doesn't, performance gets worse and worse as congestion increases. I'd hard-set the switch and the boxes (including the NetApp) to full duplex, if that is in fact what your boxes support.
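For what it's worth, a sketch of what hard-setting might look like, assuming an hme interface on the Solaris side and e0 on the filer (interface names are assumptions; the corresponding switch ports need to be forced to match):

  client# ndd -set /dev/hme adv_autoneg_cap 0    # stop autonegotiating
  client# ndd -set /dev/hme adv_100fdx_cap 1     # offer only 100 Mb/s full duplex
  client# ndd -set /dev/hme adv_100hdx_cap 0
  client# ndd -set /dev/hme adv_10fdx_cap 0
  client# ndd -set /dev/hme adv_10hdx_cap 0

  filer> ifconfig e0 mediatype 100tx-fd          (force 100 Mb/s full duplex on the filer)

  client# netstat -i                             # collisions show up in the Collis column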
Make sure it isn't your clients, as well as the switch, that are having problems with large packets. There are certain options in ATM switches, if I'm recalling correctly, that will effectively reduce your maximum Ethernet frame size. That could also put a damper on large packets, since they span several maximum-sized frames.
I'm not saying that it isn't a problem with the NACs, but my experience tells me to look into the networking before jumping to any conclusions. My networking group swore that our performance problems were due to the NACs. In fact, they were so entrenched in their ideas that they didn't believe me when I told them that NACs purposely "ignore" ICMP packets under some conditions. It turned out that we had a duplex mismatch between the NACs and the switches: the NAC was happily chugging along at full duplex, which it had autonegotiated, while the switch decided that half duplex was good enough for some of the interfaces. Also, check the Ethernet card on the NAC; I've already replaced two of them - well, we have about 20 NACs, but I still didn't expect it. See what kind of performance you get with just one interface enabled at a time, and rotate through the interfaces.
Tom
tkaczma@gryf.net wrote:
On Wed, 12 Jan 2000, Eyal Traitel wrote:
Can you be more specific about your MAC address trick?
I think what he means is that he has all the addresses nicely distributed over the number of trunking interfaces he has on the filer. I assume that by "dumb" he means MAC hashing, a method of distributing load among Ethernet interfaces based on the Ethernet address. Usually the last x bits are used, where 2^x ~= the number of interfaces in the trunk. I haven't found (not that I looked very hard) documentation on what happens in this scenario if one of the trunk links dies. If someone can point me to that excerpt of the standard/documentation, I'd appreciate it.
Correct. I have two interfaces per trunk. I have two trunks per filer quad card. I have one quad card per filer. The Cisco 5505 (Ethernet in this case, not ATM) supports a maximum of four interfaces per trunk. If an interface in a two-interface trunk dies, all traffic re-routes to the remaining interface. The algorithm in the 5505 XORs the last two bits of the sending and receiving MAC addresses to determine the port number. With a large number of clients, this tends to even out. With only four clients, I had to make a truth table of desired results, then change client MAC addresses to achieve the result.
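To make the hashing concrete, here is the kind of truth table involved, using made-up MAC addresses and assuming the two-bit XOR result is reduced modulo the number of links in the trunk (two here):

  filer MAC ends in ...:a4 (last two bits 00)

  client MAC ends in   last two bits   XOR with 00   link (result mod 2)
  ...:11               01              01             1
  ...:12               10              10             0
  ...:13               11              11             1
  ...:14               00              00             0

Picking client addresses this way splits four clients evenly, two per link.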
Please read on - I had missed Michael's response, which is quoted below.
"Michael S. Keller" wrote:
There's not much to check. It goes in one switch port and out another on the same switch. The interfaces show no errors.
I haven't found anyone yet who has convinced me that duplex negotiation works. I've seen it NOT work too many times, and when it doesn't, performance gets worse and worse as congestion increases. I'd hard-set the switch and the boxes (including the NetApp) to full duplex, if that is in fact what your boxes support.
Duplex is forced full at the switch and at the clients. I fixed that months ago. netstat shows no collisions at the clients or at the filers.
Make sure it isn't your clients, as well as the switch, that are having problems with large packets. There are certain options in ATM switches, if I'm recalling correctly, that will effectively reduce your maximum Ethernet frame size. That could also put a damper on large packets, since they span several maximum-sized frames.
I'm not saying that it isn't a problem with the NACs, but my experience tells me to look into the networking before jumping to any conclusions. My networking group swore that our performance problems were due to the NACs. In fact, they were so entrenched in their ideas that they didn't believe me when I told them that NACs purposely "ignore" ICMP packets under some conditions. It turned out that we had a duplex mismatch between the NACs and the switches: the NAC was happily chugging along at full duplex, which it had autonegotiated, while the switch decided that half duplex was good enough for some of the interfaces. Also, check the Ethernet card on the NAC; I've already replaced two of them - well, we have about 20 NACs, but I still didn't expect it. See what kind of performance you get with just one interface enabled at a time, and rotate through the interfaces.
I may try that, but not immediately.
On Wed, 12 Jan 2000, Michael S. Keller wrote:
I may try that, but not immediately.
According to what you've said, everything looks right. You've done a good job, and I am as dumbfounded as you are. BTW, do you use locking via NFS? Solaris has a pretty nasty NLM bug that surfaces when used against NetApps. Look into patch 106639 for this and other NFS goodies.
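For reference, you can check whether that patch (or a later revision) is already on a Solaris box, and add it if not, with something along these lines (the -03 revision is the one that comes up later in this thread):

  # showrev -p | grep 106639
  # patchadd 106639-03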
Tom
tkaczma@gryf.net wrote:
On Wed, 12 Jan 2000, Michael S. Keller wrote:
I may try that, but not immediately.
According to what you've said, everything looks right. You've done a good job, and I am as dumbfounded as you are. BTW, do you use locking via NFS? Solaris has a pretty nasty NLM bug that surfaces when used against NetApps. Look into patch 106639 for this and other NFS goodies.
Tom
I installed 106639-03. I also installed 106641-01 and 106882-01, mentioned in the notes for 106639-03. The host already had the other patches mentioned in 106639-03's notes. I rebooted the host after patch installation.
The output of `iostat -cxn 10|awk '$8 > 100.0 {print}'` follows; I use it to flag NFS mounts whose average service time (asvc_t, column 8, in milliseconds) exceeds 100 ms. v2/UDP/512-byte transfers generally perform better than settings with larger transfer sizes or with v3. 10.16.11[89].10 is the heavily-loaded filer; 10.16.11[89].15 is the lightly-loaded filer. This host uses the lightly-loaded filer without contention from other hosts, yet it shows more slow transfer times.
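For context, a v2/UDP/512-byte mount along these lines would look something like this (the mount point is hypothetical; the export path is one from the output below):

  # mount -F nfs -o vers=2,proto=udp,rsize=512,wsize=512 10.16.118.15:/vol/vol0/News/4/4 /news/4/4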
v2/UDP/512-byte transfers:
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
1.6 0.0 0.8 0.0 0.0 0.3 0.0 163.3 0 26 10.16.118.15:/vol/vol0/News/4/4
1.6 0.0 0.8 0.0 0.0 0.3 0.0 163.6 0 26 10.16.119.15:/vol/vol0/News/4/1
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
3.7 0.0 1.8 0.0 0.0 0.7 0.0 188.8 0 70 10.16.118.15:/vol/vol0/News/4/6
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.3 0.0 0.2 0.0 0.0 0.2 0.0 673.3 0 20 10.16.119.15:/vol/vol0/News/4/3
0.3 0.0 0.2 0.0 0.0 0.4 0.0 1352.2 0 41 10.16.118.15:/vol/vol0/News/4/6
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.9 0.0 0.5 0.0 0.0 0.2 0.0 202.6 0 18 10.16.118.15:/vol/vol0/News/4/4
1.3 0.0 0.7 0.0 0.0 0.8 0.0 626.3 0 81 10.16.118.15:/vol/vol0/News/4/6
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
1.7 0.0 0.8 0.0 0.0 0.5 0.0 313.7 0 53 10.16.118.15:/vol/vol0/News/4/4
3.2 0.0 1.6 0.0 0.0 0.5 0.0 163.5 0 52 10.16.119.15:/vol/vol0/News/4/5
3.9 0.0 1.9 0.0 0.0 0.5 0.0 116.3 0 45 10.16.119.15:/vol/vol0/News/4/7
4.8 0.0 2.4 0.0 0.0 0.5 0.0 108.9 0 52 10.16.118.15:/vol/vol0/News/4/8
I had prepared more, but my mail client died and took the composition window with it. Results for v2/UDP/default sizes are quite similar to the results below for v3/TCP/default sizes.
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.1 0.0 3.2 0.0 0.0 0.1 0.0 1454.4 0 15 10.16.119.15:/vol/vol0/News/4/1
0.1 0.0 3.2 0.0 0.0 0.2 0.0 1543.0 0 15 10.16.119.15:/vol/vol0/News/4/5
0.1 0.0 3.2 0.0 0.0 0.1 0.0 1455.6 0 15 10.16.119.15:/vol/vol0/News/4/7
0.2 0.0 6.4 0.0 0.0 0.3 0.0 1346.3 0 27 10.16.119.15:/vol/vol0/News/4/9
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.2 0.0 6.4 0.0 0.0 0.2 0.0 937.6 0 17 10.16.118.15:/vol/vol0/News/4/4
0.1 0.0 3.2 0.0 0.0 0.1 0.1 999.1 0 5 10.16.118.15:/vol/vol0/News/4/8
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 0.6 0.0 4.5 0.2 0.2 322.7 396.8 19 20 md4
0.0 0.6 0.0 4.5 0.0 0.2 0.0 396.7 0 20 md6
0.1 0.1 3.2 0.8 0.0 0.1 1.6 546.1 0 11 10.16.119.15:/vol/vol0/News/4/9
0.4 0.1 12.8 1.6 0.0 0.2 0.6 317.1 0 8 10.16.118.15:/vol/vol0/News/4/8
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.1 0.0 3.2 0.0 0.0 0.2 0.0 1548.6 0 15 10.16.119.15:/vol/vol0/News/4/3
0.3 0.0 9.6 0.0 0.0 0.1 0.0 447.0 0 13 10.16.119.15:/vol/vol0/News/4/5
0.1 0.0 3.2 0.0 0.0 0.3 0.0 3035.1 0 30 10.16.118.15:/vol/vol0/News/4/6
0.1 0.0 3.2 0.0 0.0 0.3 0.0 2703.4 0 27 10.16.118.15:/vol/vol0/News/4/8
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.1 0.0 3.2 0.0 0.0 0.3 0.0 3498.9 0 35 10.16.118.15:/vol/vol0/News/4/2
0.2 0.1 6.4 0.8 0.0 0.3 0.1 1051.8 0 32 10.16.118.15:/vol/vol0/News/4/8
I appear to have a problem with the lightly-loaded filer's quad 10/100 card or the cables connecting it to the switch. I should have more of a clue about the cables by Monday.
I moved the lightly-loaded filer's traffic to the on-board 10/100 port (it's an F760). iostat looked better. After installing the Solaris 2.6 patches mentioned yesterday that address poor NFS client performance with a NetApp filer, I changed the NFS mount options to "vers=3,proto=tcp".
See the output of iostat -cxn 10|awk '$8 > 100.0 {print}':
# iostat -cxn 10|awk '$8 > 100.0 {print}'
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.1 3.9 0.8 57.3 0.1 0.9 26.8 213.7 11 49 md4
0.1 3.9 0.8 57.3 0.0 0.7 0.0 169.9 0 48 md6
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 0.7 0.0 5.6 0.1 0.3 123.0 468.0 9 21 md4
0.0 0.7 0.0 5.6 0.0 0.3 0.0 468.0 0 21 md6
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 5.6 0.0 84.8 0.1 0.7 16.3 131.4 9 24 md4
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.1 0.6 0.8 4.5 0.0 0.1 48.5 154.3 3 6 md4
0.0 0.6 0.0 4.5 0.0 0.1 0.0 171.5 0 6 md6
0.0 0.4 0.0 6.4 0.0 0.1 0.7 165.0 0 7 10.16.119.10:/vol/vol0/logs
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 5.6 0.0 75.2 0.0 0.6 6.2 115.7 3 18 md4
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
^C#
Much better.
If I request an RMA on an older quad 10/100, will I receive a newer Intel-based board?
This host continued to run fine throughout the weekend. I applied the same patches and NFS mount option changes to one of the news feeders (running bCandid's Cyclone product). It got stuck here:
Jan 14 18:06:13 news-feeder2 unix: NFS server 10.16.118.10 not responding still trying
on Friday. I rebooted it in the last hour. Both Solaris boxes used the same NFS server. I may have to drop TCP. I hope I can keep v3 and its larger transfer sizes.
"Michael S. Keller" wrote:
I appear to have a problem with the lightly-loaded filer's quad 10/100 card or the cables connecting it to the switch. I should have more of a clue about the cables by Monday.
I moved the lightly-loaded filer's traffic to the on-board 10/100 port (it's an F760). iostat looked better. After installing the Solaris 2.6 patches mentioned yesterday that address poor NFS client performance with a NetApp filer, I changed the NFS mount options to "vers=3,proto=tcp".
I had to move the news feeder back to NFS v2/UDP/8K. The reader continues to run fine with v3/TCP/32K.
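For the record, the two configurations translate to mount options roughly like these (mount points and export paths are hypothetical):

  feeder# mount -F nfs -o vers=2,proto=udp,rsize=8192,wsize=8192 10.16.118.10:/vol/vol0/News /news
  reader# mount -F nfs -o vers=3,proto=tcp,rsize=32768,wsize=32768 10.16.118.15:/vol/vol0/News/4/4 /news/4/4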
"Michael S. Keller" wrote:
This host continued to run fine throughout the weekend. I applied the same patches and NFS mount option changes to one of the news feeders (running bCandid's Cyclone product). It got stuck here:
Jan 14 18:06:13 news-feeder2 unix: NFS server 10.16.118.10 not responding still trying
on Friday. I rebooted it in the last hour. Both Solaris boxes used the same NFS server. I may have to drop TCP. I hope I can keep v3 and its larger transfer sizes.