Hello,
We are new to the NetApp and are having the problem that while copying from the NetApp to either an NT (CIFS) or Sun Solaris 2.6 box (NFS), the throughput ranges from 24K/sec to 79K/sec. Copying from an NT or Sun up to the NetApp is wonderfully fast (we have trunked the Quad Ethernet). I only know NFS but am working with the NT guys. On the NFS side, I tried a larger transfer size (32K instead of 8K), I tried versions 2 and 3, I tried UDP and TCP, and I tried various combinations and permutations. I tried all this even though I believe it must be some configuration parameter on the NetApp *common* to both the CIFS stuff and NFS.
We are currently running 5.3.1, so please let me know if you think 5.3.4 will help or even fix it. The two of us searched NOW and found CIFS- or NFS-specific troubleshooting material, which we tried. We also looked through the archives. We are stumped. I am wondering if we forgot to configure something obvious. We have not done a lot of "tuning" on the NetApp; most of the posts here refer to tuning tips on the client side, and I am going to try some of that too, but first the basic problem needs to be resolved. I have included the output of the options command in the hopes that something there will stand out.
thanks for any help/advice you can give
Adam
netapp1> options
autosupport.doit DONT
autosupport.enable on
autosupport.from adamh@my.dom.ain
autosupport.mailhost mailhost.my.dom.ain
autosupport.noteto
autosupport.to autosupport@netapp.com,adam@my.dom.ain,etc.
cifs.access_logging.enable off
cifs.access_logging.file_name /etc/log/adtlog.evt
cifs.bypass_traverse_checking on
cifs.guest_account
cifs.home_dir /vol/vol0/home
cifs.idle_timeout 1800
cifs.netbios_aliases
cifs.oplocks.enable on
cifs.perm_check_use_gid on
cifs.save_case off
cifs.scopeid
cifs.search_domains
cifs.show_snapshot off
cifs.symlinks.cycleguard on
cifs.symlinks.enable on
cifs.trace_login off
console.encoding nfs
dns.domainname my.dom.ain
dns.enable on
ftpd.enable off
httpd.admin.enable on
httpd.enable off
httpd.log.max_file_size 2147483647
httpd.rootdir XXX
httpd.timeout 900
httpd.timewait.enable off
https.admin.enable off
ip.match_any_ifaddr on
ip.path_mtu_discovery.enable on
nfs.mount_rootonly on
nfs.per_client_stats.enable on
nfs.tcp.enable on
nfs.udp.xfersize 32768
nfs.v2.df_2gb_lim off
nfs.v3.enable off
nfs.webnfs.enable off
nfs.webnfs.rootdir XXX
nfs.webnfs.rootdir.set off
nis.domainname
nis.enable off
pcnfsd.enable off
pcnfsd.umask 22
raid.reconstruct_speed 4
raid.scrub.enable on
raid.timeout 24
rsh.enable on
snmp.enable on
ssh.enable off
telnet.enable on
telnet.hosts *
timed.enable off
timed.log off
timed.max_skew 30m
timed.proto ntp
timed.sched hourly
timed.servers
vol.copy.throttle 10
wafl.convert_ucode off
wafl.create_ucode off
wafl.default_nt_user
wafl.default_unix_user pcuser
wafl.maxdirsize 10240
wafl.nt_admin_priv_map_to_root on
wafl.root_only_chown on
wafl.wcc_minutes_valid 20
netapp1>
email: a d a m - s (a t) p a c b e l l d o t n e t
adam-s@pacbell.net writes:
We are new to the Netapp and are having the problem that while copying from the Netapp to either an NT (CIFS) or Sun Solaris 2.6 box (NFS), the throughput ranges from 24K/sec to 79K/sec. Copying from an NT or Sun is wonderfully fast (we have trunked the Quad Ethernet).
That sounds like a duplex mismatch between the filer and the other end (switch?). Ensure that you explicitly set each end to the same speed and duplex (e.g., 100baseT full duplex), because the filer doesn't autonegotiate very well (if at all?). On the filer you'd use something like:

ifconfig e0 mediatype 100tx-fd aa.bb.cc.dd netmask nn.oo.pp.qq
Hope that helps, Luke.
On Thu, 20 Jan 2000 adam-s@pacbell.net wrote:
We are new to the Netapp and are having the problem that while copying from the Netapp to either an NT (CIFS) or Sun Solaris 2.6 box (NFS), the throughput ranges from 24K/sec to 79K/sec. Copying from an NT or Sun is wonderfully fast (we have trunked the Quad Ethernet).
The immediate suggestion is to check the port speed and duplex settings on the Ethernet interfaces of your filer, your NT and Solaris clients, and your switch. Force them all to full duplex 100 Mbps and see what happens. You may also want to try backing down to just a single 100baseT link for testing (to eliminate the possibility that trunking is causing problems).
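To make the suggestion concrete, the per-platform commands look roughly like this; interface names (e0, hme0), the switch syntax, and the port number are examples and will vary with your hardware:

```shell
# On the filer: force 100baseT full duplex on interface e0
# (add the same line to /etc/rc to make it survive a reboot)
ifconfig e0 mediatype 100tx-fd

# On the Solaris 2.6 client: turn off autonegotiation on the hme
# interface and advertise only 100 Mb/s full duplex
ndd -set /dev/hme adv_autoneg_cap 0
ndd -set /dev/hme adv_100hdx_cap 0
ndd -set /dev/hme adv_100fdx_cap 1

# On a Cisco switch port (syntax differs by vendor and model):
#   set port speed 2/1 100
#   set port duplex 2/1 full
```

The key point is that every device on the link must agree; one side at auto and the other forced is exactly the mismatch that produces tens-of-K/sec throughput.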
Thanks to Brian, Luke, and Michael for their speedy and dead-on replies, as well as Robert and Keith who emailed me directly.
The problem was indeed that we had all our switches and hosts forced to full duplex, while the NetApp was still at auto. The problem is now fixed. I only regret that whatever keywords I used to search the NOW site and mailing-list archives failed to turn up what is clearly a F.A.Q.!
We did the ifconfig with the full mediatype on the trunk itself, and the 4 individual Ethernets "automagically" changed to forced full. Please let me know if this is not the optimal way to change these. (This change, and one for the single Ethernet d0, are now in the rc file.)
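For anyone following along, the persistent form in the filer's /etc/rc looks something like this; the trunk name, interface names, and addresses here are placeholders from our setup, not necessarily yours:

```shell
# /etc/rc fragment: force full duplex on the trunk interface --
# its member ports pick up the setting -- and on the standalone
# interface d0 (names and addresses are placeholders)
ifconfig trunk1 mediatype 100tx-fd 192.168.1.10 netmask 255.255.255.0
ifconfig d0 mediatype 100tx-fd 192.168.1.11 netmask 255.255.255.0
```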
By the way, do most people using NFS use TCP instead of UDP because there is more to tune/tweak with TCP? I am trying to get this to become an Oracle backend, but copies currently average 1MB/sec, and this is on an 800MB trunk. I was following some recent discussions regarding tuning and tried some of the suggested values, first for UDP and then for TCP; the TCP-based transfers to a Sun 2.6 Ultra 60 were faster than UDP. We have a very new and fast 100baseT switched network, so the docs would imply we should go with UDP for the lower overhead, but I guess if there is more to tune for one's particular client on TCP, then that would be the optimum protocol.
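For reference, the variations I tried on the Solaris side look roughly like this; the filer name, export path, and mount point are placeholders:

```shell
# Solaris mount examples (filer/volume/mount-point names are placeholders).
# NFSv3 over UDP with a 32K transfer size:
mount -F nfs -o vers=3,proto=udp,rsize=32768,wsize=32768 \
    netapp1:/vol/vol0 /mnt/test

# The same export over TCP, for comparison:
mount -F nfs -o vers=3,proto=tcp,rsize=32768,wsize=32768 \
    netapp1:/vol/vol0 /mnt/test

# Dropping to NFSv2 (8K maximum transfer size):
mount -F nfs -o vers=2,proto=udp netapp1:/vol/vol0 /mnt/test
```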
Running Version 2 or Version 3 seemed to make no difference - does anyone have any opinions or options that would suggest otherwise?
Anyhow, thanks again to all who got us back on the right track.
Adam
On Thu, 20 Jan 2000 07:34:11 -0500 (EST), you wrote:
On Thu, 20 Jan 2000 adam-s@pacbell.net wrote:
We are new to the Netapp and are having the problem that while copying from the Netapp to either an NT (CIFS) or Sun Solaris 2.6 box (NFS), the throughput ranges from 24K/sec to 79K/sec. Copying from an NT or Sun is wonderfully fast (we have trunked the Quad Ethernet).
The immediate suggestion is to check the port speed and duplex settings on the Ethernet interfaces of your filer, your NT and Solaris clients, and your switch. Force them all to full duplex 100 Mbps and see what happens. You may also want to try backing down to just a single 100baseT link for testing (to eliminate the possibility that trunking is causing problems).
In article 388b7f3e.460606@mail.pacbell.net, adam-s@pacbell.net wrote:
By the way, do most people using NFS use TCP instead of UDP because there is more to tune/tweak with TCP?
UDP will always be faster over lightly loaded LANs; these days, with switched 100bT at the entry level, the only time to use TCP is over WAN links.
Running Version 2 or Version 3 seemed to make no difference - does anyone have any opinions or options that would suggest otherwise?
We have found version 2 slightly faster than version 3 for our workloads; having said that, we are slightly atypical. Unless you are doing lots of small I/Os, v3 will usually be slightly faster due to the increased block sizes.
adam-s@pacbell.net writes:
By the way, do most people using NFS use TCP instead of UDP because there is more to tune/tweak with TCP? I am trying to get this to become an Oracle backend, but copies currently average 1MB/sec and this is on an 800MB trunk.
I assume by `800MB trunk' you mean a quad 100bT trunked at full duplex?
Just a point to consider: you will not get more than 100Mb/s in any one direction between any two clients. I.e., the maximum you can generally get between any two hosts connected via a trunk is the bandwidth of the individual links that make up the trunk.
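The arithmetic behind that point can be sketched as follows (pure back-of-the-envelope math, no vendor specifics assumed; note that vendors often double the aggregate again for full duplex, which is where "800Mb" comes from):

```python
# A single conversation hashes onto one trunk member, so its ceiling
# is that member link's speed; only many concurrent conversations can
# approach the aggregate figure the vendor advertises.

def trunk_limits(link_mbps, links):
    """Return (aggregate one-way bandwidth, per-flow ceiling) for a
    trunk that assigns each conversation to a single member link."""
    return link_mbps * links, link_mbps

aggregate, per_flow = trunk_limits(100, 4)
print(aggregate)  # 400 -- the advertised one-way "trunk" bandwidth
print(per_flow)   # 100 -- what any single client-to-filer copy can see
```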
(On a side note, I also find it somewhat disingenuous that switch vendors advertise their ports as 100bT or 1000-SX, and then claim they have an 800Mb trunk (4 x 100bT full duplex) or a 4Gb trunk (2 x 1000-SX).)
I believe this is a limitation of all the current trunking technologies (at least, the one that Sun, Cisco, NetApp, and 3Com use for 100bT in any case).
IMHO, if the filer is to be used primarily as a backend for one large Oracle box, and you need bandwidth between them, put gigabit NICs in at each end. You could even just use a crossover fibre cable for a point-to-point connection if you didn't need gigabit between anything else at the start (and use the onboard 100bT to talk to the other clients).
Luke.
On Sat, 22 Jan 2000, Luke Mewburn wrote:
I believe this is a limitation of all the current trunking technologies (at least, the one that Sun, Cisco, NetApp, and 3Com use for 100bT in any case).
Whoa! Sun supports round-robin on its trunks, so you CAN get 400Mbps in one direction. NetApp will allegedly do whatever the link partner will do. Thus, although I haven't verified it, if you connect a Sun and a NAC back to back, you should get 400Mbps.
What's really interesting is that the cost difference on the Sun client side favors gigabit; i.e., last time I checked, the qfe and trunking software were significantly more expensive than the gigabit card. I haven't looked into switch costs, as in my organization the network falls under the jurisdiction of a separate entity.
Tom
----- Original Message -----
From: adam-s@pacbell.net
To: toasters@mathworks.com
Sent: Friday, January 21, 2000 7:55 AM
Subject: Re: slow copies from Netapp, very fast to copy up to the Netapp
By the way, do most people using NFS use TCP instead of UDP because there is more to tune/tweak with TCP? I am trying to get this to become an Oracle backend, but copies currently average 1MB/sec, and this is on an 800MB trunk. I was following some recent discussions regarding tuning and tried some of the suggested values, first for UDP and then for TCP; the TCP-based transfers to a Sun 2.6 Ultra 60 were faster than UDP. We have a very new and fast 100baseT switched network, so the docs would imply we should go with UDP for the lower overhead, but I guess if there is more to tune for one's particular client on TCP, then that would be the optimum protocol.
Running Version 2 or Version 3 seemed to make no difference - does anyone have any opinions or options that would suggest otherwise?
UDP is almost always going to be faster than TCP on modern LANs. Version 2 vs. 3 is something of a toss-up... it depends on the ops mix of the client and how good its v3 implementation is. I would use v3 first, and then only switch to v2 if there was a problem.
Bruce
Adam,
You're not alone. After you eliminate duplex, cabling and client patches, if you still have slow performance, then we may have common problems.
adam-s@pacbell.net wrote:
Hello,
We are new to the NetApp and are having the problem that while copying from the NetApp to either an NT (CIFS) or Sun Solaris 2.6 box (NFS), the throughput ranges from 24K/sec to 79K/sec. Copying from an NT or Sun up to the NetApp is wonderfully fast (we have trunked the Quad Ethernet). I only know NFS but am working with the NT guys. On the NFS side, I tried a larger transfer size (32K instead of 8K), I tried versions 2 and 3, I tried UDP and TCP, and I tried various combinations and permutations. I tried all this even though I believe it must be some configuration parameter on the NetApp *common* to both the CIFS stuff and NFS.