Good to know, I spent most of my day testing yesterday and noticed that Jumbo frames helped a little bit but not a whole lot. Here are my results:
1) Equipment & Lab Environment Configuration
* NFS Volume = 1 x 25GB NFS share (aggr1) on 6080 #1 (oplocks disabled)
* Mounted to the test server with the RHEL default mount options, which are:
  (rw,vers=3,rsize=65536,wsize=65536,hard,proto=tcp,timeo=600,retrans=2,sec=sys)
* Volume was mounted to "/mnt" on the test server (mount command sketched below)
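In other words, the share was attached with a plain NFSv3 mount along these lines (filer hostname and export path below are placeholders, not the real names):

mount -t nfs -o rw,vers=3,proto=tcp,hard,rsize=65536,wsize=65536,timeo=600,retrans=2 \
    filer1:/vol/testvol /mnt     # "filer1" and "/vol/testvol" are placeholder names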
2) Server Hardware & OS Configuration
* Server Hardware = 1 x 3850 X5 with the following specs:
  * HOSTNAME: testserver
  * CPU: 2 x Intel E7540 @ 2.00GHz
  * RAM: 128GB
  * Local HDD: 2 x 146GB (RAID1)
  * 1GbE NIC: Intel Quad-port 1GbE 82580
  * 10GbE NIC: Intel 10GbE x520 SFP+ PCIe 2.0 card
* Server OS
  * Version: RHEL 5.7+ with the "kernel-2.6.18-308.4.1.el5" (5.8) kernel - required by the Intel 10GbE NIC
  * Each NIC had two active ports which were combined to form an active/passive bond known as "bond1" (jumbo-frame MTU handling sketched below)
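For the jumbo-frame runs below, the only interface-level change was raising the MTU on the bond to 9000, roughly:

ifconfig bond1 mtu 9000     # or MTU=9000 in ifcfg-bond1 to make it persistent across reboots
ip link show bond1          # confirm the new MTU took effect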
3) testserver to 6080 #1 Network Throughput Test with 5GB file creation (dd if=/dev/zero of=/mnt/testfile bs=1024 count=5242880)
* 1GbE w/o Jumbo Frames (old storage network that talks to 2x1GbE interfaces on 6080 #1) = 5368709120 bytes (5.4 GB) copied, 99.0798 seconds, 54.2 MB/s
* 1GbE w/o Jumbo Frames (new storage network that talks to 2x10GbE interfaces on 6080 #1) = 5368709120 bytes (5.4 GB) copied, 70.8844 seconds, 75.7 MB/s
* 1GbE + Jumbo Frames = 5368709120 bytes (5.4 GB) copied, 67.1208 seconds, 80.0 MB/s
* 10GbE w/o Jumbo Frames = 5368709120 bytes (5.4 GB) copied, 45.9469 seconds, 117 MB/s
* 10GbE + Jumbo Frames = 5368709120 bytes (5.4 GB) copied, 38.8961 seconds, 138 MB/s
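One caveat on the figures above: dd without a flush can report a rate that still includes data buffered in the client's page cache, so a variant with an explicit flush is worth comparing (conv=fsync is standard GNU dd; otherwise the command is the same as above):

dd if=/dev/zero of=/mnt/testfile bs=1024 count=5242880 conv=fsync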
4) testserver to 6080 #1 Network Throughput Test + RHEL OS tweaking with 5GB file creation (dd if=/dev/zero of=/mnt/testfile bs=1024 count=5242880)
All runs used 10GbE + Jumbo Frames; each change below was applied cumulatively on top of the kernel defaults and all previous changes:
* sunrpc kernel module parameter change = 5368709120 bytes (5.4 GB) copied, 39.0274 seconds, 138 MB/s
  * Echoed "options sunrpc tcp_slot_table_entries=128" to "/etc/modprobe.d/sunrpc.conf" according to https://access.redhat.com/knowledge/solutions/69275 & rebooted
* "net.core.rmem_default = 262144" = 5368709120 bytes (5.4 GB) copied, 39.0149 seconds, 138 MB/s
* "net.core.rmem_max = 16777216" = 5368709120 bytes (5.4 GB) copied, 32.7018 seconds, 164 MB/s
* "net.core.wmem_default = 262144" = 5368709120 bytes (5.4 GB) copied, 33.245 seconds, 161 MB/s
* "net.core.wmem_max = 16777216" = 5368709120 bytes (5.4 GB) copied, 33.2526 seconds, 161 MB/s
* "net.ipv4.tcp_rmem = 4096 262144 16777216" = 5368709120 bytes (5.4 GB) copied, 35.2615 seconds, 152 MB/s
* "net.ipv4.tcp_wmem = 4096 262144 16777216" = 5368709120 bytes (5.4 GB) copied, 33.0321 seconds, 163 MB/s
* "net.ipv4.tcp_window_scaling = 1" = 5368709120 bytes (5.4 GB) copied, 33.5698 seconds, 160 MB/s
* "net.ipv4.tcp_syncookies = 0" = 5368709120 bytes (5.4 GB) copied, 32.7373 seconds, 164 MB/s
* "net.ipv4.tcp_timestamps = 0" = 5368709120 bytes (5.4 GB) copied, 34.0019 seconds, 158 MB/s
* "net.ipv4.tcp_sack = 0" = 5368709120 bytes (5.4 GB) copied, 35.3956 seconds, 152 MB/s
* "cpuspeed" service stopped = 5368709120 bytes (5.4 GB) copied, 32.8168 seconds, 164 MB/s
* "irqbalance" service stopped = 5368709120 bytes (5.4 GB) copied, 32.6841 seconds, 164 MB/s
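For anyone wanting to repeat this, each sysctl step boiled down to something like:

sysctl -w net.core.rmem_max=16777216                      # apply one tunable at a time
rm -f /mnt/testfile                                       # start each run from a clean slate
dd if=/dev/zero of=/mnt/testfile bs=1024 count=5242880    # re-run the same write test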
Regarding the test server itself, nothing is running on it except my test processes (it is a test box with the exact same hardware configuration as the Oracle EBS DB servers).
Regards,
Dan Burkland
On 5/19/12 2:01 PM, "Dalvenjah FoxFire" <dalvenjah@DAL.NET> wrote:
One more thing -- I know this may spawn debate, and I didn't do it terribly scientifically, but a year or two ago we tested jumbo frames on 1GbE links between a pair of 6070s and a couple of boxes with Intel server NICs in them, and we actually found performance to get worse (though not by much) with jumbo frames than with normal 1500-byte frames. It certainly didn't improve performance.
My theory is that all the ASIC TCP checksum offloading and such is so optimized for 1500-byte packets that once you get up to the 9000-byte frame size, things have to fall back to software, and you don't end up with a speed boost.
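If you want to check whether the offloads are actually in play, ethtool will show the current offload settings on the 10GbE port (interface name here is just an example):

ethtool -k eth2    # lists rx/tx checksumming, scatter-gather, TCP segmentation offload, etc.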
I could be wrong, but you might want to test with 1500-byte MTU set on both ends, just to see.
-dalvenjah
On May 19, 2012, at 11:46 AM, Dan Burkland wrote:
I know dd isn't the best tool since it is a single-threaded application and in no way represents the workload that Oracle will impose. However, I thought it would still give me a decent ballpark figure for throughput. I tried block sizes of 64k, 128k, and 1M (just to see) and got somewhat more promising results:
# dd if=/dev/zero of=/mnt/testfile bs=1M count=5120
5120+0 records in
5120+0 records out
5368709120 bytes (5.4 GB) copied, 26.6878 seconds, 201 MB/s
If I run two of these dd sessions at once, the throughput figure above gets cut in half (each dd session reports it creates the file at around 100MB/s).
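The two-stream run was simply two dd processes writing to separate files, roughly:

dd if=/dev/zero of=/mnt/testfile1 bs=1M count=5120 &    # file names here are for illustration
dd if=/dev/zero of=/mnt/testfile2 bs=1M count=5120 &
wait                                                    # each stream reported roughly 100MB/s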
As far as the switch goes, I have not checked it yet; however, I did notice that flow control is set to full on the 6080 10GbE interfaces. We are also running Jumbo Frames on all of the involved equipment.
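A quick sanity check that jumbo frames really survive end to end is a don't-fragment ping at the jumbo payload size (8972 = 9000 minus 28 bytes of IP/ICMP headers; substitute the filer's storage-network address):

ping -M do -s 8972 -c 3 <filer-storage-ip>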
As far as the RHEL OS tweaks go, here are the settings that I have changed on the system:
### /etc/sysctl.conf:
# 10GbE Kernel Parameters
net.core.rmem_default = 262144
net.core.rmem_max = 16777216
net.core.wmem_default = 262144
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 262144 16777216
net.ipv4.tcp_wmem = 4096 262144 16777216
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_syncookies = 0
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_sack = 0
#
###
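These take effect without a reboot via:

sysctl -p /etc/sysctl.conf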
### /etc/modprobe.d/sunrpc.conf:
options sunrpc tcp_slot_table_entries=128
###
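After a reboot the sunrpc setting can be double-checked (on RHEL 5 the value shows up under /proc/sys/sunrpc):

cat /proc/sys/sunrpc/tcp_slot_table_entries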
### Mount options for the NetApp test NFS share:
rw,vers=3,rsize=65536,wsize=65536,hard,proto=tcp,timeo=600,retrans=2,sec=sys
###
Thanks again for all of your quick and detailed responses!
Dan
On 5/19/12 1:08 PM, "Robert McDermott" <rmcdermo@fhcrc.org> wrote:
Your block size is only 1K; try increasing the block size and the throughput will increase. 1K I/Os generate a lot of IOPS with very little throughput.
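For example, the same 5GB write with a 1MB block size (count adjusted so the total size stays at 5GB) would be:

dd if=/dev/zero of=/mnt/testfile bs=1M count=5120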
-Robert
Sent from my iPhone
On May 19, 2012, at 10:48, Dan Burkland <dburklan@NMDP.ORG> wrote:
Hi all,
My company just bought some Intel x520 10GbE cards which I recently installed into our Oracle EBS database servers (IBM 3850 X5s running RHEL 5.8). As the "linux guy" I have been tasked with getting these servers to communicate with our NetApp 6080s via NFS over the new 10GbE links. I have everything working; however, even after tuning the RHEL kernel I am only getting 160MB/s writes using the "dd if=/dev/zero of=/mnt/testfile bs=1024 count=5242880" command. For those of you who run 10GbE to your toasters, what write speeds are you seeing from your 10GbE-connected servers? Did you have to do any tuning to get the best results possible? If so, what did you change?
Thanks!
Dan
Toasters mailing list
Toasters@teaparty.net
http://www.teaparty.net/mailman/listinfo/toasters