Thanks all for the quick feedback.   I thought that was a rather high number. 

Responding to some of the questions/comments:

Tony:
Is there a specific reason you need flow control enabled? 
- No. My best guess is that the array was set up without reading the Ethernet Best Practices doc in advance, and the default flow control settings were rolled out. There is no FCoE, so no need for PFC. This particular array is just serving NFS and CIFS over the 10g interfaces, so I see no reason to be out of alignment with the best practices.

- I have seen the doc you linked to. Kudos to whoever authored it - I really like that doc. The Ethernet Best Practices doc is also really well written, IMO.

Tmac:
I do realize that all protocols higher in the stack will be affected.      I am always concerned when I look at the L2 flow control numbers.

Francis:
This is most likely a case of going with the default configuration as well.   :-(  Thanks for the advance notice on CDOT.  Not at CDOT yet but I will definitely make a note of this!


On Fri, Apr 17, 2015 at 12:30 PM, Tony Bar <tbar@berkcom.com> wrote:

Phil –

 

In my opinion this is a very high number of pause frames. Per this KB (https://kb.netapp.com/support/index?id=3013252&page=content&locale=en_US), each pause frame can add up to 3.355 ms of latency, which can definitely cause problems with iSCSI or FCoE traffic.
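For reference, the 3.355 ms figure can be reproduced from the way 802.3x PAUSE frames encode their duration: a 16-bit pause time counted in quanta of 512 bit times. The Python below is only a sketch under that assumption; the function name is made up for illustration and is not taken from the KB.

```python
# Sketch: longest pause a single 802.3x PAUSE frame can request.
# Assumes pause_time is a 16-bit count of quanta, one quantum = 512 bit times.
QUANTUM_BITS = 512
MAX_QUANTA = 0xFFFF  # largest value a 16-bit field can hold


def max_pause_seconds(link_bps: float, quanta: int = MAX_QUANTA) -> float:
    """Return the pause duration in seconds for a given link speed."""
    return quanta * QUANTUM_BITS / link_bps


if __name__ == "__main__":
    for name, bps in [("1GbE", 1e9), ("10GbE", 10e9)]:
        print(f"{name}: up to {max_pause_seconds(bps) * 1e3:.3f} ms per PAUSE frame")
```

On a 10g link this works out to about 3.355 ms, matching the KB; the same frame on a 1g link could request ten times as long.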

 

Is there a specific reason you need flow control enabled? It is against best practice not only for NetApp but generally for any block storage that uses Ethernet as a transmission medium. Unless there’s some specific reason you’re enabling it, I would follow the recommendations and turn it off. Remember, you want to keep latency under 5 ms for block storage, especially for database applications.

 

Good luck to you!

 

Anthony Bar | Director of Engineering

650.207.5368 | tbar@berkcom.com

 

Berkeley Communications | www.berkcom.com

NetApp | Cisco | VMware | SuperMicro | Big Data & Analytics | HPC

 

From: toasters-bounces@teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of Philbert Rupkins
Sent: Friday, April 17, 2015 10:23 AM
To: toasters@teaparty.net
Subject: Excessive Flow Control Frames?

 

Toasters,

 

We have Ethernet flow control set to full on our NetApp's 10g interfaces. The FEX ports to which our 10g interfaces are connected also have flow control TX/RX enabled. I realize enabling flow control is against best practices.

I am trying to determine if we are seeing an excessive number of flow control frames. We received over 12,000 PAUSE frames from the FEX within a 10-minute period, which sounds excessive to me. Can anybody confirm that 12,000 is an excessive number of PAUSE frames that would likely result in connectivity issues?

Here are statistics for one of our 10g interfaces during a recent 10-minute interval. Notice that the NetApp is receiving PAUSE frames (RECEIVE-Xoff) from the FEX and never transmitting them (TRANSMIT-Xoff).

 

RECEIVE
 Jabber:              0  | Bus overruns:        0  | Xon:             12412
 Xoff:            12412  | Jumbo:               0
TRANSMIT
 Queue overflows:     0  | No buffers:          0  | Xon:                 0
 Xoff:                0  | Jumbo:               0  | TSO non-TCP drop:    0

 

What I see is the FEX telling the NetApp to slow down. I just don't know if the FEX is slowing down the NetApp enough that it would cause problems.
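One way to put a rough ceiling on the impact is to turn the counters above into a rate and a worst-case paused time. The Python below is only a back-of-the-envelope sketch: it assumes every received Xoff asked for the longest pause an 802.3x frame can carry (65535 quanta of 512 bit times), which the counters themselves do not record, and the function name is hypothetical.

```python
# Sketch: worst-case transmit time lost to received PAUSE (Xoff) frames.
# Assumption: each Xoff requested the maximum pause (65535 quanta x 512 bit times).
XOFF_RECEIVED = 12412   # RECEIVE Xoff counter from the stats above
INTERVAL_S = 10 * 60    # 10-minute sampling interval
LINK_BPS = 10e9         # 10g interface


def worst_case_pause(xoff_frames: int, interval_s: float, link_bps: float):
    """Return (frames per second, worst-case paused seconds, paused fraction)."""
    per_frame_s = 0xFFFF * 512 / link_bps
    paused_s = xoff_frames * per_frame_s
    return xoff_frames / interval_s, paused_s, paused_s / interval_s


if __name__ == "__main__":
    rate, paused_s, fraction = worst_case_pause(XOFF_RECEIVED, INTERVAL_S, LINK_BPS)
    print(f"Xoff rate:         {rate:.1f} frames/s")
    print(f"Worst-case paused: {paused_s:.1f} s of {INTERVAL_S} s ({fraction:.1%})")
```

Under that assumption the interface would have been held off for at most about 42 of the 600 seconds, or roughly 7% of the interval. Since the Xon counter matches the Xoff counter, the pauses were probably being released early, so the real figure is likely lower; the counters alone cannot say by how much.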

 

Have a great day,
Phil