It should be nearly painless to turn off flow control (both send and receive) on the filers and the switches.
If flow control gets triggered on the switch, it could (and likely will) propagate to the NetApp, and the interface will basically stop sending until the switch tells it to start again.

I know there may be other bugs out there, but if you are using 10 GigE, it is certainly worth a shot to turn off all flow control.
Disabling it is the current best practice.
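
To put a rough number on "the interface will basically stop": a minimal Python sketch, assuming standard IEEE 802.3x PAUSE frames (a 16-bit pause time counted in quanta of 512 bit times). The function name and the link speeds are illustrative only, and a congested switch can keep re-sending PAUSE frames, so a real stall can last far longer than one frame's worth.

```python
# Assumption: standard IEEE 802.3x PAUSE semantics, not anything NetApp-specific.
PAUSE_QUANTUM_BITS = 512   # one pause quantum is 512 bit times
MAX_PAUSE_QUANTA = 0xFFFF  # pause time is a 16-bit field


def pause_seconds(quanta, link_bps):
    """Seconds a link partner is asked to stay quiet by one PAUSE frame."""
    if not 0 <= quanta <= MAX_PAUSE_QUANTA:
        raise ValueError("pause time must fit in 16 bits")
    return quanta * PAUSE_QUANTUM_BITS / link_bps


if __name__ == "__main__":
    # Hypothetical link speeds: 1 GigE and 10 GigE.
    for name, bps in (("1 GigE", 10**9), ("10 GigE", 10**10)):
        ms = pause_seconds(MAX_PAUSE_QUANTA, bps) * 1000
        print(f"{name}: longest single pause is {ms:.3f} ms")
```

A PAUSE frame with a pause time of zero is the "start again" signal; until that arrives or the timer runs out, the paused side holds its transmit queue.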

--tmac

Tim McCarthy
Principal Consultant

          

NCDA ID: XK7R3GEKC1QQ2LVD (Clustered ONTAP), expires 08 November 2014
RHCE6 110-107-141, current until Aug 02, 2016
NCSIE ID: C14QPHE21FR4YWD4 (Clustered ONTAP), expires 08 November 2014



On Fri, Mar 28, 2014 at 10:52 AM, Mark Flint <mf1@sanger.ac.uk> wrote:
We’re not using DNFS… or PNFS :)


Mark Flint
mf1@sanger.ac.uk



On 28 Mar 2014, at 13:15, Steiner, Jeffrey <Jeffrey.Steiner@netapp.com> wrote:

> That message guarantees that NFS is flushing unacknowledged NFS operations; the only question is why. If you're not having frequent power failures of your database servers (I hope that's a safe assumption!) then you're almost certainly hitting the known DNFS issue.
>
> I strongly recommend getting to 11.2.0.4 if you're using DNFS. DNFS has a deadlock issue where you'll see these nfsd.tcp.close.idle warnings frequently, usually with stalls in IO that can last a couple of minutes. I can't think of any risk of upgrading to 10Gb. In addition, I would recommend patching ONTAP up to 7.3.7P2 in order to get an ONTAP patch related to NFS flow control.
>
> If all you're seeing is latency spikes, that's probably a different issue. These NFS flow control messages are usually associated with total hangs that last up to 2 minutes, although not usually that bad.
>
> Don't let this scare you away from DNFS, though. The bugs in question existed for many years and nobody noticed until recently. They're extremely rare.
>
> -----Original Message-----
> From: toasters-bounces@teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of Martin
> Sent: Friday, March 28, 2014 2:06 PM
> To: toasters@teaparty.net
> Subject: RE: NFS fails?
>
> Interesting thread. I've got a similar situation with a 3140 running 7.3.6P2 connected to an Oracle host over 1GbE using NFS, which is showing spikes in latency on the host. The Oracle host is showing dropped packets on its storage interface and I am seeing lots of messages logged like:
>
> Mon Mar 21 12:06:33 GMT [Filer1: nfsd.tcp.close.idle.notify:warning]:
> Shutting down idle connection to client (x.x.x.x) where transmit side flow control has been enabled. There are 131 outstanding replies queued on the transmit buffer. This socket is being closed from the deferred queue.
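
For anyone wanting to tally these warnings per client, here is a minimal Python sketch. The pattern is modeled only on the single sample message quoted above, so other ONTAP releases may word it differently; the function name `parse_idle_close` is made up for illustration.

```python
import re

# Assumption: the message layout matches the one sample quoted in this thread.
IDLE_CLOSE = re.compile(
    r"\[(?P<filer>[^:\]]+):\s*nfsd\.tcp\.close\.idle\.notify:warning\]:\s*"
    r"Shutting down idle connection to client \((?P<client>[^)]+)\).*?"
    r"There are (?P<queued>\d+) outstanding replies",
    re.DOTALL,  # the message wraps across lines in the log
)


def parse_idle_close(text):
    """Return (filer, client, queued_replies) for each warning found in text."""
    return [
        (m.group("filer"), m.group("client"), int(m.group("queued")))
        for m in IDLE_CLOSE.finditer(text)
    ]


if __name__ == "__main__":
    sample = (
        "Mon Mar 21 12:06:33 GMT [Filer1: nfsd.tcp.close.idle.notify:warning]:\n"
        "Shutting down idle connection to client (x.x.x.x) where transmit side "
        "flow control has been enabled. There are 131 outstanding replies queued "
        "on the transmit buffer. This socket is being closed from the deferred "
        "queue.\n"
    )
    print(parse_idle_close(sample))
```

Grouping the results by client and timestamp should make it easier to see whether the closes line up with the latency spikes on the host.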
>
> My thought was that the Oracle host's interface is saturated and it's not responding to the NFS acknowledgements in time, so the NetApp is dropping the NFS requests.
>
> The 1GbE interface is being upgraded on the Oracle host, but one of my concerns is hitting bugs that have been fixed in later 8.1.x releases once we remove the bottleneck on the Oracle host, particularly the DNFS and load-related bugs. I then read your comment:
>
> "The only time I’ve seen this issue occur, other than an actual total failure of network connectivity, is with some Oracle DNFS bugs."
>
> Is it possible to confirm whether this is simply the Filer flushing unacknowledged NFS requests or if this is actually the DNFS bug?
>
>
>
> --
> View this message in context: http://network-appliance-toasters.10978.n7.nabble.com/NFS-fails-tp25611p25616.html
> Sent from the Network Appliance - Toasters mailing list archive at Nabble.com.
>
> _______________________________________________
> Toasters mailing list
> Toasters@teaparty.net
> http://www.teaparty.net/mailman/listinfo/toasters



--
 The Wellcome Trust Sanger Institute is operated by Genome Research
 Limited, a charity registered in England with number 1021457 and a
 company registered in England with number 2742969, whose registered
 office is 215 Euston Road, London, NW1 2BE.
