RE: NFS fails?

28 Mar 2014


That message guarantees that NFS is flushing unacknowledged NFS operations; the only question is why. If you're not having frequent power failures on your database servers (I hope that's a safe assumption!), then you're almost certainly hitting the known DNFS issue.
I strongly recommend getting to 11.2.0.4 if you're using DNFS. Older DNFS releases have a deadlock issue where you'll see these nfsd.tcp.close.idle warnings frequently, usually with stalls in IO that can last a couple of minutes. I can't think of any risk in upgrading to 10Gb. In addition, I would recommend patching ONTAP up to 7.3.7P2 in order to get an ONTAP patch related to NFS flow control.
If all you're seeing is latency spikes, that's probably a different issue. These NFS flow control messages are usually associated with total hangs that last up to 2 minutes, although not usually that bad.
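If it helps to tell the two cases apart, here is a rough sketch for tallying these warnings per client from a saved copy of the filer's messages log. The regular expression is based only on the warning quoted in this thread, and the function name `summarize_idle_closes` is made up for illustration, so treat both as assumptions. It only counts events and the largest number of queued replies; it does not by itself confirm the DNFS bug.

```python
import re
from collections import Counter

# Pattern inferred from the single warning quoted in this thread; the exact
# layout in a real messages file may differ (assumption).
EVENT_RE = re.compile(
    r"nfsd\.tcp\.close\.idle\.notify:warning\]:\s*"
    r"Shutting down idle connection to client \((?P<client>[^)]+)\)"
    r".*?There are (?P<queued>\d+) outstanding replies",
    re.DOTALL,
)


def summarize_idle_closes(log_text):
    """Return {client: (event_count, max_queued_replies)} for the warnings found."""
    counts = Counter()
    max_queued = {}
    for match in EVENT_RE.finditer(log_text):
        client = match.group("client")
        queued = int(match.group("queued"))
        counts[client] += 1
        max_queued[client] = max(queued, max_queued.get(client, 0))
    return {client: (counts[client], max_queued[client]) for client in counts}


if __name__ == "__main__":
    # Sample text copied from the warning quoted in the original message.
    sample = (
        "Mon Mar 21 12:06:33 GMT [Filer1: nfsd.tcp.close.idle.notify:warning]:\n"
        "Shutting down idle connection to client (x.x.x.x) where transmit side "
        "flow control has been enabled. There are 131 outstanding replies queued "
        "on the transmit buffer. This socket is being closed from the deferred queue.\n"
    )
    for client, (count, worst) in summarize_idle_closes(sample).items():
        print(f"{client}: {count} event(s), up to {worst} replies queued")
```

Frequent events from the same database host would fit the pattern described above, while occasional latency spikes with no matching warnings would point elsewhere.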
Don't let this scare you away from DNFS, though. The bugs in question existed for many years and nobody noticed until recently. They're extremely rare.
-----Original Message-----
From: toasters-bounces@teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of Martin
Sent: Friday, March 28, 2014 2:06 PM
To: toasters@teaparty.net
Subject: RE: NFS fails?
Interesting thread. I've got a similar situation with a 3140 running 7.3.6P2, connected to an Oracle host over 1GbE using NFS, which is showing spikes in latency on the host. The Oracle host is showing dropped packets on its storage interface and I am seeing lots of messages logged like:
Mon Mar 21 12:06:33 GMT [Filer1: nfsd.tcp.close.idle.notify:warning]:
Shutting down idle connection to client (x.x.x.x) where transmit side flow control has been enabled. There are 131 outstanding replies queued on the transmit buffer. This socket is being closed from the deferred queue.
My thought was that the Oracle host's interface is saturated and it's not responding to the NFS acknowledgements in time, and so the NetApp is dropping the NFS requests.
The 1GbE interface is being upgraded on the Oracle host, but one of my concerns is hitting bugs that have been fixed in later 8.1.x releases once we remove the bottleneck on the Oracle host, particularly the DNFS and load-related bugs. I then read your comment:
"The only time I’ve seen this issue occur, other than an actual total failure of network connectivity, is with some Oracle DNFS bugs."
Is it possible to confirm whether this is simply the Filer flushing unacknowledged NFS requests or if this is actually the DNFS bug?
--
View this message in context: http://network-appliance-toasters.10978.n7.nabble.com/NFS-fails-tp25611p2561...
Sent from the Network Appliance - Toasters mailing list archive at Nabble.com.
_______________________________________________
Toasters mailing list
Toasters@teaparty.net
http://www.teaparty.net/mailman/listinfo/toasters
