If you are running a virtual environment such as VMware, make sure you are using the advanced timeout settings specified in the VMware on NetApp best practice document (or whatever is applicable to your environment).
Also, if you’re taking a hit of more than 30-45 seconds to NFS on giveback/failover, I would wonder what the load on the storage system is during the failover. I always stop SIS, SnapMirror, disk scrubs, and anything else I can to get the load down as low as possible. Preferably, each head is running at 30-40% or less before the failover to minimize impact. If I have the luxury, I usually watch the performance manager I/O trends and time the takeover/giveback in a trough after coming down from a peak.
With CIFS this all goes out the window: as you all know, CIFS is terminated prior to giveback and then restarted.
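The pre-failover load check described above (each head at 30-40% busy or less, timed in a trough) can be sketched roughly as follows. This is only an illustration: the function name `quietest_window`, the 40% default threshold, and the sample figures are assumptions made up for the example, and the utilization numbers would have to come from whatever monitoring you already use.

```python
def quietest_window(samples, threshold=40.0):
    """Pick the time slot where the busier head is least loaded.

    samples: list of (label, head_a_busy_pct, head_b_busy_pct).
    Returns the label of the best slot, or None if no slot has
    both heads at or below `threshold` percent busy.
    """
    # Keep only slots where BOTH heads are at or below the threshold.
    candidates = [s for s in samples if max(s[1], s[2]) <= threshold]
    if not candidates:
        return None
    # Among those, choose the slot whose busier head is the quietest.
    return min(candidates, key=lambda s: max(s[1], s[2]))[0]


# Hypothetical utilization figures, for illustration only.
samples = [
    ("09:00", 72.0, 65.0),
    ("12:00", 55.0, 38.0),
    ("03:00", 22.0, 31.0),
    ("04:00", 35.0, 18.0),
]
print(quietest_window(samples))
```

A `None` result means no sampled slot meets the threshold, in which case stopping background jobs first is the obvious next step.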
--JMS
From: toasters-bounces@teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of Jeff Cleverley
Sent: Wednesday, May 29, 2013 12:14 PM
To: Sebastian Goetze
Cc: Toasters@teaparty.net
Subject: Re: Netapp ONTAP 8.0.2 - NFS takeover/giveback - should it be visible to clients?
Sebastian,
My definition of non-disruptive is "nobody notices". When I start getting visits from engineers or support calls saying the network is down or behaving badly, it qualifies as disruptive to our lab :-) That's why I get to do all my cluster work or snapmirror migrations in the early morning hours :-)
Jeff
On Wed, May 29, 2013 at 1:53 AM, Sebastian Goetze <spgoetze@gmail.com> wrote:
Hi Jeff,
the non-disruptiveness also depends upon what NFS protocol version you're using:
Version 3 doesn't have sessions, so there's no session information to be handed over to the partner...
Versions 4/4.1 have sessions, but are supposed to handle reboots/takeovers more gracefully, also regarding locks.
Generally speaking, "non-disruptive" means (not only in NetApp speak) that the disruption is shorter than the protocol timeout values (and there's no need to restart a session). It does not mean that there isn't a short period of non-responsiveness. Even SAN protocols 'suffer' the same, e.g. when there's a path state change and ALUA/MPIO kicks in...
my 2c
Sebastian
On 29.05.2013 09:24, Jeff Cleverley wrote:
Rafal,
NetApp's definition of "non-disruptive" is generally not what most people consider non-disruptive :-) It does cause NFS outages during takeover and giveback. How long varies with hardware, system load, etc. The time reported by the giveback is generally less than the wall clock time users see. The system may say the downtime was 45 seconds, but a time command from a client will usually be closer to 100 seconds before the filer starts responding again. That is why I generally do snapmirror migrations and cluster failovers in the middle of the night to minimize user complaints.
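One way to get the client-side wall clock figure mentioned above is to time a small synchronous write on the NFS mount in a loop and keep the longest stall. The following is a minimal sketch, not a tested tool: the names `probe_once` and `longest_stall` and the path `/mnt/filer/probe.tmp` are hypothetical, and it assumes a hard mount, where the write simply blocks until the filer answers again.

```python
import os
import time


def probe_once(probe_file):
    """Time one write plus fsync to probe_file, in seconds.

    The fsync forces the data out to the server, so the call blocks
    for as long as the filer is not answering (on a hard mount).
    """
    start = time.monotonic()
    with open(probe_file, "w") as fh:
        fh.write(str(start))
        fh.flush()
        os.fsync(fh.fileno())
    return time.monotonic() - start


def longest_stall(probe_file, probes, interval=1.0):
    """Run `probes` timed writes and return the slowest one in seconds."""
    worst = 0.0
    for _ in range(probes):
        worst = max(worst, probe_once(probe_file))
        time.sleep(interval)
    return worst


if __name__ == "__main__":
    # Hypothetical path on the NFS mount; start this before the takeover.
    print(longest_stall("/mnt/filer/probe.tmp", probes=600))
```

A write with fsync is used rather than a plain stat because attribute caching on the client could answer a stat without ever contacting the filer.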
Jeff
On Wed, May 29, 2013 at 1:07 AM, Rafał Radecki <radecki.rafal@gmail.com> wrote:
Hi All.
I have two FAS3210 in 7-mode Active/Active. I perform takeover/giveback and see that the outage is visible to clients on NFS mounts. In your experience, should that be the case? I thought that information about NFS sessions and the IP address is moved to the takeover node, so in theory it should not be visible.
Best regards,
Rafal.
_______________________________________________
Toasters mailing list
Toasters@teaparty.net
http://www.teaparty.net/mailman/listinfo/toasters
--
Jeff Cleverley
Unix Systems Administrator
4380 Ziegler Road
Fort Collins, Colorado 80525
970-288-4611