I assume you’re using VIF on the filer(s)?

Are all the LUNs on a single filer?  Multiple filers?

Are the filers using two different switches for their active interfaces within the iSCSI VIF?

Do you have multiple interfaces defined for iSCSI on the filer?

Are you using iSCSI multipath?  NetApp or MS?  iSCSI version?

A picture is worth 1000 words – there are so many possibilities here…

It could be bad switch trunking (clients connected to one switch, the filer connected to another, and no path between them), a bad filer config (can you ping the filer during failover?), or lots of other things.
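
To put a number on the "can you ping during failover?" question, something like the sketch below could be run from one of the Windows cluster nodes while the cf takeover is in progress. It is only an illustration: the function names are made up for this example, it uses a TCP connect to port 3260 (the standard iSCSI target port) rather than an ICMP ping, and it assumes the filer's iSCSI portal address is reachable from the host running it.

```python
import socket
import time
from typing import List, Tuple

ISCSI_PORT = 3260  # standard iSCSI target port


def probe(host: str, port: int = ISCSI_PORT, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def outage_windows(samples: List[Tuple[float, bool]]) -> List[Tuple[float, float]]:
    """Collapse (timestamp, reachable) samples into (first_failure, recovery) pairs."""
    windows = []
    start = None
    for ts, ok in samples:
        if not ok and start is None:
            start = ts                    # first failed probe of a new outage
        elif ok and start is not None:
            windows.append((start, ts))   # first successful probe ends the outage
            start = None
    if start is not None:
        # Still unreachable at the last sample: close the window there.
        windows.append((start, samples[-1][0]))
    return windows


def monitor(host: str, duration: float = 300.0, interval: float = 1.0) -> List[Tuple[float, float]]:
    """Probe host once per interval for duration seconds and return the outage windows."""
    samples = []
    end = time.time() + duration
    while time.time() < end:
        samples.append((time.time(), probe(host)))
        time.sleep(interval)
    return outage_windows(samples)
```

Calling monitor() with the filer's iSCSI address just before issuing the takeover returns a list of (start, end) timestamps. An outage close to the two minutes reported would point at the network path (ARP, spanning tree, trunking), while a short or absent outage would point back at the iSCSI login or igroup side.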

From: owner-toasters@mathworks.com [mailto:owner-toasters@mathworks.com] On Behalf Of Cristian Rojas
Sent: Friday, November 21, 2008 9:36 PM
To: Jeff Mohler
Cc: Jack Lyons; toasters@mathworks.com
Subject: Re: cf takeover hungs Microsoft Windows Clusters

We have 2 x Cisco 3750 switches, with the ports configured for portfast and 1000 full duplex to avoid spanning tree negotiation.

Chris

From: Jeff Mohler <speedtoys.racing@gmail.com>
To: Cristian Rojas <intipu@yahoo.com>
Cc: Jack Lyons <jack1729@gmail.com>; "toasters@mathworks.com" <toasters@mathworks.com>
Sent: Saturday, November 22, 2008 2:21:08 AM
Subject: Re: cf takeover hungs Microsoft Windows Clusters

Two minutes, eh? Sounds a lot like ARP issues. How is portfast etc. set on the switch ports?


On Nov 21, 2008, at 5:33 PM, Cristian Rojas <intipu@yahoo.com> wrote:

Hi Jack, we are using 2 standalone interfaces for iSCSI, and a team of 2 interfaces for intra and cluster traffic.

thanks

Chris

From: Jack Lyons <jack1729@gmail.com>
To: Cristian Rojas <intipu@yahoo.com>; toasters@mathworks.com
Sent: Saturday, November 22, 2008 1:21:00 AM
Subject: Re: cf takeover hungs Microsoft Windows Clusters

We have had several clusters that have worked fine over takeovers
(just went through 6 panics after upgrading to 7.2.6!), but since you
mention igroups, I assume you are using iSCSI for the LUNs.
We are using FCP.

I assume you are using 3 separate network interfaces for iSCSI traffic,
intra-cluster traffic, and public traffic?

Jack



On 11/21/08, Cristian Rojas <intipu@yahoo.com> wrote:
> Hi guys, I'm seeing this behavior on different MS Windows clusters attached
> to our FAS960. The weirdest part is that I see the active node of the
> Windows cluster connected to the target portal, but from the filer I don't
> see the igroup logged in for almost 2 minutes, and that's when the Windows
> clusters hang because they can't mount the quorum LUN. All of this happens
> while doing a cf takeover.
>
> I talked to probably 10 NetApp support engineers and no one can explain it.
>
> Has anyone seen anything like this before? I have installed Host Utilities 5
> and checked the Windows registry to make sure the values have been tuned.
>
> Thanks
>
> Chris
