Hi folks,
After a takeover, we find that some (though not all) LUNs allocated to Microsoft Clustering clients show up in the Microsoft GUI as 'offline' or 'failed'. Rebooting the clients doesn't help. When we disable the Cluster disk drivers (make the client a simple iSCSI client), we can connect, read & write, etc. On the filer side, no sign of a reservation. We've replicated this behavior on three occasions over the last year (every time we experience a takeover, either administratively induced or accident induced).
The clients run SnapDrive & SnapManager for SQL Server. To resolve this, we copy our data off (while connected via the vanilla iSCSI client), destroy the LUNs, rebuild, copy data back. Alternatively, we can wait a few days (not sure how many), whereupon the issue resolves itself (we've seen this once ... test environment).
Anyone else seen this? Insights? What might be happening?
NetApp v3140 running 7.3.5.1P5
Spinning rust provided by 3Par T800
Clients running Win2008R2
--sk
Stuart Kendrick
FHCRC
My first guess would have been Fibre Channel zoning, but some of the disks are showing up, and it sounds like you're using iSCSI. If you are using Fibre Channel, make sure the zones have both filers in them.
A tool you can use is called SysCompare: http://download.cnet.com/SysCompare/3000-2094_4-10911660.html. It compares everything and lets you know what may be different, such as missing drivers, driver version differences, and mismatched services. It could also be an issue with the versions of SnapDrive and the iSCSI initiator.
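The kind of comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how SysCompare works internally; the function name `diff_inventories` and the driver names and versions are made up, and you would have to collect the real inventories from each cluster node yourself.

```python
def diff_inventories(node_a, node_b):
    """Compare two {driver_name: version} mappings taken from two nodes.

    Returns (only on A, only on B, {name: (version on A, version on B)}).
    """
    only_a = sorted(set(node_a) - set(node_b))
    only_b = sorted(set(node_b) - set(node_a))
    mismatched = {
        name: (node_a[name], node_b[name])
        for name in sorted(set(node_a) & set(node_b))
        if node_a[name] != node_b[name]
    }
    return only_a, only_b, mismatched


# Hypothetical inventories: names and versions are invented for illustration.
node1 = {"msiscsi": "6.1.7601", "clusdisk": "6.1.7601", "mpio": "6.1.7601"}
node2 = {"msiscsi": "6.1.7600", "clusdisk": "6.1.7601"}

only_1, only_2, mismatched = diff_inventories(node1, node2)
print("only on node1:", only_1)
print("only on node2:", only_2)
print("version mismatch:", mismatched)
```

Anything that shows up as missing on one node, or with a different version, is a candidate for why one client sees the LUN and another does not.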
I know the V-Series had specific configurations and constraints. Maybe it's worth opening a case, especially if you can reproduce it.
Date: Tue, 24 Jan 2012 05:05:07 -0800
From: skendric@fhcrc.org
To: toasters@teaparty.net
Subject: MS Cluster clients locked out of LUNs
_______________________________________________
Toasters mailing list
Toasters@teaparty.net
http://www.teaparty.net/mailman/listinfo/toasters