Hi folks,
After a takeover, we find that some (though not all) LUNs allocated to Microsoft Clustering clients show up in the Microsoft GUI as 'offline' or 'failed'. Rebooting the clients doesn't help. When we disable the Cluster disk drivers (making the client a simple iSCSI client), we can connect, read and write, etc. On the filer side, we see no sign of a reservation. We've seen this behavior on three occasions over the last year, i.e. every time we've experienced a takeover, whether administratively induced or accidental.
The clients run SnapDrive & SnapManager for SQL Server. To resolve this, we copy our data off (while connected via the vanilla iSCSI client), destroy the LUNs, rebuild them, and copy the data back. Alternatively, we can wait a few days (not sure how many), whereupon the issue resolves itself (we've seen this once, in a test environment).
Anyone else seen this? Insights? What might be happening?
NetApp v3140 running 7.3.5.1P5
Spinning rust provided by 3Par T800
Clients running Win2008R2
--sk
Stuart Kendrick FHCRC