From what I read about the Cluster, all connections survive. NFS, over both UDP and TCP, is stateless and can therefore easily re-establish itself. And the Cluster software syncs CIFS state information between the two filers, so after a failover the remaining filer just picks up where the first left off. If this weren't the case, why pay for the automatic failover...?
jason
In message 199809182111.OAA03435@tooting.netapp.com, Guy Harris writes:
It looks like a failover will be about as disruptive as a reboot, i.e., NFS mounts via UDP will survive,
NFS mounts via TCP, if you have NFS-over-TCP enabled, should also survive, assuming your NFS-over-TCP client isn't broken. NFS-over-TCP clients shouldn't assume that TCP connections persist forever; they should be able to re-establish them as necessary (especially given that I know of three NFS servers, off the top of my head, that will close idle NFS-over-TCP connections: NetApp, Digital UNIX, and Solaris 2.x).
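As a rough illustration of the "re-establish them as necessary" point, here is a minimal Python sketch of a request/reply client that reopens its TCP connection whenever the server has closed it. This is not NFS or any real client's code: ReconnectingClient, OneShotHandler, demo() and the newline-terminated echo protocol are all made up for the example, and blindly resending is only safe for requests that can be repeated without harm.

```python
import socket
import socketserver
import threading


class ReconnectingClient:
    """Hypothetical client that reopens its TCP connection on demand."""

    def __init__(self, host, port, max_attempts=3):
        self.host = host
        self.port = port
        self.max_attempts = max_attempts
        self.sock = None
        self.connects = 0  # number of times a connection was (re)opened

    def _connect(self):
        self.sock = socket.create_connection((self.host, self.port), timeout=5)
        self.connects += 1

    def _drop(self):
        if self.sock is not None:
            self.sock.close()
            self.sock = None

    def _read_line(self):
        buf = b""
        while not buf.endswith(b"\n"):
            chunk = self.sock.recv(4096)
            if not chunk:  # orderly close by the server
                raise ConnectionError("server closed the connection")
            buf += chunk
        return buf[:-1]

    def call(self, request):
        """Send one newline-terminated request; reconnect and resend on failure."""
        last_error = None
        for _ in range(self.max_attempts):
            try:
                if self.sock is None:
                    self._connect()
                self.sock.sendall(request + b"\n")
                return self._read_line()
            except OSError as exc:  # includes ConnectionError and timeouts
                last_error = exc
                self._drop()
        raise ConnectionError("gave up after retries") from last_error


class OneShotHandler(socketserver.StreamRequestHandler):
    """Toy server side: answers one request, then closes the connection."""

    def handle(self):
        line = self.rfile.readline()
        self.wfile.write(b"echo:" + line)


def demo():
    server = socketserver.TCPServer(("127.0.0.1", 0), OneShotHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    client = ReconnectingClient(host, port)
    replies = [client.call(b"ping%d" % i) for i in range(3)]
    client._drop()
    server.shutdown()
    server.server_close()
    return replies, client.connects
```

The toy server drops the connection after every reply, which stands in for an idle close or a reboot; the caller of call() never sees the drop, only the extra reconnects.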
but all CIFS connections will be lost (on the failed filer).
In *some* situations, this doesn't cause a problem - at least some of Microsoft's CIFS clients will re-establish connections - but if there are any open files on the connection, there's too much state on the server for Microsoft's clients to recover (I don't know if they have plans to change that - it can be tricky, given that the client might've had the file open for exclusive use and that you don't want somebody else to be able to sneak in and open the file before the client in question has a chance to reopen it).
I.e., CIFS sessions *might* survive a takeover or giveback, but, if one does, consider yourself lucky - don't depend on it.
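To make the exclusive-open problem concrete, here is a toy Python model of it. It is not the CIFS protocol and says nothing about how the filer is implemented: ToyFileServer, its methods, and the client names are invented for the example, and it simply assumes that the open-file table lives only in the failed server's memory and is gone after a takeover.

```python
class ToyFileServer:
    """Toy model: tracks which client holds each file open exclusively."""

    def __init__(self):
        self.exclusive_owner = {}  # path -> name of the client holding it

    def open_exclusive(self, path, client):
        """Grant the open unless a different client already holds the file."""
        owner = self.exclusive_owner.get(path)
        if owner is not None and owner != client:
            return False
        self.exclusive_owner[path] = client
        return True

    def failover(self):
        # Assumption for this sketch: open-file state is not carried over.
        self.exclusive_owner.clear()


def demo():
    server = ToyFileServer()
    results = []
    results.append(server.open_exclusive("report.doc", "client_a"))  # A gets it
    results.append(server.open_exclusive("report.doc", "client_b"))  # B is refused
    server.failover()                                                # state is lost
    results.append(server.open_exclusive("report.doc", "client_b"))  # B sneaks in
    results.append(server.open_exclusive("report.doc", "client_a"))  # A cannot reopen
    return results
```

In this model the only thing protecting client_a's exclusive open is the table that the failover wipes, so whichever client asks first afterwards wins.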
Presumably the healthy filer will not have its service interrupted.
It shouldn't.
Once the failed hardware is repaired, I'm pretty sure that going back to normal operation requires rebooting the healthy filer,
Nope.
because it has to "offline" the volumes that it took over and that requires a reboot.
No, it just has to give them back to the filer from which it took them; that, unlike taking the volumes offline, doesn't require a reboot. (I don't *think* the first CF release will lift the restriction that you can't take a volume offline without doing a reboot, but it might; we don't view that as a permanent feature, we want to lift that restriction eventually.)
I don't know if hot spares can be left unassigned, so it may be necessary for each filer to have its own hot spare(s) assigned to it.
Hot spares are, as I remember, owned by particular members of the CF pair.
(No, I don't know whether the people here who dubbed it "CF" realized that "CF" could stand for something else. :-))