On the hardware side, I can confirm what the others have said: there's no problem working on the "failed" head while the other one serves data. We've done it many times on our old hardware and plan to do it on our new F940 cluster when we need to.

We've also done rolling upgrades of DOT, as mentioned below, on our old F760 cluster. The sequence is: upgrade one head, fail over, reboot that head, give back, upgrade the other head, fail over, reboot the second head, and give back. The first head will complain that it's not at the same DOT level as its partner, but we've never had a problem with data integrity. The mismatch also lasts only for the short period between the first giveback and the second failover. I don't know whether this is supported, so as always, test it first if you can.
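
As a rough sketch of that sequence, here is a small Python model (not filer commands) that walks through the eight steps and flags when both heads are up on different DOT levels. The head names, the version strings, and the idea that a head only runs the new version after its reboot are assumptions made for illustration.

```python
# Hypothetical model of the rolling upgrade order described above.
# Head names and version strings are made up; nothing here talks to a filer.

def rolling_upgrade(old, new, heads=("head1", "head2")):
    """Return a list of (step, mismatch) pairs for a two-head rolling upgrade.

    mismatch is True when both heads are serving data but are running
    different versions.
    """
    running = {h: old for h in heads}    # version each head is currently running
    installed = dict(running)            # version on disk, picked up at next reboot
    serving = {h: True for h in heads}   # False while the partner has taken over
    log = []

    def record(step):
        both_up = all(serving.values())
        mismatch = both_up and len(set(running.values())) > 1
        log.append((step, mismatch))

    for head in heads:
        installed[head] = new            # upgrade: new software installed, not yet running
        record(f"upgrade {head}")
        serving[head] = False            # partner takes over this head's data
        record(f"fail over {head} to partner")
        running[head] = installed[head]  # reboot activates the installed version
        record(f"reboot {head}")
        serving[head] = True             # head resumes serving its own data
        record(f"giveback to {head}")
    return log


for step, mismatch in rolling_upgrade("6.4.1", "6.4.2"):
    print(f"{step:32} version mismatch: {mismatch}")
```

In this model the mismatch shows up only from the first giveback until the second failover, which is the window described above.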

One note: we've never done major version upgrades (e.g. 5.x to 6.x) this way. IIRC we've only done minor version upgrades this way (e.g. 6.4.1 to 6.4.2, or 6.4.1P2 to 6.4.2).
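
If you want to script a sanity check for that rule, a minimal sketch could look like the following. It assumes "major version" means the number before the first dot, as in the 5.x and 6.x examples; the function names are hypothetical.

```python
# Hypothetical helper: treat the number before the first dot as the major release.

def major_release(version):
    """Return the leading release number of a string like '6.4.1P2' or '5.x'."""
    return int(version.split(".", 1)[0])


def is_minor_upgrade(old, new):
    """True when both versions share the same major release."""
    return major_release(old) == major_release(new)


print(is_minor_upgrade("6.4.1P2", "6.4.2"))
print(is_minor_upgrade("5.x", "6.x"))
```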

Jeff

-------------------------------------------------------------------------
"Allow me to extol the virtues of the Net Fairy, and of all the fantastic
dorks that make the nice packets go from here to there. Amen."
TB - Penny Arcade
-------------------------------------------------------------------------



Scott Miller <skottie@anim.dreamworks.com>
Sent by: owner-toasters@mathworks.com

02/24/2004 03:08 PM

To
Kerry Herschel <margaret.k.herschel@jpl.nasa.gov>
cc
Geoff Hardin <geoff.hardin@dalsemi.com>, toasters <toasters@mathworks.com>
Subject
Re: Cluster failover question





Kerry Herschel wrote:

> Can you upgrade one side and bring it up running a different DOT version
> from its partner? And then have it take over the partner, then down the
> partner to upgrade it?

I always upgrade DOT to the same version on both heads of
a cluster at the same time, and reboot them both at the same time.
Depending on your config, a DOT upgrade causes a 90 to 180 second
outage for the reboot.

 -skottie