I can tell you that it does work, and that we've never had a problem doing it.  However, as I said in my last post, we've only done minor upgrades this way, never major ones.  For major upgrades we've always scheduled downtime and done them properly.

We were pretty curious as to what would happen the first time we did it, but we tested it first and didn't have any issues.  YMMV; I would always test on development systems first.
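
As a rough illustration of the sequence described in the quoted post below, here is a small Python sketch that only models the state of the two heads at each step. It does not talk to a filer or run any ONTAP command. The names (Head, rolling_upgrade, head1, head2) are hypothetical, and the version strings are just the example from the post. It flags the steps at which both heads are up but running different DOT levels, which is the window between the first giveback and the second failover.

```python
from dataclasses import dataclass


@dataclass
class Head:
    """Hypothetical model of one cluster head (not a NetApp API)."""
    name: str
    running: str     # version currently booted
    installed: str   # version on disk; takes effect at the next reboot
    up: bool = True  # False while the partner has taken over


def rolling_upgrade(old, new):
    """Return a list of (step, head1_version, head2_version, mismatch).

    A version of None means that head is down (taken over by its partner).
    mismatch is True when both heads are up but run different versions.
    """
    a = Head("head1", old, old)
    b = Head("head2", old, old)
    log = []

    def record(step):
        mismatch = a.up and b.up and a.running != b.running
        log.append((step,
                    a.running if a.up else None,
                    b.running if b.up else None,
                    mismatch))

    # Same four steps for each head in turn: upgrade, fail over, reboot, giveback.
    for target, partner in ((a, b), (b, a)):
        target.installed = new
        record(f"upgrade {target.name}")
        target.up = False
        record(f"fail over: {partner.name} takes over {target.name}")
        target.running = target.installed
        record(f"reboot {target.name}")
        target.up = True
        record(f"giveback to {target.name}")
    return log


for step, v1, v2, mismatch in rolling_upgrade("6.4.1", "6.4.2"):
    note = "  <-- heads at different levels" if mismatch else ""
    print(f"{step:40} head1={v1} head2={v2}{note}")
```

Walking through a table like this on a development pair first is a cheap way to agree on the order of operations before touching production.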

I agree with your last comment.  Most other cluster systems allow this to be done as a standard method of upgrade.  I'm not sure why NetApp doesn't make this a supported feature.  Most of our downtime is for ONTAP upgrades (major revisions) and this would virtually eliminate those outages.

Jeff Mery, MCP
National Instruments

-------------------------------------------------------------------------
"Allow me to extol the virtues of the Net Fairy, and of all the fantastic
dorks that make the nice packets go from here to there. Amen."
TB - Penny Arcade
-------------------------------------------------------------------------



From: Davin Milun <milun@cse.Buffalo.EDU>
Sent by: owner-toasters@mathworks.com
Date: 02/26/2004 03:21 AM
To: toasters <toasters@mathworks.com>
cc:
Subject: Re: Cluster failover question

At 15:20, on Feb 25, 2004, jeff.mery@ni.com wrote:
> We've also done rolling upgrades of DOT as mentioned below on our old F760
> cluster.  Upgrade one head, fail over, reboot that head, giveback, upgrade
> the other head, fail over, reboot the second head, and giveback.  The
> first head will complain that it's not at the same DOT level as its
> partner, but we've never had a problem with data integrity.  This also
> only happens for a short period between the first giveback and the second
> failover.  I don't know if this is supported or not so as always, test if
> you can.
>
> One note, we've never done major version upgrades (e.g. 5.x to 6.x)
> this way.  IIRC we've only done minor version upgrades this way (e.g.
> 6.4.1 to 6.4.2 or 6.4.1P2 to 6.4.2).
>
> Jeff

Does this work? Is it safe?

We've always been told fairly explicitly by NetApp that this will not
work, and we have never tried it, for fear of ending up in an unstable
or inconsistent state.

I've always wished that I could do this, because with many other
devices it is a standard way of using a clustered pair to avoid
downtime during an OS upgrade.  But NetApp has never supported it.


Davin.
--
Davin Milun    E-mail:  milun@cse.Buffalo.EDU