"Philbert" == Philbert Rupkins philbertrupkins@gmail.com writes:
Philbert> Thanks for the info. I'm familiar with the vetoed giveback
Philbert> due to CIFS - we hit that during unplanned failover events.
Philbert> Good to know I can expect that during upgrades as well.
I did an upgrade (see my questions from Jan/Feb time) of 8.3 to 9.3 going through 9.1 and it was smooth sailing from the CLI. Super nice and easy. I really liked how well the upgrade process works now as compared to the old 8.1 -> 8.3 cDOT upgrade I did, as well as other 7-mode upgrades in the past.
I'm a CLI guy (heh, nearly wrote gui there) so I just do it from a screen session inside xterm and keep a lot of history. We did a big ESX hardware upgrade at the same time, so all my main production loads were shut down, but honestly, ONTAP is so rock solid for regular NFS and even CIFS loads that I'd be ballsy and just go for it.
It all depends on your management's comfort level. Can you show them that failovers are pretty much transparent today to give them confidence?
Which brings me to my big rant, which is failure testing. Too many sites/people are scared to do testing, or make any changes. If you have a robust system, which you expect to be HA, then you need to *test* it to be sure, and to make sure you know the right procedures in case of problems.
Otherwise, you don't know and can't trust your setup. Which is why I really love the Netflix Simian Army stuff. I just wish I could get more of the team I work with to understand this idea. Test for failures under realistic conditions or you won't know.
Philbert> Are you initiating the upgrade from the GUI? Also, when you
Philbert> override the CIFS veto, do you then need to issue a "cluster
Philbert> image resume-update" or resume from the GUI somewhere?
Philbert> On Thu, Jul 9, 2020 at 3:37 PM Scott Eno cse@hey.com wrote:
Really like the automated upgrade myself. So much better than the old 7-mode days.
Only issue I repeatedly hit is on giveback, aggr giveback will get vetoed due to CIFS sessions. Never understood why it's fine to break CIFS sessions on takeover, but everything comes to a halt on giveback.
Have to go to CLI and force aggr giveback with override-veto switch.
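For anyone who hasn't hit this, a rough sketch of what that looks like from the clustershell. The prompt (cluster1) and node name (node-01) are placeholders, not taken from a real system, and whether the automated update needs an explicit resume afterwards depends on the state it reports, so check the progress output rather than assuming:

```
cluster1::> storage failover show-giveback
cluster1::> storage failover giveback -ofnode node-01 -override-vetoes true
cluster1::> cluster image show-update-progress
cluster1::> cluster image resume-update
```

The first command shows which aggregates are still waiting on giveback, and the second forces it despite the CIFS veto (open CIFS sessions on that node will be disrupted). The last one should only be needed if show-update-progress reports the update as paused.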
Philbert Rupkins philbertrupkins@gmail.com wrote:
Toasters,
What's your preference for non-disruptively upgrading a switch-based ONTAP 9 cluster - automated NDU or manual (rolling) NDU?
Happy to hear of both positive and negative experiences, if any.
The cluster in question consists of 3 HA pairs so the automated upgrade will default to rolling. The general recommendation is to use the automated procedure but there are concerns about lack of control, especially in the event of issues. Each HA pair in the cluster hosts critical prod workloads.
No access to a test cluster so there isn't much opportunity to build confidence in the automated procedure ahead of time. I am aware of the ability to pause the automated upgrade.
Leaning toward manual at the moment due to lack of exposure to the automated process.
Cheers,
Phil
_______________________________________________
Toasters mailing list
Toasters@teaparty.net
https://www.teaparty.net/mailman/listinfo/toasters