Re: clarification about reboots during NDU


Fletcher Cocquyt

2 Feb 2012

7:24 p.m.

Justin, thanks for the feedback

What makes step 13 (cf giveback) triggering another reboot of the SAME node A confusing is that step 12 reads as if that has already been accomplished for node A

so does node A reboot twice, then when you repeat the procedure, node B reboots twice?

thanks

-- Fletcher

On Feb 2, 2012, at 11:20 AM, Parisi, Justin wrote:

> In a takeover/giveback scenario, there are two reboots.
>
> The first reboot is the takeover. The second reboot is the giveback.
>
> From: toasters-bounces@teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of Fletcher Cocquyt
> Sent: Thursday, February 02, 2012 1:52 PM
> To: toasters@teaparty.net
> Subject: clarification about reboots during NDU
>
> Hi - can someone clarify this from pg 56 of the upgrade guide -
>
> for instance, if step 12 causes A to "reboot the system using the new firmware and software"
>
> WHY in step 13 does the cf giveback ALSO "cause system A to reboot with the new system configuration?"
>
> Are there really 2 reboots ?
>
> thanks
>
> 12. Enter the following command to reboot the system using the new firmware and software:
>
>     bye
>
> 13. Choose the option that describes your configuration.
>
> If FCP or iSCSI...        | Then when the "Waiting for giveback" message appears on the console of system A...
> Is not in use in system A | Enter the following command at the console of system B: cf giveback
> Is in use in system A     | Wait for at least eight minutes to allow host multipathing software to stabilize, then enter the following command at the console of system B: cf giveback
>
> Attention: The cf giveback command can fail because of open client sessions (such as CIFS sessions), long-running operations, or operations that cannot be restarted (such as tape backup or SyncMirror resynchronization). If the cf giveback command fails, terminate any CIFS session or long-running operations gracefully (because the -f option will immediately terminate any CIFS sessions or long-running operations) and then enter the following command (with the -f option):
>
>     cf giveback -f
>
> For more information about the behavior of the -f option, see the cf(1) man page.
>
> The command causes system A to reboot with the new system configuration (a Data ONTAP version and any new system firmware and hardware changes) and resume normal operation as a high-availability partner.
>
> --
> Fletcher Cocquyt
> Principal Engineer
> Information Resources and Technology (IRT)
> Stanford University School of Medicine
>
> Email: fcocquyt@stanford.edu
> Phone: (650) 724-7485
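For readers skimming the thread, the step 13 table quoted above can be read as a small decision rule. The following is only a sketch of that rule in Python; the names plan_giveback, GivebackPlan and MULTIPATH_SETTLE_SECONDS are made up for illustration and are not part of Data ONTAP or any NetApp tool. The only command it mentions is the cf giveback from the quoted guide.

```python
from typing import NamedTuple

# "At least eight minutes" from the quoted step 13, expressed in seconds.
MULTIPATH_SETTLE_SECONDS = 8 * 60


class GivebackPlan(NamedTuple):
    wait_seconds: int   # minimum wait after "Waiting for giveback" appears on A
    console: str        # where the command is typed
    command: str        # the command from the quoted guide


def plan_giveback(fcp_or_iscsi_in_use: bool) -> GivebackPlan:
    """Return the step 13 action for system A (hypothetical helper)."""
    # With FCP or iSCSI in use, the guide asks for a pause so that host
    # multipathing software can stabilize; otherwise no wait is listed.
    wait = MULTIPATH_SETTLE_SECONDS if fcp_or_iscsi_in_use else 0
    return GivebackPlan(wait, "system B", "cf giveback")


plan = plan_giveback(fcp_or_iscsi_in_use=True)
print(f"wait {plan.wait_seconds}s, then on {plan.console}: {plan.command}")
```

In both rows of the table the command and the console are the same; only the waiting time differs.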


Borzenkov, Andrey

3 Feb 2012

2:45 a.m.

"cf giveback" by itself does not cause an extra reboot. The statement in the manual is confusing.

There could indeed be multiple reboots of A after "cf takeover" to update various firmware. This is entirely transparent, as the HA pair remains in the "A taken over by B" state until A boots far enough to declare itself "ready for giveback".
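The point above can be made concrete with a toy state model. This is a sketch for illustration only, not NetApp code: the class HaPairModel and its fields are invented here, and the only things taken from the thread are the "A taken over by B" state, the possibility of several reboots of A inside that window, and the claim that giveback is not itself counted as a reboot.

```python
from dataclasses import dataclass


@dataclass
class HaPairModel:
    """Toy model of the takeover window (hypothetical, for illustration)."""
    state: str = "normal"
    reboots_of_a: int = 0
    a_waiting_for_giveback: bool = False

    def takeover(self) -> None:
        # B takes over; from the clients' side the pair now stays in this state.
        self.state = "A taken over by B"
        self.a_waiting_for_giveback = False

    def reboot_a(self, more_firmware_pending: bool) -> None:
        # A may reboot more than once (for example to apply firmware).
        # The pair state does not change while this happens.
        if self.state != "A taken over by B":
            raise RuntimeError("this model only reboots A during takeover")
        self.reboots_of_a += 1
        self.a_waiting_for_giveback = not more_firmware_pending

    def giveback(self) -> None:
        # Giveback is only accepted once A has booted far enough.
        if not self.a_waiting_for_giveback:
            raise RuntimeError("A has not reported that it is waiting for giveback")
        self.state = "normal"
        self.a_waiting_for_giveback = False
        # Note: reboots_of_a is NOT incremented here.


pair = HaPairModel()
pair.takeover()
pair.reboot_a(more_firmware_pending=True)    # first boot applies firmware
pair.reboot_a(more_firmware_pending=False)   # second boot reaches "waiting for giveback"
pair.giveback()
print(pair.state, pair.reboots_of_a)  # prints: normal 2
```

In this model the count of reboots of A depends only on what happens inside the takeover window, which is one way to read the difference between the manual's wording and the behavior described above.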


Fletcher Cocquyt

3:11 a.m.


Andrey - yes "confusing/ambiguous/hard to decipher" are all adjectives I'd use. I find the Upgrade Advisor is much clearer, but it lacks the details for NDU - it merely refers to "if you are performing NDU…"

As it stands, the admin is left to piece together the upgrade steps from several different sources.

I just finished the disk firmware upgrades on the first node, as recommended by Upgrade Advisor, but I made sure to call NetApp support specifically about the X410_HVIPC288A15 NA01 firmware and re-confirm it would be non-disruptive.

Just copied X410_HVIPC288A15 NA01 to the second node's /etc/disk_fw, but it's not initiating any firmware updates (the first node started logging firmware update messages immediately).

How do I list the firmware of the X410_HVIPC288A15 disks?

Apparently disk show -T is no longer a valid option in 7.3.5.1?

thanks

-- Fletcher

Borzenkov, Andrey

3:30 a.m.

Use "sysconfig -v" or "storage show disk -x" to check current disk firmware.
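Once the listing from one of those commands has been saved to a text file, counting disks per model and firmware revision is easy to script. The sketch below is an assumption-heavy illustration: it assumes a column layout in which the last two whitespace-separated fields on each disk line are the model and the firmware revision, and the SAMPLE text (including its serial numbers) is invented, not real filer output. Check the layout against your own output before relying on it.

```python
from collections import Counter

# Invented sample in an ASSUMED column layout; not captured from a real system.
SAMPLE = """\
DISK    SHELF BAY SERIAL    VENDOR  MODEL             REV
------- ----- --- --------- ------- ----------------- ----
0a.16   1     0   SERIAL01  NETAPP  X410_HVIPC288A15  NA01
0a.17   1     1   SERIAL02  NETAPP  X410_HVIPC288A15  NA01
0a.18   1     2   SERIAL03  NETAPP  X410_HVIPC288A15  NA00
"""


def tally_firmware(listing: str) -> Counter:
    """Count disks per (model, revision) pair in a saved listing."""
    counts: Counter = Counter()
    for line in listing.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and the dashed separator row.
        if len(fields) < 2 or fields[0] == "DISK":
            continue
        if set(line.strip()) <= {"-", " "}:
            continue
        # Assumption: model and revision are the last two columns.
        model, revision = fields[-2], fields[-1]
        counts[(model, revision)] += 1
    return counts


for (model, revision), count in sorted(tally_firmware(SAMPLE).items()):
    print(model, revision, count)
```

A tally like this makes it simple to compare the number of disks still on an old revision with the number a planning tool says should be updated.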


Jack Lyons

4:48 a.m.

If disks are available to both heads - the second head doesn't need to update any firmware.

_______________________________________________ Toasters mailing list Toasters@teaparty.net http://www.teaparty.net/mailman/listinfo/toasters

Fletcher Cocquyt

7:03 a.m.

New subject: Disk firmware and Disk Qualification

Jack - thanks for the feedback. NetApp support said it was more about ownership vs multipathing, and said to copy the firmware to both nodes' /etc/disk_fw. Also, the Upgrade Advisor said 95 disks would be updated by this firmware, but syslog only recorded 48.

storage show disk -x does not show any remaining old X410_HVIPC288A15 NA00 firmware.

But NetApp support also recommended I do the disk qualification (http://now.netapp.com/NOW/download/tools/diskqual/) since not all disks may be "reported correctly". I found my /etc/qual_devices_v3 file to be from # Datecode: 20080725, but BALKED at updating it on a production system due to the warning in the docs:

Caution: Do not modify the contents of any of the Disk Qualification Package because you can bring your storage system to a halt.

HALT!? All these docs make the updates feel risky and not worth it. The doc actually says to mount /etc and untar the archive directly to the running system! Given the dire HALT warning, any *nix admin would tell you this is bad practice. Better to untar to local disk, verify the checksum, and cp into place…
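The safer sequence suggested above (untar to local disk, verify the checksum, then copy into place) could look roughly like the following on the admin host. This is a sketch under stated assumptions: stage_and_install and sha256_of are hypothetical helper names, the expected SHA-256 value is assumed to come from wherever the package was obtained, and target is assumed to be the filer's /etc as mounted on the admin host. It is not a NetApp-documented procedure.

```python
import hashlib
import shutil
import tarfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def stage_and_install(archive: Path, expected_sha256: str,
                      staging: Path, target: Path) -> list[str]:
    """Verify the archive, unpack it locally, then copy files to target."""
    # 1. Refuse to touch anything if the download does not match.
    if sha256_of(archive) != expected_sha256.lower():
        raise ValueError("archive checksum mismatch; nothing was copied")

    # 2. Unpack regular files into a local staging directory only.
    staging.mkdir(parents=True, exist_ok=True)
    staging_root = staging.resolve()
    staged = []
    with tarfile.open(archive) as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            dest = (staging_root / member.name).resolve()
            if not dest.is_relative_to(staging_root):
                raise ValueError(f"unsafe path in archive: {member.name}")
            dest.parent.mkdir(parents=True, exist_ok=True)
            with tar.extractfile(member) as source, dest.open("wb") as out:
                shutil.copyfileobj(source, out)
            staged.append(dest.relative_to(staging_root))

    # 3. Only after everything unpacked cleanly, copy into place.
    for relative in staged:
        final = target / relative
        final.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(staging_root / relative, final)
    return [str(relative) for relative in staged]
```

The design choice is simply that a truncated or corrupt download fails at step 1 or 2, before any file on the running system is replaced.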

Will see what the updated auto support says about the firmware now via Upgrade Advisor…

-- Fletcher

tmac

10:04 a.m.

That is meant to scare those individuals who have placed non-NetApp disks into their systems.

The OS looks at its built-in qualification table plus that file. If it finds disks that it does not like, well, then the system may just halt.

If all your disks are 100% NetApp-supported disks, you will have no issues.

--tmac Tim McCarthy Principal Consultant

RedHat Certified Engineer 804006984323821 (RHEL4) 805007643429572 (RHEL5)


Steve

3:13 a.m.

From a CIFS perspective, a giveback is a reboot.

-- We are the 99%

Borzenkov, Andrey

3:26 a.m.


Yes, you are absolutely right, but to quote OP: "confusing/ambiguous/hard to decipher" are all adjectives I'd use.

"Reboot" is far too ambiguous in this context.

