Greetings,
I'm thinking about doing something that is not supported and was wondering if anyone had done the same or has more detailed insight.
We have a very busy cluster (6040s, 7.3.5.1P4). It looks like we are largely maxing out the heads on CPU. We are getting a pair of 6080s and really need to try to do the head swap live (takeover/giveback) if at all possible. The unsupported part is that I want to keep the 6040 NVRAM cards and put them in the 6080s as I swap them. The reason is that I would not have to change the system ID ownership on all the drives.
I know changing the system ID is generally not a big deal: you boot each head to maintenance mode and reassign the old SID to the new SID. In our case it worries me. Last week we were going to move a project to the other head by reassigning the appropriate drives for a couple of aggregates. While we were trying to reassign them, the SAS buses started panicking and crashed the controlling filer. The entire cluster was down. The ensuing mess took several hours to clean up.
If it crashed while trying to change ownership of a few drives, I'm afraid of what will happen when it tries to reassign all the old-SID drives for the new NVRAM card. I was hoping that if we could keep the cards, we could swap heads without changing SIDs and minimize our chance of repeating the crash. I could do the disks one at a time, but I have 796 drives on this cluster and would rather not.
Is there a requirement for the hardware to have the bigger memory cards? Since there are more CPUs, I can see how something might need the extra capacity; I just don't know what. We will probably have a downtime window in a couple of months when I can put the correct cards back in.
Thanks,
Jeff
Jeff,
An interesting point of view, and quite pragmatic too. However, I would not do that, and for quite a few reasons.
First, the undocumented nature of the NVRAM content. Are you sure the NVRAM contains just the system ID? I don't think so. I think it is quite possible that the NVRAM holds machine-specific content, which will just dig you deeper into problems if something goes wrong.
Second, I think your panicked filer is not an NVRAM issue; it sounds more like a software issue. Your Data ONTAP version is pretty old, and 7.3.5.* has a large number of WTF bugs; depending on what SAS HBA you are using, there are a few newer drivers built into more recent DOT releases. Having a look at the bug database is always recommended; enjoy the obscurity of some ONTAP bugs. ;)
HTH, Felix -- Felix Schröder Support Engineer
teamix GmbH Suedwestpark 35 90449 Nuernberg
fon: +49 911 30999-68 fax: +49 911 30999-99 mail: fs@teamix.de web: http://www.teamix.de
Amtsgericht Nürnberg, HRB 18320 Geschäftsführer: Oliver Kügow, Richard Müller
----- Original Message ----- From: "Jeff Cleverley" jeff.cleverley@avagotech.com To: toasters@teaparty.net Sent: Thursday, January 31, 2013 02:58:00 Subject: 6080 heads with 6040 NVRAM cards.
Felix,
Thanks for the reply. I figure the cards can't have any host-specific information on them, since that would be lost any time you had to replace one. I could see the larger card having storage blocks set aside for something like the added CPUs, etc.
We have been considering upgrading to a newer OS, but our CPU loading went up 30-50% when we upgraded to this release on one of our filers. NetApp never could figure out what changed. Not knowing what caused it made looking for bugs quite difficult :-) We've been afraid of what might happen going to a new OS.
At some point we may get 6280s or similar, which will force an OS upgrade. Unfortunately the cluster interconnect on the 6280 is SAS and not the InfiniBand connector, so we could not do a hot swap with that model. Otherwise we might have thought about bigger heads this time around.
We most likely won't do this but it is always good to get as much information as possible :-)
Jeff
Hi,
You are forgetting something. The FAS6040 and FAS6080 use different types of NVRAM cards with different amounts of memory (512 MB vs. 2 GB).
Even if it would technically work, which I doubt, it would certainly be a totally unsupported configuration.
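If it helps, both the installed card and the system ID the disks are keyed to are easy to check from the console; a minimal sketch (output omitted, and the exact wording varies by model and release):

    filer> sysconfig        # the first lines include the System ID
    filer> sysconfig -a     # per-slot card inventory, including the NVRAM card and its memory size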
Pascal.
-----Original Message----- From: toasters-bounces@teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of Jeff Cleverley Sent: Thursday, January 31, 2013 2:58 AM To: toasters@teaparty.net Subject: 6080 heads with 6040 NVRAM cards.
In addition to all the other things mentioned:
You DO NOT (and should not) go into maintenance mode while a takeover is taking place! Really bad idea (unlocking mailbox disks and all that)!
INSTEAD: While in Takeover Mode (on the node that has taken over) do a disk _re_assign. This will reassign *all* disks belonging _to the partner_ to a new sysid.
This should work non-disruptively.
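Roughly, the whole swap then looks something like this (the sysids are placeholders you would read from sysconfig beforehand; take it as a sketch of the idea, not a verified runbook for your release):

    node1> cf takeover                       # node1, the remaining 6040, takes over node2
    ... power off node2, swap in the 6080 head, recable ...
    node1(takeover)> disk reassign -s <old_node2_sysid> -d <new_node2_sysid>
                                             # run on the node in takeover: moves every disk owned by
                                             # the old sysid to the new head's sysid in one pass
    node1(takeover)> cf giveback             # the new 6080 boots already owning its disks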
As for your previous aggregate reassign: how did you do it? I sure hope you

* turned auto-assign off on both nodes
* took the aggregate offline
* _un_assigned the disks from the owning node (disk assign xxx -s unowned -f)
* assigned them from the new node
  o look there: a new aggregate. should be the response of the new node
* turned auto-assign back on (if needed)

Anything else and I'm not surprised it panicked... Reassigning disks one by one (on online aggregates) leads to broken RAID groups and panics from that. A rough sketch of the sequence is below.
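Something like this (the aggregate name, disk names and node names are made up; repeat the two assign steps for every disk in the aggregate):

    node1> options disk.auto_assign off      # on both nodes
    node2> options disk.auto_assign off
    node1> aggr offline aggr_proj            # the aggregate being moved
    node1> disk assign 0a.16 -s unowned -f   # release each of its disks from node1
    node2> disk assign 0a.16 -o node2        # claim each disk on node2
    node2> aggr status                       # the 'new' aggregate should show up here; online it if needed
    node1> options disk.auto_assign on       # re-enable afterwards if you use it
    node2> options disk.auto_assign on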
Hope that helped
Sebastian
Greetings,
I thought I'd summarize what went on and pass along some key points that may help others in the future.
1. We used the 6080 NVRAM cards, because nobody could tell us for sure whether the 6040 cards would work properly.

2. We found you cannot live-swap 6040 and 6080 heads in a cluster. When you take over a 6040 and replace it with a 6080, things work until you do a giveback to what is now the 6080. The remaining 6040 head then disables the cluster because of the different card in the 6080. You can force a takeover of the 6040 if you want to skip the message about it corrupting your data.

3. The 6080 heads had previously booted 8.x images. We run 7.3.5.1. The systems would not boot in any way, and we could not get the CompactFlash cards to update after netboots and software installs. Ultimately we just swapped in the CompactFlash cards from the 6040s and everything worked fine.
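For anyone repeating this, a quick way to see what is actually on the boot device versus what is running (just a sketch, output omitted):

    node> version        # the Data ONTAP release currently running
    node> version -b     # the images present on the boot device (the CompactFlash card)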
Thanks for all the help and information on this.
Jeff