I do not see anything wrong in your description.

 

You can verify the cabling (just to be absolutely sure) using the “sysconfig -a”, “sasadmin expander_map” and “environment” commands. Each of them shows the serial number of the SAS cable plugged into a port, so you can trace the cables and build the exact topology. A SAS cable has the same serial number on both ends.

 

Oh, I just realized that there was a known problem with SAS cables having non-unique serial numbers, so it is worth verifying. KB 2011310 explains how to check SAS connectivity.
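
The tracing idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the endpoint labels and cable serial numbers below are made up, the function name pair_endpoints is hypothetical, and the sketch expects (endpoint, serial) pairs that were copied by hand out of the command output. It does not parse the output of any of the commands itself.

```python
from collections import defaultdict

def pair_endpoints(observations):
    """Group (endpoint, cable_serial) pairs by cable serial number.

    Returns (links, suspicious). links maps a serial to its two endpoints.
    suspicious holds serials seen on one end only or on more than two ends;
    the latter is what non-unique serial numbers would look like.
    """
    by_serial = defaultdict(list)
    for endpoint, serial in observations:
        by_serial[serial].append(endpoint)
    links = {s: tuple(e) for s, e in by_serial.items() if len(e) == 2}
    suspicious = {s: tuple(e) for s, e in by_serial.items() if len(e) != 2}
    return links, suspicious

# Hypothetical observations, typed in by hand; not real output.
observations = [
    ("robin:0a", "CABLE-0001"),
    ("shelf1:IOM-A:square", "CABLE-0001"),
    ("shelf1:IOM-A:circle", "CABLE-0002"),
    ("shelf2:IOM-A:square", "CABLE-0002"),
    ("shelf2:IOM-A:circle", "CABLE-0003"),
]

links, suspicious = pair_endpoints(observations)
for serial, (a, b) in sorted(links.items()):
    print(f"{serial}: {a} <-> {b}")
for serial, ends in sorted(suspicious.items()):
    print(f"{serial}: check manually, seen on {len(ends)} end(s)")
```

A serial that shows up on only one end usually just means the other end has not been recorded yet; a serial on three or more ends cannot be paired automatically and has to be traced by hand.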

 

 

 

 

From: Alexander Griesser [mailto:ag@anexia.at]
Sent: Monday, August 05, 2013 12:20 AM
To: Borzenkov, Andrey
Cc: chaim.rieger@gmail.com; tmac; toasters-bounces@teaparty.net; Toasters@teaparty.net
Subject: AW: AW: System status degraded after hot adding 2 DS4243 shelves

 

Hey,

 

this is the cabling as it looked earlier  (with just the 4 shelves attached):

 

 

In sasadmin shelf, I saw 1,2,3,4 and 4,3,2,1 for the two SAS controllers on each head.

 

Then we disconnected the two long cables (red and green, on the left of the diagram) and connected them to the shelf with ID 6. Once that was done, I saw 1,2,3,4 and 6 for the two SAS controllers.

Then I daisy-chained my way down from 6 to 5, which gave me 1,2,3,4 and 6,5 on both heads. After I connected shelf 4 to shelf 5, all shelves were shown twice in the sasadmin shelf output, but the order had changed, as you can see in my other e-mail.

Just FWIW: I added another DS4243 to a different HA pair on the same day and there were no issues at all. That system now also shows a strange order of the shelves in the sasadmin output, which it didn’t do before, but there everything is OK.

 

This is what the cabling now looks like, according to the technician who was in the datacenter and physically installed the shelves:

 

 

Anything wrong with that or the way we integrated the new shelves?

 

Bye,

 

Alexander Griesser
System-Administrator

ANEXIA Internetdienstleistungs GmbH

Phone: +43-463-208501-320
Fax: +43-463-208501-500

E-Mail: ag@anexia.at
Web: http://www.anexia.at

Headquarters address (Klagenfurt): Feldkirchnerstraße 140, 9020 Klagenfurt
Managing Director: Alexander Windbichler
Company register: FN 289918a | Place of jurisdiction: Klagenfurt | VAT ID: AT U63216601

 

From: Borzenkov, Andrey [mailto:andrey.borzenkov@ts.fujitsu.com]
Sent: Sunday, August 04, 2013 5:17 PM
To: Alexander Griesser
Cc: chaim.rieger@gmail.com; tmac; toasters-bounces@teaparty.net; Toasters@teaparty.net
Subject: Re: AW: System status degraded after hot adding 2 DS4243 shelves

 

Check the cabling; it looks like the two channels are cabled in a different order.

Sent from iPhone

On 04.08.2013, at 15:45, "Alexander Griesser" <ag@anexia.at> wrote:

Yes, we checked them and they are all unique. The existing shelves had the IDs 1 to 4; the new ones got 5 and 6.

 

This is what `sasadmin shelf` shows on the controller with the problem:

 

robin> sasadmin shelf

Expanders on channel 0a:
     +-------------------+
   5 |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
     |  8 |  9 | 10 | 11 |
     | 12 | 13 | 14 | 15 |
     | 16 | 17 | 18 | 19 |
     | 20 | 21 | 22 | 23 |
     +-------------------+

     +-------------------+
   4 |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
     |  8 |  9 | 10 | 11 |
     | 12 | 13 | 14 | 15 |
     | 16 | 17 | 18 | 19 |
     | 20 | 21 | 22 | 23 |
     +-------------------+

     +-------------------+
   3 |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
     |  8 |  9 | 10 | 11 |
     | 12 | 13 | 14 | 15 |
     | 16 | 17 | 18 | 19 |
     | 20 | 21 | 22 | 23 |
     +-------------------+

     +-------------------+
   2 |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
     |  8 |  9 | 10 | 11 |
     | 12 | 13 | 14 | 15 |
     | 16 | 17 | 18 | 19 |
     | 20 | 21 | 22 | 23 |
     +-------------------+

     +-------------------+
   6 |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
     |  8 |  9 | 10 | 11 |
     | 12 | 13 | 14 | 15 |
     | 16 | 17 | 18 | 19 |
     | 20 | 21 | 22 | 23 |
     +-------------------+

     +-------------------+
   1 |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
     |  8 |  9 | 10 | 11 |
     | 12 | 13 | 14 | 15 |
     | 16 | 17 | 18 | 19 |
     | 20 | 21 | 22 | 23 |
     +-------------------+

 

 

Expanders on channel 0b:
     +-------------------+
   1 |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
     |  8 |  9 | 10 | 11 |
     | 12 | 13 | 14 | 15 |
     | 16 | 17 | 18 | 19 |
     | 20 | 21 | 22 | 23 |
     +-------------------+

     +-------------------+
   2 |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
     |  8 |  9 | 10 | 11 |
     | 12 | 13 | 14 | 15 |
     | 16 | 17 | 18 | 19 |
     | 20 | 21 | 22 | 23 |
     +-------------------+

     +-------------------+
   3 |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
     |  8 |  9 | 10 | 11 |
     | 12 | 13 | 14 | 15 |
     | 16 | 17 | 18 | 19 |
     | 20 | 21 | 22 | 23 |
     +-------------------+

     +-------------------+
   4 |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
     |  8 |  9 | 10 | 11 |
     | 12 | 13 | 14 | 15 |
     | 16 | 17 | 18 | 19 |
     | 20 | 21 | 22 | 23 |
     +-------------------+

     +-------------------+
   5 |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
     |  8 |  9 | 10 | 11 |
     | 12 | 13 | 14 | 15 |
     | 16 | 17 | 18 | 19 |
     | 20 | 21 | 22 | 23 |
     +-------------------+

     +-------------------+
   6 |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
     |  8 |  9 | 10 | 11 |
     | 12 | 13 | 14 | 15 |
     | 16 | 17 | 18 | 19 |
     | 20 | 21 | 22 | 23 |
     +-------------------+
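
The shelf order on the two channels can also be compared with a small script instead of by eye. The following is a minimal Python sketch, not a NetApp tool. It assumes the text layout shown above (a line "Expanders on channel X:" followed by boxes whose first row starts with the shelf ID), and the function name shelf_order is made up for illustration. It further assumes that the two channels are expected to list the shelves in opposite order, as reported earlier in this thread for the four-shelf setup (1,2,3,4 and 4,3,2,1). The sample is abbreviated to the lines the parser actually looks at.

```python
import re

def shelf_order(sasadmin_shelf_text):
    """Return {channel: [shelf IDs in the order they are listed]}."""
    order = {}
    channel = None
    for line in sasadmin_shelf_text.splitlines():
        m = re.match(r"\s*Expanders on channel (\w+):", line)
        if m:
            channel = m.group(1)
            order[channel] = []
            continue
        # Only the first row of each box starts with a number (the shelf ID);
        # the remaining rows start with "|" and the borders with "+".
        m = re.match(r"\s*(\d+)\s+\|", line)
        if m and channel is not None:
            order[channel].append(int(m.group(1)))
    return order

# Abbreviated from the output above: one box border, the ID rows, one bay row.
sample = """\
Expanders on channel 0a:
     +-------------------+
   5 |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
   4 |  0 |  1 |  2 |  3 |
   3 |  0 |  1 |  2 |  3 |
   2 |  0 |  1 |  2 |  3 |
   6 |  0 |  1 |  2 |  3 |
   1 |  0 |  1 |  2 |  3 |
Expanders on channel 0b:
   1 |  0 |  1 |  2 |  3 |
   2 |  0 |  1 |  2 |  3 |
   3 |  0 |  1 |  2 |  3 |
   4 |  0 |  1 |  2 |  3 |
   5 |  0 |  1 |  2 |  3 |
   6 |  0 |  1 |  2 |  3 |
"""

order = shelf_order(sample)
a, b = order["0a"], order["0b"]
print("0a:", a)
print("0b:", b)
print("mirror image" if a == b[::-1] else "not a mirror image")
```

For the output above, channel 0a lists 5,4,3,2,6,1 while 0b lists 1,2,3,4,5,6, so the two are not mirror images of each other. Whether that mismatch points to a cabling fault is something the sketch cannot decide; it only makes the difference easy to spot.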

 

ACP lists the shelves twice too:

 

robin> acpadmin list_all
IP              MAC                Reset  Last Contact  Protocol   Assigner    Shelf             Current Inband   IOM
Address         Address            Cnt    (seconds ago) Version    ACPA ID     S/N               State   ID       Type
----------------------------------------------------------------------------------------------------------------------
192.168.0.191   00:50:cc:76:ac:bf  000      362         1.1.1.31   2013884700   SHJHU000001030A   0x5    0b.06.B  IOM3
192.168.1.7     00:50:cc:76:ad:06  000      359         1.1.1.31   2013885201   SHJHU000001030A   0x5    0b.06.A  IOM3
192.168.1.8     00:50:cc:76:ad:07  000      370         1.1.1.31   2013885201   SHJHU000001030B   0x5    0b.05.A  IOM3
192.168.1.13    00:50:cc:76:ad:0d  000      358         1.1.1.31   2013884700   SHJHU000001030B   0x5    0b.05.B  IOM3
192.168.1.139   00:50:cc:75:15:8b  000      408         1.1.1.31   2013884700   SHX0971733H2BSA   0x5    0b.04.B  IOM3
192.168.1.145   00:50:cc:75:15:90  000      218         1.1.1.31   2013884700   SHX0971733H2BS8   0x5    0b.01.A  IOM3
192.168.1.147   00:50:cc:75:15:92  000      82          1.1.1.31   2013884700   SHX0971733H2BS8   0x5    0b.01.B  IOM3
192.168.1.151   00:50:cc:75:15:97  000      535         1.1.1.31   2013884700   SHX0971733H2BSA   0x5    0b.04.A  IOM3
192.168.1.155   00:50:cc:75:15:9a  000      600         1.1.1.31   2013884700   SHX0971733H2BSD   0x5    0b.02.B  IOM3
192.168.1.163   00:50:cc:75:15:a2  000      495         1.1.1.31   2013884700   SHX0971733H2BSD   0x5    0b.02.A  IOM3
192.168.3.171   00:50:cc:75:2f:ab  000      393         1.1.1.31   2013884700   SHX0971733H2BSF   0x5    0b.03.B  IOM3
192.168.3.189   00:50:cc:75:2f:bd  000      40          1.1.1.31   2013884700   SHX0971733H2BSF   0x5    0b.03.A  IOM3

 

The new shelves are the ones whose serial numbers end in 1030A and 1030B.
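
To see at a glance which rows belong to which shelf, the table can be grouped by shelf ID and serial number. This is a minimal Python sketch under assumptions: it relies on the column layout shown above (ten whitespace-separated fields per data row, with the shelf S/N in the seventh and the Inband ID in the ninth), it reads the last part of the Inband ID (e.g. the "B" in 0b.06.B) as the IOM module letter, and the function name modules_per_shelf is made up. Only the first four data rows are repeated in the sample.

```python
from collections import defaultdict

def modules_per_shelf(acpadmin_text):
    """Return {(shelf_id, shelf_serial): set of IOM module letters}."""
    shelves = defaultdict(set)
    for line in acpadmin_text.splitlines():
        fields = line.split()
        # Data rows have exactly 10 fields and start with an IPv4 address;
        # the two header lines and the dashed rule do not.
        if len(fields) != 10 or fields[0].count(".") != 3:
            continue
        serial, inband_id = fields[6], fields[8]
        _channel, shelf_id, module = inband_id.split(".")
        shelves[(shelf_id, serial)].add(module)
    return dict(shelves)

# First four data rows of the table above, plus the first header line.
sample = """\
IP              MAC                Reset  Last Contact  Protocol   Assigner    Shelf             Current Inband   IOM
192.168.0.191   00:50:cc:76:ac:bf  000      362         1.1.1.31   2013884700   SHJHU000001030A   0x5    0b.06.B  IOM3
192.168.1.7     00:50:cc:76:ad:06  000      359         1.1.1.31   2013885201   SHJHU000001030A   0x5    0b.06.A  IOM3
192.168.1.8     00:50:cc:76:ad:07  000      370         1.1.1.31   2013885201   SHJHU000001030B   0x5    0b.05.A  IOM3
192.168.1.13    00:50:cc:76:ad:0d  000      358         1.1.1.31   2013884700   SHJHU000001030B   0x5    0b.05.B  IOM3
"""

shelves = modules_per_shelf(sample)
for (shelf_id, serial), modules in sorted(shelves.items()):
    status = "both modules listed" if modules == {"A", "B"} else "module missing"
    print(f"shelf {shelf_id} ({serial}): {sorted(modules)} -> {status}")
```

Read this way, each shelf appears in two rows because there is one row per IOM module (A and B), not because the shelf itself is detected twice.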

 

Sysconfig lists the shelves on both controllers like this:

0a:
                Shelf   1: IOM3  Firmware rev. IOM3 A: 0152 IOM3 B: 0152
                Shelf   2: IOM3  Firmware rev. IOM3 A: 0152 IOM3 B: 0152
                Shelf   3: IOM3  Firmware rev. IOM3 A: 0152 IOM3 B: 0152
                Shelf   4: IOM3  Firmware rev. IOM3 A: 0152 IOM3 B: 0152
                Shelf   5: IOM3  Firmware rev. IOM3 A: 0152 IOM3 B: 0152
                Shelf   6: IOM3  Firmware rev. IOM3 A: 0152 IOM3 B: 0152

0b:
                Shelf   1: IOM3  Firmware rev. IOM3 A: 0152 IOM3 B: 0152
                Shelf   2: IOM3  Firmware rev. IOM3 A: 0152 IOM3 B: 0152
                Shelf   3: IOM3  Firmware rev. IOM3 A: 0152 IOM3 B: 0152
                Shelf   4: IOM3  Firmware rev. IOM3 A: 0152 IOM3 B: 0152
                Shelf   5: IOM3  Firmware rev. IOM3 A: 0152 IOM3 B: 0152
                Shelf   6: IOM3  Firmware rev. IOM3 A: 0152 IOM3 B: 0152

 

Since I do not have any disks assigned to aggregates right now, is there some kind of "reboot shelf" command, as there was for DS14 shelves?

 

Bye,

 

Alexander Griesser


 

From: chaim.rieger@gmail.com [mailto:chaim.rieger@gmail.com]
Sent: Sunday, August 04, 2013 4:38 AM
To: tmac; Alexander Griesser
Cc: toasters-bounces@teaparty.net; Toasters@teaparty.net
Subject: Re: System status degraded after hot adding 2 DS4243 shelves

 

Are all the shelf IDs correct?

Sent via BlackBerry from T-Mobile


From: tmac <tmacmd@gmail.com>
Date: Sat, 3 Aug 2013 22:33:56 -0400
To: Alexander Griesser<ag@anexia.at>
Cc: chaim.rieger@gmail.com<chaim.rieger@gmail.com>; toasters-bounces@teaparty.net<toasters-bounces@teaparty.net>; Toasters@teaparty.net<Toasters@teaparty.net>
Subject: Re: System status degraded after hot adding 2 DS4243 shelves

 

If you can afford it, it might be worth doing a total system power cycle,
i.e. shut down both heads and all disks, then wait 10-20 seconds.
Power on the disks first, then the heads.

Sounds like something went screwy during the hot-add.


--tmac

 

Tim McCarthy
Principal Consultant

Clustered ONTAP NCDA ID: XK7R3GEKC1QQ2LVD (expires 08 November 2014)
RHCE6 110-107-141 (current until Aug 02, 2016)
Clustered ONTAP NCSIE ID: C14QPHE21FR4YWD4 (expires 08 November 2014)

 

On Sat, Aug 3, 2013 at 8:43 PM, Alexander Griesser <ag@anexia.at> wrote:

Actually, it says to reseat all 48; I just pasted the message for one. Sorry for not being precise about that.

 

[…]

 

               Node: robin
           Resource: Disk 0a.06.11
           Severity: Major
     Probable Cause: Disk 0a.06.11 does not have two paths to controller
                     robin but the containing disk shelf 6 does have two
                     paths. Disk 0a.06.11 may be faulty.
    Possible Effect: Access to disk 0a.06.11 via controller robin will be
                     lost with a single hardware component failure (e.g.
                     cable, HBA, or IOM failure).
 Corrective Actions: 1. Reseat disk 0a.06.11 following the rules in the Installation and Service Guide.
                     2. Wait six minutes for the alert condition to clear.
                     3. If reseating disk 0a.06.11 fails to clear the alert condition, replace disk 0a.06.11.
                     4. Wait six minutes for the alert condition to clear.
                     5. Contact support personnel if the alert persists.

               Node: robin
           Resource: Disk 0a.06.10
           Severity: Major
     Probable Cause: Disk 0a.06.10 does not have two paths to controller
                     robin but the containing disk shelf 6 does have two
                     paths. Disk 0a.06.10 may be faulty.
    Possible Effect: Access to disk 0a.06.10 via controller robin will be
                     lost with a single hardware component failure (e.g.
                     cable, HBA, or IOM failure).
 Corrective Actions: 1. Reseat disk 0a.06.10 following the rules in the Installation and Service Guide.
                     2. Wait six minutes for the alert condition to clear.
                     3. If reseating disk 0a.06.10 fails to clear the alert condition, replace disk 0a.06.10.
                     4. Wait six minutes for the alert condition to clear.
                     5. Contact support personnel if the alert persists.

 

48 entries were displayed.
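
With this many alerts it can help to count them per shelf rather than scroll through them. The sketch below is a minimal Python illustration, not part of Data ONTAP: it assumes the "Resource: Disk <adapter>.<shelf>.<bay>" line format shown in the alerts above, and the function name alerts_per_shelf is made up. The sample contains only the two Resource lines quoted here.

```python
import re
from collections import Counter

def alerts_per_shelf(alert_text):
    """Count alert entries per shelf ID, based on the 'Resource: Disk' lines."""
    counts = Counter()
    pattern = r"Resource:\s+Disk\s+(\w+)\.(\d+)\.(\d+)"
    for m in re.finditer(pattern, alert_text):
        _adapter, shelf, _bay = m.groups()
        counts[shelf] += 1
    return counts

# The two Resource lines quoted above.
sample = """\
           Resource: Disk 0a.06.11
           Resource: Disk 0a.06.10
"""

counts = alerts_per_shelf(sample)
for shelf, n in sorted(counts.items()):
    print(f"shelf {shelf}: {n} alert(s)")
```

As a sanity check on the total: the shelf diagrams in this thread show bays 0 to 23, i.e. 24 disks per shelf, so 2 new shelves x 24 disks = 48, which matches the 48 entries reported.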

 

The one I quoted was the last one; the last two are shown above. The message at the bottom ("48 entries were displayed.") indicates 48 entries of this type, and when I scroll up in the output of the filer, there is an alert message for each of the 48 new disks. That’s what puzzles me.

 

Bye,

 

Alexander Griesser


 

From: tmac [mailto:tmacmd@gmail.com]
Sent: Sunday, August 04, 2013 2:40 AM
To: Alexander Griesser
Cc: chaim.rieger@gmail.com; toasters-bounces@teaparty.net; Toasters@teaparty.net
Subject: Re: System status degraded after hot adding 2 DS4243 shelves

 

It did not say all 48 disks...just this:

 

Reseat disk 0a.06.10 following


--tmac

 


 

On Sat, Aug 3, 2013 at 8:15 PM, Alexander Griesser <ag@anexia.at> wrote:

All LEDs are green and all SAS ports show link. Even the controller LEDs are both green, which puzzled me a bit, since they should light up orange, shouldn't they?


Alexander Griesser

-----Original Message-----
From: chaim.rieger@gmail.com [mailto:chaim.rieger@gmail.com]
Sent: Sunday, August 04, 2013 2:12 AM
To: Alexander Griesser; toasters-bounces@teaparty.net; Toasters@teaparty.net
Subject: Re: System status degraded after hot adding 2 DS4243 shelves


The return cable from the last shelf to your degraded filer is either bad or not connected properly, or the slot is bad.

Are all the LEDs green?
Sent via BlackBerry from T-Mobile

-----Original Message-----
From: Alexander Griesser <ag@anexia.at>
Sender: toasters-bounces@teaparty.net
Date: Sat, 3 Aug 2013 23:44:20
To: Toasters@teaparty.net<Toasters@teaparty.net>
Subject: System status degraded after hot adding 2 DS4243 shelves

_______________________________________________
Toasters mailing list
Toasters@teaparty.net
http://www.teaparty.net/mailman/listinfo/toasters


