I do not see anything wrong in your description.
You can verify the cabling (just to be absolutely sure) using the “sysconfig -a”, “sasadmin expander_map” and “environment” commands. All of them report the serial number of the SAS cable plugged into each port, so you can trace the cables and build the exact topology. A SAS cable reports the same serial number on both ends.
Oh, I just realized that there was a known problem with SAS cables having non-unique serial numbers, so it is worth verifying. KB 2011310 explains how to check SAS connectivity.
From: Alexander Griesser [mailto:ag@anexia.at]
Sent: Monday, August 05, 2013 12:20 AM
To: Borzenkov, Andrey
Cc: chaim.rieger@gmail.com; tmac; toasters-bounces@teaparty.net; Toasters@teaparty.net
Subject: AW: AW: System status degraded after hot adding 2 DS4243 shelves
Hey,
this is the cabling as it looked earlier (with just the 4 shelves attached):
In sasadmin shelf, I saw 1,2,3,4 and 4,3,2,1 for the two SAS controllers on each head.
Then we disconnected the two long cables (red and green, to the left of the scheme) and attached them to the shelf with ID 6. Once that was done, I was seeing 1,2,3,4 and 6 for the two SAS controllers.
Then I daisy-chained my way down from 6 to 5, which gave me 1,2,3,4 and 6,5 on both heads. After I connected shelf 4 to shelf 5, all shelves were shown twice in the sasadmin shelf output, but the order had changed, as you can see in my other e-mail.
Just FWIW: I added another DS4243 to a different HA pair on the same day and there were no issues at all. That system now also shows a strange order of the shelves in the sasadmin output, which it didn’t do before, but there everything is OK.
This is what the cabling now looks like, according to the technician who was in the datacenter and physically installed the shelves:
Anything wrong with that or the way we integrated the new shelves?
Bye,
Alexander Griesser
System-Administrator
ANEXIA Internetdienstleistungs GmbH
Phone: +43-463-208501-320
Fax: +43-463-208501-500
E-Mail: ag@anexia.at
Web: http://www.anexia.at
Headquarters address (Klagenfurt): Feldkirchnerstraße 140, 9020 Klagenfurt
Managing Director: Alexander Windbichler
Commercial register: FN 289918a | Place of jurisdiction: Klagenfurt | VAT ID: AT U63216601
From: Borzenkov, Andrey [mailto:andrey.borzenkov@ts.fujitsu.com]
Sent: Sunday, 04 August 2013 17:17
To: Alexander Griesser
Cc: chaim.rieger@gmail.com; tmac; toasters-bounces@teaparty.net; Toasters@teaparty.net
Subject: Re: AW: System status degraded after hot adding 2 DS4243 shelves
Check the cabling - it looks like the two channels are cabled in a different order.
Sent from iPhone
On 04.08.2013, at 15:45, "Alexander Griesser" <ag@anexia.at> wrote:
Yes, we checked them and they are all unique. The existing shelves had the ids 1 to 4, the new ones got 5 and 6.
This is what `sasadmin shelf` shows on the controller with the problem:
robin> sasadmin shelf
Expanders on channel 0a:
+-------------------+
5 | 0 | 1 | 2 | 3 |
| 4 | 5 | 6 | 7 |
| 8 | 9 | 10 | 11 |
| 12 | 13 | 14 | 15 |
| 16 | 17 | 18 | 19 |
| 20 | 21 | 22 | 23 |
+-------------------+
+-------------------+
4 | 0 | 1 | 2 | 3 |
| 4 | 5 | 6 | 7 |
| 8 | 9 | 10 | 11 |
| 12 | 13 | 14 | 15 |
| 16 | 17 | 18 | 19 |
| 20 | 21 | 22 | 23 |
+-------------------+
+-------------------+
3 | 0 | 1 | 2 | 3 |
| 4 | 5 | 6 | 7 |
| 8 | 9 | 10 | 11 |
| 12 | 13 | 14 | 15 |
| 16 | 17 | 18 | 19 |
| 20 | 21 | 22 | 23 |
+-------------------+
+-------------------+
2 | 0 | 1 | 2 | 3 |
| 4 | 5 | 6 | 7 |
| 8 | 9 | 10 | 11 |
| 12 | 13 | 14 | 15 |
| 16 | 17 | 18 | 19 |
| 20 | 21 | 22 | 23 |
+-------------------+
+-------------------+
6 | 0 | 1 | 2 | 3 |
| 4 | 5 | 6 | 7 |
| 8 | 9 | 10 | 11 |
| 12 | 13 | 14 | 15 |
| 16 | 17 | 18 | 19 |
| 20 | 21 | 22 | 23 |
+-------------------+
+-------------------+
1 | 0 | 1 | 2 | 3 |
| 4 | 5 | 6 | 7 |
| 8 | 9 | 10 | 11 |
| 12 | 13 | 14 | 15 |
| 16 | 17 | 18 | 19 |
| 20 | 21 | 22 | 23 |
+-------------------+
Expanders on channel 0b:
+-------------------+
1 | 0 | 1 | 2 | 3 |
| 4 | 5 | 6 | 7 |
| 8 | 9 | 10 | 11 |
| 12 | 13 | 14 | 15 |
| 16 | 17 | 18 | 19 |
| 20 | 21 | 22 | 23 |
+-------------------+
+-------------------+
2 | 0 | 1 | 2 | 3 |
| 4 | 5 | 6 | 7 |
| 8 | 9 | 10 | 11 |
| 12 | 13 | 14 | 15 |
| 16 | 17 | 18 | 19 |
| 20 | 21 | 22 | 23 |
+-------------------+
+-------------------+
3 | 0 | 1 | 2 | 3 |
| 4 | 5 | 6 | 7 |
| 8 | 9 | 10 | 11 |
| 12 | 13 | 14 | 15 |
| 16 | 17 | 18 | 19 |
| 20 | 21 | 22 | 23 |
+-------------------+
+-------------------+
4 | 0 | 1 | 2 | 3 |
| 4 | 5 | 6 | 7 |
| 8 | 9 | 10 | 11 |
| 12 | 13 | 14 | 15 |
| 16 | 17 | 18 | 19 |
| 20 | 21 | 22 | 23 |
+-------------------+
+-------------------+
5 | 0 | 1 | 2 | 3 |
| 4 | 5 | 6 | 7 |
| 8 | 9 | 10 | 11 |
| 12 | 13 | 14 | 15 |
| 16 | 17 | 18 | 19 |
| 20 | 21 | 22 | 23 |
+-------------------+
+-------------------+
6 | 0 | 1 | 2 | 3 |
| 4 | 5 | 6 | 7 |
| 8 | 9 | 10 | 11 |
| 12 | 13 | 14 | 15 |
| 16 | 17 | 18 | 19 |
| 20 | 21 | 22 | 23 |
+-------------------+
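The ordering in a listing like the one above can also be checked mechanically. The following is a minimal Python sketch, not a NetApp tool: it assumes the pasted layout shown here, where the shelf ID is the number in front of the first "|" of each shelf's first bay row, and the names `shelf_order` and `is_mirrored` are made up for illustration. Elsewhere in this thread the two channels are reported as 1,2,3,4 and 4,3,2,1 before the hot-add, so the sketch treats "one channel is the exact reverse of the other" as the expected pattern; that is an assumption taken from this thread, not a documented rule. The sample is abbreviated to the first and last bay row of each shelf.

```python
import re

CHANNEL_RE = re.compile(r"\s*Expanders on channel (\w+):")
# The shelf ID is the number in front of the first "|" of a shelf's first
# bay row; all other bay rows start directly with "|" and do not match.
SHELF_RE = re.compile(r"\s*(\d+)\s*\|")


def shelf_order(output):
    """Return {channel: [shelf IDs in the order they are listed]}."""
    order = {}
    channel = None
    for line in output.splitlines():
        m = CHANNEL_RE.match(line)
        if m:
            channel = m.group(1)
            order[channel] = []
            continue
        m = SHELF_RE.match(line)
        if m and channel is not None:
            order[channel].append(int(m.group(1)))
    return order


def is_mirrored(first, second):
    """True if one channel lists the shelves in exactly the reverse order."""
    return first == list(reversed(second))


# Abbreviated copy of the listing above (two bay rows per shelf).
SAMPLE = """\
Expanders on channel 0a:
+-------------------+
5 | 0 | 1 | 2 | 3 |
| 20 | 21 | 22 | 23 |
+-------------------+
4 | 0 | 1 | 2 | 3 |
| 20 | 21 | 22 | 23 |
+-------------------+
3 | 0 | 1 | 2 | 3 |
| 20 | 21 | 22 | 23 |
+-------------------+
2 | 0 | 1 | 2 | 3 |
| 20 | 21 | 22 | 23 |
+-------------------+
6 | 0 | 1 | 2 | 3 |
| 20 | 21 | 22 | 23 |
+-------------------+
1 | 0 | 1 | 2 | 3 |
| 20 | 21 | 22 | 23 |
Expanders on channel 0b:
+-------------------+
1 | 0 | 1 | 2 | 3 |
| 20 | 21 | 22 | 23 |
+-------------------+
2 | 0 | 1 | 2 | 3 |
| 20 | 21 | 22 | 23 |
+-------------------+
3 | 0 | 1 | 2 | 3 |
| 20 | 21 | 22 | 23 |
+-------------------+
4 | 0 | 1 | 2 | 3 |
| 20 | 21 | 22 | 23 |
+-------------------+
5 | 0 | 1 | 2 | 3 |
| 20 | 21 | 22 | 23 |
+-------------------+
6 | 0 | 1 | 2 | 3 |
| 20 | 21 | 22 | 23 |
"""

order = shelf_order(SAMPLE)
for channel, shelves in order.items():
    print(channel, shelves)
print("mirrored:", is_mirrored(order["0a"], order["0b"]))
```

For this listing the sketch reports 5,4,3,2,6,1 on 0a and 1,2,3,4,5,6 on 0b, which are not mirror images: shelf 6 sits between shelves 2 and 1 on channel 0a.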
ACP lists the shelves twice too:
robin> acpadmin list_all
IP             MAC                Reset  Last Contact   Protocol  Assigner    Shelf            Current  Inband   IOM
Address        Address            Cnt    (seconds ago)  Version   ACPA ID     S/N              State    ID       Type
---------------------------------------------------------------------------------------------------------------------
192.168.0.191  00:50:cc:76:ac:bf  000    362            1.1.1.31  2013884700  SHJHU000001030A  0x5      0b.06.B  IOM3
192.168.1.7    00:50:cc:76:ad:06  000    359            1.1.1.31  2013885201  SHJHU000001030A  0x5      0b.06.A  IOM3
192.168.1.8    00:50:cc:76:ad:07  000    370            1.1.1.31  2013885201  SHJHU000001030B  0x5      0b.05.A  IOM3
192.168.1.13   00:50:cc:76:ad:0d  000    358            1.1.1.31  2013884700  SHJHU000001030B  0x5      0b.05.B  IOM3
192.168.1.139  00:50:cc:75:15:8b  000    408            1.1.1.31  2013884700  SHX0971733H2BSA  0x5      0b.04.B  IOM3
192.168.1.145  00:50:cc:75:15:90  000    218            1.1.1.31  2013884700  SHX0971733H2BS8  0x5      0b.01.A  IOM3
192.168.1.147  00:50:cc:75:15:92  000    82             1.1.1.31  2013884700  SHX0971733H2BS8  0x5      0b.01.B  IOM3
192.168.1.151  00:50:cc:75:15:97  000    535            1.1.1.31  2013884700  SHX0971733H2BSA  0x5      0b.04.A  IOM3
192.168.1.155  00:50:cc:75:15:9a  000    600            1.1.1.31  2013884700  SHX0971733H2BSD  0x5      0b.02.B  IOM3
192.168.1.163  00:50:cc:75:15:a2  000    495            1.1.1.31  2013884700  SHX0971733H2BSD  0x5      0b.02.A  IOM3
192.168.3.171  00:50:cc:75:2f:ab  000    393            1.1.1.31  2013884700  SHX0971733H2BSF  0x5      0b.03.B  IOM3
192.168.3.189  00:50:cc:75:2f:bd  000    40             1.1.1.31  2013884700  SHX0971733H2BSF  0x5      0b.03.A  IOM3
The new shelves are the ones whose serial numbers end in 1030A and 1030B.
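To see at a glance which modules belong to which shelf, the acpadmin rows can be grouped by shelf serial number. This is a small Python sketch under the assumption that every data row has exactly ten whitespace-separated columns, as in the listing above; `group_by_shelf` is a made-up helper name, not part of any NetApp tooling.

```python
from collections import defaultdict

# Data rows copied from the acpadmin list_all output above.
ROWS = """\
192.168.0.191 00:50:cc:76:ac:bf 000 362 1.1.1.31 2013884700 SHJHU000001030A 0x5 0b.06.B IOM3
192.168.1.7 00:50:cc:76:ad:06 000 359 1.1.1.31 2013885201 SHJHU000001030A 0x5 0b.06.A IOM3
192.168.1.8 00:50:cc:76:ad:07 000 370 1.1.1.31 2013885201 SHJHU000001030B 0x5 0b.05.A IOM3
192.168.1.13 00:50:cc:76:ad:0d 000 358 1.1.1.31 2013884700 SHJHU000001030B 0x5 0b.05.B IOM3
192.168.1.139 00:50:cc:75:15:8b 000 408 1.1.1.31 2013884700 SHX0971733H2BSA 0x5 0b.04.B IOM3
192.168.1.145 00:50:cc:75:15:90 000 218 1.1.1.31 2013884700 SHX0971733H2BS8 0x5 0b.01.A IOM3
192.168.1.147 00:50:cc:75:15:92 000 82 1.1.1.31 2013884700 SHX0971733H2BS8 0x5 0b.01.B IOM3
192.168.1.151 00:50:cc:75:15:97 000 535 1.1.1.31 2013884700 SHX0971733H2BSA 0x5 0b.04.A IOM3
192.168.1.155 00:50:cc:75:15:9a 000 600 1.1.1.31 2013884700 SHX0971733H2BSD 0x5 0b.02.B IOM3
192.168.1.163 00:50:cc:75:15:a2 000 495 1.1.1.31 2013884700 SHX0971733H2BSD 0x5 0b.02.A IOM3
192.168.3.171 00:50:cc:75:2f:ab 000 393 1.1.1.31 2013884700 SHX0971733H2BSF 0x5 0b.03.B IOM3
192.168.3.189 00:50:cc:75:2f:bd 000 40 1.1.1.31 2013884700 SHX0971733H2BSF 0x5 0b.03.A IOM3
"""


def group_by_shelf(listing):
    """Map shelf serial number -> list of (inband ID, assigner ACPA ID)."""
    shelves = defaultdict(list)
    for line in listing.splitlines():
        fields = line.split()
        # Data rows have exactly ten columns and start with an IP address;
        # header and ruler lines are skipped.
        if len(fields) != 10 or not fields[0][0].isdigit():
            continue
        assigner, serial, inband = fields[5], fields[6], fields[8]
        shelves[serial].append((inband, assigner))
    return dict(shelves)


shelves = group_by_shelf(ROWS)
for serial, modules in sorted(shelves.items()):
    inband_ids = sorted(inband for inband, _ in modules)
    assigners = sorted({assigner for _, assigner in modules})
    print(serial, inband_ids, assigners)
```

Grouped this way, each of the six serial numbers appears with one inband ID ending in .A and one ending in .B, and the differing assigner ACPA IDs on the two new shelves stand out.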
Sysconfig lists the shelves on both controllers like this:
0a:
Shelf 1: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 2: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 3: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 4: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 5: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 6: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
0b:
Shelf 1: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 2: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 3: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 4: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 5: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 6: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Since I do not have any disks assigned to aggregates right now, is there some kind of "reboot shelf" command, as there was for DS14 shelves?
Bye,
Alexander Griesser
From: chaim.rieger@gmail.com [mailto:chaim.rieger@gmail.com]
Sent: Sunday, 04 August 2013 04:38
To: tmac; Alexander Griesser
Cc: toasters-bounces@teaparty.net; Toasters@teaparty.net
Subject: Re: System status degraded after hot adding 2 DS4243 shelves
Are all the shelf IDs correct?
Sent via BlackBerry from T-Mobile
From: tmac <tmacmd@gmail.com>
Date: Sat, 3 Aug 2013 22:33:56 -0400
To: Alexander Griesser<ag@anexia.at>
Cc: chaim.rieger@gmail.com<chaim.rieger@gmail.com>; toasters-bounces@teaparty.net<toasters-bounces@teaparty.net>; Toasters@teaparty.net<Toasters@teaparty.net>
Subject: Re: System status degraded after hot adding 2 DS4243 shelves
If you can afford it, it might be worth a total system power cycle.
I.e. shut down both heads and all disks. Wait 10-20 seconds.
Power on the disks, then the heads.
Sounds like something went screwy during the hot-add.
--tmac
Tim McCarthy
Principal Consultant
Clustered ONTAP Clustered ONTAP
NCDA ID: XK7R3GEKC1QQ2LVD RHCE6 110-107-141 NCSIE ID: C14QPHE21FR4YWD4
Expires: 08 November 2014 Current until Aug 02, 2016 Expires: 08 November 2014
On Sat, Aug 3, 2013 at 8:43 PM, Alexander Griesser <ag@anexia.at> wrote:
Actually, it says to reseat all 48; I just pasted the message for one. Sorry for not being precise about that.
[…]
Node: robin
Resource: Disk 0a.06.11
Severity: Major
Probable Cause: Disk 0a.06.11 does not have two paths to controller
robin but the containing disk shelf 6 does have two
paths. Disk 0a.06.11 may be faulty.
Possible Effect: Access to disk 0a.06.11 via controller robin will be
lost with a single hardware component failure (e.g.
cable, HBA, or IOM failure).
Corrective Actions: 1. Reseat disk 0a.06.11 following the rules in the Installation and Service Guide.
2. Wait six minutes for the alert condition to clear.
3. If reseating disk 0a.06.11 fails to clear the alert condition, replace disk 0a.06.11.
4. Wait six minutes for the alert condition to clear.
5. Contact support personnel if the alert persists.
Node: robin
Resource: Disk 0a.06.10
Severity: Major
Probable Cause: Disk 0a.06.10 does not have two paths to controller
robin but the containing disk shelf 6 does have two
paths. Disk 0a.06.10 may be faulty.
Possible Effect: Access to disk 0a.06.10 via controller robin will be
lost with a single hardware component failure (e.g.
cable, HBA, or IOM failure).
Corrective Actions: 1. Reseat disk 0a.06.10 following the rules in the Installation and Service Guide.
2. Wait six minutes for the alert condition to clear.
3. If reseating disk 0a.06.10 fails to clear the alert condition, replace disk 0a.06.10.
4. Wait six minutes for the alert condition to clear.
5. Contact support personnel if the alert persists.
48 entries were displayed.
The one I quoted was the last one. The last two are shown above, and the message at the bottom ("48 entries were displayed.") indicates 48 entries of this type. When I scroll up in the filer's output, there is an alert message for each of the 48 new disks; that’s what puzzles me.
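Rather than scrolling through all 48 alerts, the affected disks can be counted per shelf from the pasted text. The sketch below is only an illustration in Python: it assumes the "Resource: Disk <adapter>.<shelf>.<bay>" line format seen in the alerts above, and `disks_per_shelf` is a hypothetical helper name. The sample contains just the two alerts quoted here.

```python
import re
from collections import Counter

# Matches lines such as "Resource: Disk 0a.06.11" (adapter.shelf.bay).
RESOURCE_RE = re.compile(r"Resource:\s*Disk\s+(\w+)\.(\d+)\.(\d+)")


def disks_per_shelf(alert_text):
    """Count distinct alerted disks per shelf ID."""
    disks = set()
    for m in RESOURCE_RE.finditer(alert_text):
        disks.add((m.group(1), int(m.group(2)), int(m.group(3))))
    return Counter(shelf for _, shelf, _ in disks)


# Only the two alerts quoted above; the full output has 48 entries.
ALERTS = """\
Node: robin
Resource: Disk 0a.06.11
Severity: Major
Node: robin
Resource: Disk 0a.06.10
Severity: Major
"""

print(disks_per_shelf(ALERTS))
```

Fed with the complete alert output, a count that covers every bay of shelves 5 and 6 would show that the alert is raised for all new disks at once rather than for a few individual ones.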
Bye,
Alexander Griesser
From: tmac [mailto:tmacmd@gmail.com]
Sent: Sunday, 04 August 2013 02:40
To: Alexander Griesser
Cc: chaim.rieger@gmail.com; toasters-bounces@teaparty.net; Toasters@teaparty.net
Subject: Re: System status degraded after hot adding 2 DS4243 shelves
It did not say all 48 disks...just this:
Reseat disk 0a.06.10 following
--tmac
On Sat, Aug 3, 2013 at 8:15 PM, Alexander Griesser <ag@anexia.at> wrote:
All LEDs are green and all SAS ports show link. Even the controller LEDs are both green, which puzzled me a bit, since they should light up in orange, shouldn't they?
Alexander Griesser
-----Original Message-----
From: chaim.rieger@gmail.com [mailto:chaim.rieger@gmail.com]
Sent: Sunday, 04 August 2013 02:12
To: Alexander Griesser; toasters-bounces@teaparty.net; Toasters@teaparty.net
Subject: Re: System status degraded after hot adding 2 DS4243 shelves
The return cable from the last shelf to your degraded filer is either bad or not connected properly, or the slot is bad.
Are all the LEDs green?
Sent via BlackBerry from T-Mobile
-----Original Message-----
From: Alexander Griesser <ag@anexia.at>
Sender: toasters-bounces@teaparty.net
Date: Sat, 3 Aug 2013 23:44:20
To: Toasters@teaparty.net<Toasters@teaparty.net>
Subject: System status degraded after hot adding 2 DS4243 shelves
_______________________________________________
Toasters mailing list
Toasters@teaparty.net
http://www.teaparty.net/mailman/listinfo/toasters