Hey there,
this is my second question today, sorry for that :) I attached two DS4243 shelves to a FAS3240HA system today and followed all the instructions. The shelf firmware was updated in the process and everything looks good as far as I can tell: I can now see all 6 shelves on both controllers, all shelves are multipath ready, the disks are visible, etc. One controller shows its health status as OK, but the other one shows degraded:
robin> system health status show
Status
---------------
Degraded
The reason for that seems to be:
Node: robin
Resource: Disk 0a.06.10
Severity: Major
Probable Cause: Disk 0a.06.10 does not have two paths to controller robin but the containing disk shelf 6 does have two paths. Disk 0a.06.10 may be faulty.
Possible Effect: Access to disk 0a.06.10 via controller robin will be lost with a single hardware component failure (e.g. cable, HBA, or IOM failure).
Corrective Actions:
  1. Reseat disk 0a.06.10 following the rules in the Installation and Service Guide.
  2. Wait six minutes for the alert condition to clear.
  3. If reseating disk 0a.06.10 fails to clear the alert condition, replace disk 0a.06.10.
  4. Wait six minutes for the alert condition to clear.
  5. Contact support personnel if the alert persists.
48 entries were displayed.
There are exactly 48 entries in the log, one for each of the 48 new disks I added to the system. If it really were a disk or shelf fault, I guess I would see these errors on the second controller too, but I cannot see them there. Any ideas what I could do now to diagnose this issue further? I think the usual problems like broken cables etc. can be ruled out since the second controller is not showing these issues, right?
Thanks in advance, bye,
Alexander Griesser System-Administrator
ANEXIA Internetdienstleistungs GmbH
Telefon: +43-463-208501-320 Telefax: +43-463-208501-500
E-Mail: ag@anexia.at Web: http://www.anexia.at
Anschrift Hauptsitz Klagenfurt: Feldkirchnerstraße 140, 9020 Klagenfurt Geschäftsführer: Alexander Windbichler Firmenbuch: FN 289918a | Gerichtsstand: Klagenfurt | UID-Nummer: AT U63216601
The return cable from the last shelf to your degraded filer is either bad or not connected properly, or the slot is bad.
Are all the LEDs green?
Sent via BlackBerry from T-Mobile
-----Original Message-----
From: Alexander Griesser ag@anexia.at
Sender: toasters-bounces@teaparty.net
Date: Sat, 3 Aug 2013 23:44:20
To: Toasters@teaparty.net
Subject: System status degraded after hot adding 2 DS4243 shelves
_______________________________________________ Toasters mailing list Toasters@teaparty.net http://www.teaparty.net/mailman/listinfo/toasters
All LEDs are green and all SAS ports show link. Even the controller LEDs are both green, which puzzled me a bit, since the degraded one should light up orange, shouldn't it?
Alexander Griesser System-Administrator
It did not say all 48 disks...just this:
Reseat disk 0a.06.10 following
--tmac
Tim McCarthy
Principal Consultant
Clustered ONTAP NCDA ID: XK7R3GEKC1QQ2LVD, expires 08 November 2014
RHCE6 110-107-141, current until Aug 02, 2016 (https://www.redhat.com/wapps/training/certification/verify.html?certNumber=110-107-141&isSearch=False&verify=Verify)
Clustered ONTAP NCSIE ID: C14QPHE21FR4YWD4, expires 08 November 2014
Actually, it says to reseat all 48; I just pasted the message for one. Sorry for not being precise about that.
[...]
Node: robin
Resource: Disk 0a.06.11
Severity: Major
Probable Cause: Disk 0a.06.11 does not have two paths to controller robin but the containing disk shelf 6 does have two paths. Disk 0a.06.11 may be faulty.
Possible Effect: Access to disk 0a.06.11 via controller robin will be lost with a single hardware component failure (e.g. cable, HBA, or IOM failure).
Corrective Actions:
  1. Reseat disk 0a.06.11 following the rules in the Installation and Service Guide.
  2. Wait six minutes for the alert condition to clear.
  3. If reseating disk 0a.06.11 fails to clear the alert condition, replace disk 0a.06.11.
  4. Wait six minutes for the alert condition to clear.
  5. Contact support personnel if the alert persists.

Node: robin
Resource: Disk 0a.06.10
Severity: Major
Probable Cause: Disk 0a.06.10 does not have two paths to controller robin but the containing disk shelf 6 does have two paths. Disk 0a.06.10 may be faulty.
Possible Effect: Access to disk 0a.06.10 via controller robin will be lost with a single hardware component failure (e.g. cable, HBA, or IOM failure).
Corrective Actions:
  1. Reseat disk 0a.06.10 following the rules in the Installation and Service Guide.
  2. Wait six minutes for the alert condition to clear.
  3. If reseating disk 0a.06.10 fails to clear the alert condition, replace disk 0a.06.10.
  4. Wait six minutes for the alert condition to clear.
  5. Contact support personnel if the alert persists.
48 entries were displayed.
The one I quoted was the last one; the last two are shown above. The message at the bottom, now in red, indicates 48 entries of this type, and when I scroll up in the filer's output there is an alert message for each of the 48 new disks. That's what puzzles me.
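One way to check whether the alerts really cluster on the new shelves and on a single port is to tally the "Resource: Disk" lines per port and shelf. The sketch below is only an illustration: it assumes disk names have the form port.shelf.bay, as in the alerts quoted above, and it uses just the two alert lines from this thread as sample input. The function name `alerts_per_shelf` is made up for this example.

```python
import re
from collections import Counter

# The two "Resource" lines quoted in this thread; a real run would feed in
# the full 48-entry alert output instead.
ALERT_TEXT = """\
Resource: Disk 0a.06.11
Resource: Disk 0a.06.10
"""

# Assumption: disk names look like <port>.<shelf>.<bay>, e.g. 0a.06.10.
DISK_RE = re.compile(r"Resource:\s+Disk\s+(\w+)\.(\d+)\.(\d+)")


def alerts_per_shelf(text):
    """Count alerted disks per (port, shelf ID) pair."""
    counts = Counter()
    for port, shelf, _bay in DISK_RE.findall(text):
        counts[(port, int(shelf))] += 1
    return counts


counts = alerts_per_shelf(ALERT_TEXT)
for (port, shelf), n in sorted(counts.items()):
    print(f"port {port}, shelf {shelf}: {n} alert(s)")
```

If the full output showed 24 alerts each for shelves 5 and 6, all on the same port, that would point at something the disks have in common (a path) rather than at 48 individual disks.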
Bye,
Alexander Griesser System-Administrator
If you can afford it, it might be worth a total system power cycle, i.e. shut down both heads and all disks, wait 10-20 seconds, then power on the disks first and the heads after that.
Sounds like something went screwy during the hot-add.
--tmac
Are all the shelf IDs correct?
Sent via BlackBerry from T-Mobile
Yes, we checked them and they are all unique. The existing shelves had the ids 1 to 4, the new ones got 5 and 6.
This is what `sasadmin shelf` shows on the controller with the problem:
robin> sasadmin shelf

Expanders on channel 0a:
   +-------------------+
 5 |  0 |  1 |  2 |  3 |
   |  4 |  5 |  6 |  7 |
   |  8 |  9 | 10 | 11 |
   | 12 | 13 | 14 | 15 |
   | 16 | 17 | 18 | 19 |
   | 20 | 21 | 22 | 23 |
   +-------------------+

   +-------------------+
 4 |  0 |  1 |  2 |  3 |
   |  4 |  5 |  6 |  7 |
   |  8 |  9 | 10 | 11 |
   | 12 | 13 | 14 | 15 |
   | 16 | 17 | 18 | 19 |
   | 20 | 21 | 22 | 23 |
   +-------------------+

   +-------------------+
 3 |  0 |  1 |  2 |  3 |
   |  4 |  5 |  6 |  7 |
   |  8 |  9 | 10 | 11 |
   | 12 | 13 | 14 | 15 |
   | 16 | 17 | 18 | 19 |
   | 20 | 21 | 22 | 23 |
   +-------------------+

   +-------------------+
 2 |  0 |  1 |  2 |  3 |
   |  4 |  5 |  6 |  7 |
   |  8 |  9 | 10 | 11 |
   | 12 | 13 | 14 | 15 |
   | 16 | 17 | 18 | 19 |
   | 20 | 21 | 22 | 23 |
   +-------------------+

   +-------------------+
 6 |  0 |  1 |  2 |  3 |
   |  4 |  5 |  6 |  7 |
   |  8 |  9 | 10 | 11 |
   | 12 | 13 | 14 | 15 |
   | 16 | 17 | 18 | 19 |
   | 20 | 21 | 22 | 23 |
   +-------------------+

   +-------------------+
 1 |  0 |  1 |  2 |  3 |
   |  4 |  5 |  6 |  7 |
   |  8 |  9 | 10 | 11 |
   | 12 | 13 | 14 | 15 |
   | 16 | 17 | 18 | 19 |
   | 20 | 21 | 22 | 23 |
   +-------------------+

Expanders on channel 0b:
   +-------------------+
 1 |  0 |  1 |  2 |  3 |
   |  4 |  5 |  6 |  7 |
   |  8 |  9 | 10 | 11 |
   | 12 | 13 | 14 | 15 |
   | 16 | 17 | 18 | 19 |
   | 20 | 21 | 22 | 23 |
   +-------------------+

   +-------------------+
 2 |  0 |  1 |  2 |  3 |
   |  4 |  5 |  6 |  7 |
   |  8 |  9 | 10 | 11 |
   | 12 | 13 | 14 | 15 |
   | 16 | 17 | 18 | 19 |
   | 20 | 21 | 22 | 23 |
   +-------------------+

   +-------------------+
 3 |  0 |  1 |  2 |  3 |
   |  4 |  5 |  6 |  7 |
   |  8 |  9 | 10 | 11 |
   | 12 | 13 | 14 | 15 |
   | 16 | 17 | 18 | 19 |
   | 20 | 21 | 22 | 23 |
   +-------------------+

   +-------------------+
 4 |  0 |  1 |  2 |  3 |
   |  4 |  5 |  6 |  7 |
   |  8 |  9 | 10 | 11 |
   | 12 | 13 | 14 | 15 |
   | 16 | 17 | 18 | 19 |
   | 20 | 21 | 22 | 23 |
   +-------------------+

   +-------------------+
 5 |  0 |  1 |  2 |  3 |
   |  4 |  5 |  6 |  7 |
   |  8 |  9 | 10 | 11 |
   | 12 | 13 | 14 | 15 |
   | 16 | 17 | 18 | 19 |
   | 20 | 21 | 22 | 23 |
   +-------------------+

   +-------------------+
 6 |  0 |  1 |  2 |  3 |
   |  4 |  5 |  6 |  7 |
   |  8 |  9 | 10 | 11 |
   | 12 | 13 | 14 | 15 |
   | 16 | 17 | 18 | 19 |
   | 20 | 21 | 22 | 23 |
   +-------------------+
ACP lists the shelves twice too:
robin> acpadmin list_all
IP             MAC                Reset  Last Contact   Protocol  Assigner    Shelf            Current  Inband   IOM
Address        Address            Cnt    (seconds ago)  Version   ACPA ID     S/N              State    ID       Type
---------------------------------------------------------------------------------------------------------------------
192.168.0.191  00:50:cc:76:ac:bf  000    362            1.1.1.31  2013884700  SHJHU000001030A  0x5      0b.06.B  IOM3
192.168.1.7    00:50:cc:76:ad:06  000    359            1.1.1.31  2013885201  SHJHU000001030A  0x5      0b.06.A  IOM3
192.168.1.8    00:50:cc:76:ad:07  000    370            1.1.1.31  2013885201  SHJHU000001030B  0x5      0b.05.A  IOM3
192.168.1.13   00:50:cc:76:ad:0d  000    358            1.1.1.31  2013884700  SHJHU000001030B  0x5      0b.05.B  IOM3
192.168.1.139  00:50:cc:75:15:8b  000    408            1.1.1.31  2013884700  SHX0971733H2BSA  0x5      0b.04.B  IOM3
192.168.1.145  00:50:cc:75:15:90  000    218            1.1.1.31  2013884700  SHX0971733H2BS8  0x5      0b.01.A  IOM3
192.168.1.147  00:50:cc:75:15:92  000    82             1.1.1.31  2013884700  SHX0971733H2BS8  0x5      0b.01.B  IOM3
192.168.1.151  00:50:cc:75:15:97  000    535            1.1.1.31  2013884700  SHX0971733H2BSA  0x5      0b.04.A  IOM3
192.168.1.155  00:50:cc:75:15:9a  000    600            1.1.1.31  2013884700  SHX0971733H2BSD  0x5      0b.02.B  IOM3
192.168.1.163  00:50:cc:75:15:a2  000    495            1.1.1.31  2013884700  SHX0971733H2BSD  0x5      0b.02.A  IOM3
192.168.3.171  00:50:cc:75:2f:ab  000    393            1.1.1.31  2013884700  SHX0971733H2BSF  0x5      0b.03.B  IOM3
192.168.3.189  00:50:cc:75:2f:bd  000    40             1.1.1.31  2013884700  SHX0971733H2BSF  0x5      0b.03.A  IOM3
The new shelves are 1030A and 1030B at the end.
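To make the "listed twice" observation easier to check, the rows can be grouped by shelf serial number. This is a rough sketch that assumes the ten whitespace-separated columns of the acpadmin list_all output above, with the serial number in the seventh column and the inband ID in the ninth. The helper name `ioms_by_shelf` is invented for this example, and only the four rows for the new shelves are used as input.

```python
from collections import defaultdict

# Four rows copied from the acpadmin list_all output in this message.
ROWS = """\
192.168.0.191 00:50:cc:76:ac:bf 000 362 1.1.1.31 2013884700 SHJHU000001030A 0x5 0b.06.B IOM3
192.168.1.7 00:50:cc:76:ad:06 000 359 1.1.1.31 2013885201 SHJHU000001030A 0x5 0b.06.A IOM3
192.168.1.8 00:50:cc:76:ad:07 000 370 1.1.1.31 2013885201 SHJHU000001030B 0x5 0b.05.A IOM3
192.168.1.13 00:50:cc:76:ad:0d 000 358 1.1.1.31 2013884700 SHJHU000001030B 0x5 0b.05.B IOM3
"""


def ioms_by_shelf(rows):
    """Map each shelf serial number to the inband IDs of its IOMs."""
    groups = defaultdict(list)
    for line in rows.splitlines():
        fields = line.split()
        if len(fields) != 10:  # skip headers, rulers, and blank lines
            continue
        serial, inband_id = fields[6], fields[8]
        groups[serial].append(inband_id)
    return dict(groups)


shelves = ioms_by_shelf(ROWS)
for serial, ioms in shelves.items():
    print(serial, ioms)
```

Two entries per serial number, one ending in .A and one in .B, would be consistent with each row describing one IOM of a shelf rather than a shelf being detected twice.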
Sysconfig lists the shelves on both controllers like this:
0a:
    Shelf 1: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
    Shelf 2: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
    Shelf 3: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
    Shelf 4: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
    Shelf 5: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
    Shelf 6: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
0b:
    Shelf 1: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
    Shelf 2: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
    Shelf 3: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
    Shelf 4: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
    Shelf 5: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
    Shelf 6: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Since I do not have any disks assigned to aggregates right now, is there some kind of a "reboot shelf" command as there was for DS14 shelves?
Bye,
Alexander Griesser System-Administrator
ANEXIA Internetdienstleistungs GmbH
Telefon: +43-463-208501-320 Telefax: +43-463-208501-500
E-Mail: ag@anexia.atmailto:ag@anexia.at Web: http://www.anexia.at
Anschrift Hauptsitz Klagenfurt: Feldkirchnerstraße 140, 9020 Klagenfurt Geschäftsführer: Alexander Windbichler Firmenbuch: FN 289918a | Gerichtsstand: Klagenfurt | UID-Nummer: AT U63216601
Von: chaim.rieger@gmail.com [mailto:chaim.rieger@gmail.com] Gesendet: Sonntag, 04. August 2013 04:38 An: tmac; Alexander Griesser Cc: toasters-bounces@teaparty.net; Toasters@teaparty.net Betreff: Re: System status degraded after hot adding 2 DS4243 shelves
All the shelf I'd's correct ? Sent via BlackBerry from T-Mobile ________________________________ From: tmac <tmacmd@gmail.commailto:tmacmd@gmail.com> Date: Sat, 3 Aug 2013 22:33:56 -0400 To: Alexander Griesser<ag@anexia.atmailto:ag@anexia.at> Cc: chaim.rieger@gmail.com<chaim.rieger@gmail.commailto:chaim.rieger@gmail.com%3cchaim.rieger@gmail.com>; toasters-bounces@teaparty.net<toasters-bounces@teaparty.netmailto:toasters-bounces@teaparty.net%3ctoasters-bounces@teaparty.net>; Toasters@teaparty.net<Toasters@teaparty.netmailto:Toasters@teaparty.net%3cToasters@teaparty.net> Subject: Re: System status degraded after hot adding 2 DS4243 shelves
If you can afford it, it might be worth a total system power cycle. i.e. shut down both heads and all disks. wait 10-20 seconds. Power on disks, then heads.
Sounds like something went screwy with during the hot-add.
--tmac
Tim McCarthy Principal Consultant
[http://dl.dropbox.com/u/6874230/na_cert_dma_2c.jpg] [http://dl.dropbox.com/u/6874230/rhce.jpeg] [http://dl.dropbox.com/u/6874230/na_cert_ie-san_2c.jpg]
Clustered ONTAP Clustered ONTAP NCDA ID: XK7R3GEKC1QQ2LVD RHCE6 110-107-141https://www.redhat.com/wapps/training/certification/verify.html?certNumber=110-107-141&isSearch=False&verify=Verify NCSIE ID: C14QPHE21FR4YWD4 Expires: 08 November 2014 Current until Aug 02, 2016 Expires: 08 November 2014
On Sat, Aug 3, 2013 at 8:43 PM, Alexander Griesser <ag@anexia.atmailto:ag@anexia.at> wrote: Actually, it say to reseat all 48, I just pasted the message for one, sorry for not being precise about that.
[...]
Node: robin Resource: Disk 0a.06.11 Severity: Major Probable Cause: Disk 0a.06.11 does not have two paths to controller robin but the containing disk shelf 6 does have two paths. Disk 0a.06.11 may be faulty. Possible Effect: Access to disk 0a.06.11 via controller robin will be lost with a single hardware component failure (e.g. cable, HBA, or IOM failure). Corrective Actions: 1. Reseat disk 0a.06.11 following the rules in the Installation and Service Guide. 2. Wait six minutes for the alert condition to clear. 3. If reseating disk 0a.06.11 fails to clear the alert condition, replace disk 0a.06.11. 4. Wait six minutes for the alert condition to clear. 5. Contact support personnel if the alert persists.
Node: robin Resource: Disk 0a.06.10 Severity: Major Probable Cause: Disk 0a.06.10 does not have two paths to controller robin but the containing disk shelf 6 does have two paths. Disk 0a.06.10 may be faulty. Possible Effect: Access to disk 0a.06.10 via controller robin will be lost with a single hardware component failure (e.g. cable, HBA, or IOM failure). Corrective Actions: 1. Reseat disk 0a.06.10 following the rules in the Installation and Service Guide. 2. Wait six minutes for the alert condition to clear. 3. If reseating disk 0a.06.10 fails to clear the alert condition, replace disk 0a.06.10. 4. Wait six minutes for the alert condition to clear. 5. Contact support personnel if the alert persists.
48 entries were displayed.
The one I quoted was the last one, the last two are shown above and the message at the bottom, now in red, indicates 48 entries of this type and when I scroll up in the output of the filer, it has an alert message for all of the 48 new disks, that's what puzzles me.
Bye,
Alexander Griesser System-Administrator
ANEXIA Internetdienstleistungs GmbH
Telefon: +43-463-208501-320tel:%2B43-463-208501-320 Telefax: +43-463-208501-500tel:%2B43-463-208501-500
E-Mail: ag@anexia.atmailto:ag@anexia.at Web: http://www.anexia.at
Anschrift Hauptsitz Klagenfurt: Feldkirchnerstraße 140, 9020 Klagenfurt Geschäftsführer: Alexander Windbichler Firmenbuch: FN 289918a | Gerichtsstand: Klagenfurt | UID-Nummer: AT U63216601
Von: tmac [mailto:tmacmd@gmail.commailto:tmacmd@gmail.com] Gesendet: Sonntag, 04. August 2013 02:40 An: Alexander Griesser Cc: chaim.rieger@gmail.commailto:chaim.rieger@gmail.com; toasters-bounces@teaparty.netmailto:toasters-bounces@teaparty.net; Toasters@teaparty.netmailto:Toasters@teaparty.net
Betreff: Re: System status degraded after hot adding 2 DS4243 shelves
It did not say all 48 disks...just this:
Reseat disk 0a.06.10 following
--tmac
Tim McCarthy Principal Consultant
[http://dl.dropbox.com/u/6874230/na_cert_dma_2c.jpg] [http://dl.dropbox.com/u/6874230/rhce.jpeg] [http://dl.dropbox.com/u/6874230/na_cert_ie-san_2c.jpg]
Clustered ONTAP Clustered ONTAP NCDA ID: XK7R3GEKC1QQ2LVD RHCE6 110-107-141https://www.redhat.com/wapps/training/certification/verify.html?certNumber=110-107-141&isSearch=False&verify=Verify NCSIE ID: C14QPHE21FR4YWD4 Expires: 08 November 2014 Current until Aug 02, 2016 Expires: 08 November 2014
On Sat, Aug 3, 2013 at 8:15 PM, Alexander Griesser <ag@anexia.atmailto:ag@anexia.at> wrote: All LEDs green, all SAS ports show link, even the controller LEDs are both green which is what puzzled me a bit, since it should light up in orange, shouldn't it?
Alexander Griesser System-Administrator
-----Original Message----- From: chaim.rieger@gmail.com [mailto:chaim.rieger@gmail.com] Sent: Sunday, 04 August 2013 02:12 To: Alexander Griesser; toasters-bounces@teaparty.net; Toasters@teaparty.net Subject: Re: System status degraded after hot adding 2 DS4243 shelves
The return cable from the last shelf to your degraded filer is either bad or not connected properly, or the slot is bad.
Are all the LEDs green?
Sent via BlackBerry from T-Mobile
-----Original Message----- From: Alexander Griesser <ag@anexia.at> Sender: toasters-bounces@teaparty.net Date: Sat, 3 Aug 2013 23:44:20 To: Toasters@teaparty.net Subject: System status degraded after hot adding 2 DS4243 shelves
_______________________________________________ Toasters mailing list Toasters@teaparty.net http://www.teaparty.net/mailman/listinfo/toasters
Check cabling - it looks like two channels are cabled in different order.
Sent from iPhone
On 04.08.2013, at 15:45, "Alexander Griesser" <ag@anexia.at> wrote:
Yes, we checked them and they are all unique. The existing shelves had the ids 1 to 4, the new ones got 5 and 6.
This is what `sasadmin shelf` shows on the controller with the problem:
robin> sasadmin shelf

Expanders on channel 0a:
    +-------------------+
  5 |  0 |  1 |  2 |  3 |
    |  4 |  5 |  6 |  7 |
    |  8 |  9 | 10 | 11 |
    | 12 | 13 | 14 | 15 |
    | 16 | 17 | 18 | 19 |
    | 20 | 21 | 22 | 23 |
    +-------------------+
    +-------------------+
  4 |  0 |  1 |  2 |  3 |
    |  4 |  5 |  6 |  7 |
    |  8 |  9 | 10 | 11 |
    | 12 | 13 | 14 | 15 |
    | 16 | 17 | 18 | 19 |
    | 20 | 21 | 22 | 23 |
    +-------------------+
    +-------------------+
  3 |  0 |  1 |  2 |  3 |
    |  4 |  5 |  6 |  7 |
    |  8 |  9 | 10 | 11 |
    | 12 | 13 | 14 | 15 |
    | 16 | 17 | 18 | 19 |
    | 20 | 21 | 22 | 23 |
    +-------------------+
    +-------------------+
  2 |  0 |  1 |  2 |  3 |
    |  4 |  5 |  6 |  7 |
    |  8 |  9 | 10 | 11 |
    | 12 | 13 | 14 | 15 |
    | 16 | 17 | 18 | 19 |
    | 20 | 21 | 22 | 23 |
    +-------------------+
    +-------------------+
  6 |  0 |  1 |  2 |  3 |
    |  4 |  5 |  6 |  7 |
    |  8 |  9 | 10 | 11 |
    | 12 | 13 | 14 | 15 |
    | 16 | 17 | 18 | 19 |
    | 20 | 21 | 22 | 23 |
    +-------------------+
    +-------------------+
  1 |  0 |  1 |  2 |  3 |
    |  4 |  5 |  6 |  7 |
    |  8 |  9 | 10 | 11 |
    | 12 | 13 | 14 | 15 |
    | 16 | 17 | 18 | 19 |
    | 20 | 21 | 22 | 23 |
    +-------------------+

Expanders on channel 0b:
    +-------------------+
  1 |  0 |  1 |  2 |  3 |
    |  4 |  5 |  6 |  7 |
    |  8 |  9 | 10 | 11 |
    | 12 | 13 | 14 | 15 |
    | 16 | 17 | 18 | 19 |
    | 20 | 21 | 22 | 23 |
    +-------------------+
    +-------------------+
  2 |  0 |  1 |  2 |  3 |
    |  4 |  5 |  6 |  7 |
    |  8 |  9 | 10 | 11 |
    | 12 | 13 | 14 | 15 |
    | 16 | 17 | 18 | 19 |
    | 20 | 21 | 22 | 23 |
    +-------------------+
    +-------------------+
  3 |  0 |  1 |  2 |  3 |
    |  4 |  5 |  6 |  7 |
    |  8 |  9 | 10 | 11 |
    | 12 | 13 | 14 | 15 |
    | 16 | 17 | 18 | 19 |
    | 20 | 21 | 22 | 23 |
    +-------------------+
    +-------------------+
  4 |  0 |  1 |  2 |  3 |
    |  4 |  5 |  6 |  7 |
    |  8 |  9 | 10 | 11 |
    | 12 | 13 | 14 | 15 |
    | 16 | 17 | 18 | 19 |
    | 20 | 21 | 22 | 23 |
    +-------------------+
    +-------------------+
  5 |  0 |  1 |  2 |  3 |
    |  4 |  5 |  6 |  7 |
    |  8 |  9 | 10 | 11 |
    | 12 | 13 | 14 | 15 |
    | 16 | 17 | 18 | 19 |
    | 20 | 21 | 22 | 23 |
    +-------------------+
    +-------------------+
  6 |  0 |  1 |  2 |  3 |
    |  4 |  5 |  6 |  7 |
    |  8 |  9 | 10 | 11 |
    | 12 | 13 | 14 | 15 |
    | 16 | 17 | 18 | 19 |
    | 20 | 21 | 22 | 23 |
    +-------------------+
ACP lists the shelves twice too:
robin> acpadmin list_all
IP             MAC                Reset  Last Contact   Protocol  Assigner    Shelf            Current  Inband   IOM
Address        Address            Cnt    (seconds ago)  Version   ACPA ID     S/N              State    ID       Type
---------------------------------------------------------------------------------------------------------------------
192.168.0.191  00:50:cc:76:ac:bf  000    362            1.1.1.31  2013884700  SHJHU000001030A  0x5      0b.06.B  IOM3
192.168.1.7    00:50:cc:76:ad:06  000    359            1.1.1.31  2013885201  SHJHU000001030A  0x5      0b.06.A  IOM3
192.168.1.8    00:50:cc:76:ad:07  000    370            1.1.1.31  2013885201  SHJHU000001030B  0x5      0b.05.A  IOM3
192.168.1.13   00:50:cc:76:ad:0d  000    358            1.1.1.31  2013884700  SHJHU000001030B  0x5      0b.05.B  IOM3
192.168.1.139  00:50:cc:75:15:8b  000    408            1.1.1.31  2013884700  SHX0971733H2BSA  0x5      0b.04.B  IOM3
192.168.1.145  00:50:cc:75:15:90  000    218            1.1.1.31  2013884700  SHX0971733H2BS8  0x5      0b.01.A  IOM3
192.168.1.147  00:50:cc:75:15:92  000    82             1.1.1.31  2013884700  SHX0971733H2BS8  0x5      0b.01.B  IOM3
192.168.1.151  00:50:cc:75:15:97  000    535            1.1.1.31  2013884700  SHX0971733H2BSA  0x5      0b.04.A  IOM3
192.168.1.155  00:50:cc:75:15:9a  000    600            1.1.1.31  2013884700  SHX0971733H2BSD  0x5      0b.02.B  IOM3
192.168.1.163  00:50:cc:75:15:a2  000    495            1.1.1.31  2013884700  SHX0971733H2BSD  0x5      0b.02.A  IOM3
192.168.3.171  00:50:cc:75:2f:ab  000    393            1.1.1.31  2013884700  SHX0971733H2BSF  0x5      0b.03.B  IOM3
192.168.3.189  00:50:cc:75:2f:bd  000    40             1.1.1.31  2013884700  SHX0971733H2BSF  0x5      0b.03.A  IOM3
The new shelves are 1030A and 1030B at the end.
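For anyone who wants to double-check a listing like this without counting by eye, here is a minimal Python sketch. It only groups the rows pasted above by shelf serial number and shelf ID and checks that both IOM slots (A and B) show up for each shelf. The function names `modules_per_shelf` and `missing_modules` are made up for this illustration and are not part of any NetApp tool; the field positions are assumed from the column layout in the pasted output.

```python
from collections import defaultdict

# Data rows copied from the acpadmin list_all output quoted in this thread.
ROWS = """\
192.168.0.191 00:50:cc:76:ac:bf 000 362 1.1.1.31 2013884700 SHJHU000001030A 0x5 0b.06.B IOM3
192.168.1.7 00:50:cc:76:ad:06 000 359 1.1.1.31 2013885201 SHJHU000001030A 0x5 0b.06.A IOM3
192.168.1.8 00:50:cc:76:ad:07 000 370 1.1.1.31 2013885201 SHJHU000001030B 0x5 0b.05.A IOM3
192.168.1.13 00:50:cc:76:ad:0d 000 358 1.1.1.31 2013884700 SHJHU000001030B 0x5 0b.05.B IOM3
192.168.1.139 00:50:cc:75:15:8b 000 408 1.1.1.31 2013884700 SHX0971733H2BSA 0x5 0b.04.B IOM3
192.168.1.145 00:50:cc:75:15:90 000 218 1.1.1.31 2013884700 SHX0971733H2BS8 0x5 0b.01.A IOM3
192.168.1.147 00:50:cc:75:15:92 000 82 1.1.1.31 2013884700 SHX0971733H2BS8 0x5 0b.01.B IOM3
192.168.1.151 00:50:cc:75:15:97 000 535 1.1.1.31 2013884700 SHX0971733H2BSA 0x5 0b.04.A IOM3
192.168.1.155 00:50:cc:75:15:9a 000 600 1.1.1.31 2013884700 SHX0971733H2BSD 0x5 0b.02.B IOM3
192.168.1.163 00:50:cc:75:15:a2 000 495 1.1.1.31 2013884700 SHX0971733H2BSD 0x5 0b.02.A IOM3
192.168.3.171 00:50:cc:75:2f:ab 000 393 1.1.1.31 2013884700 SHX0971733H2BSF 0x5 0b.03.B IOM3
192.168.3.189 00:50:cc:75:2f:bd 000 40 1.1.1.31 2013884700 SHX0971733H2BSF 0x5 0b.03.A IOM3
"""


def modules_per_shelf(text):
    """Map (shelf serial, shelf ID) to the set of IOM slots (A/B) listed for it."""
    shelves = defaultdict(set)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 10:  # skip blank or malformed lines
            continue
        # Assumed layout: field 7 is the shelf S/N, field 9 the inband ID
        # in the form <channel>.<shelf>.<slot>, e.g. 0b.06.B
        serial, inband_id = fields[6], fields[8]
        _channel, shelf_id, slot = inband_id.split(".")
        shelves[(serial, shelf_id)].add(slot)
    return dict(shelves)


def missing_modules(shelves):
    """Return the shelves for which IOM A or IOM B is absent from the listing."""
    return sorted(key for key, slots in shelves.items() if slots != {"A", "B"})


shelves = modules_per_shelf(ROWS)
for (serial, shelf_id), slots in sorted(shelves.items(), key=lambda kv: kv[0][1]):
    print(shelf_id, serial, sorted(slots))
print("shelves missing an IOM:", missing_modules(shelves))
```

This only confirms that every shelf reports both modules over ACP; it says nothing about the SAS data paths themselves.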
Sysconfig lists the shelves on both controllers like this:
0a:
    Shelf 1: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
    Shelf 2: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
    Shelf 3: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
    Shelf 4: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
    Shelf 5: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
    Shelf 6: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
0b:
    Shelf 1: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
    Shelf 2: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
    Shelf 3: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
    Shelf 4: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
    Shelf 5: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
    Shelf 6: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Since I do not have any disks assigned to aggregates right now, is there some kind of a „reboot shelf“ command as there was for DS14 shelves?
Bye,
Alexander Griesser System-Administrator
From: chaim.rieger@gmail.com [mailto:chaim.rieger@gmail.com] Sent: Sunday, 04 August 2013 04:38 To: tmac; Alexander Griesser Cc: toasters-bounces@teaparty.net; Toasters@teaparty.net Subject: Re: System status degraded after hot adding 2 DS4243 shelves
Are all the shelf IDs correct?
Sent via BlackBerry from T-Mobile
________________________________
From: tmac <tmacmd@gmail.com> Date: Sat, 3 Aug 2013 22:33:56 -0400 To: Alexander Griesser <ag@anexia.at> Cc: chaim.rieger@gmail.com; toasters-bounces@teaparty.net; Toasters@teaparty.net Subject: Re: System status degraded after hot adding 2 DS4243 shelves
If you can afford it, it might be worth a total system power cycle, i.e. shut down both heads and all disks. Wait 10-20 seconds. Power on the disks first, then the heads.
Sounds like something went screwy during the hot-add.
--tmac
Tim McCarthy Principal Consultant
On Sat, Aug 3, 2013 at 8:43 PM, Alexander Griesser <ag@anexia.at> wrote:
Actually, it says to reseat all 48; I just pasted the message for one. Sorry for not being precise about that.
[…]
Node: robin
Resource: Disk 0a.06.11
Severity: Major
Probable Cause: Disk 0a.06.11 does not have two paths to controller robin but the containing disk shelf 6 does have two paths. Disk 0a.06.11 may be faulty.
Possible Effect: Access to disk 0a.06.11 via controller robin will be lost with a single hardware component failure (e.g. cable, HBA, or IOM failure).
Corrective Actions:
1. Reseat disk 0a.06.11 following the rules in the Installation and Service Guide.
2. Wait six minutes for the alert condition to clear.
3. If reseating disk 0a.06.11 fails to clear the alert condition, replace disk 0a.06.11.
4. Wait six minutes for the alert condition to clear.
5. Contact support personnel if the alert persists.

Node: robin
Resource: Disk 0a.06.10
Severity: Major
Probable Cause: Disk 0a.06.10 does not have two paths to controller robin but the containing disk shelf 6 does have two paths. Disk 0a.06.10 may be faulty.
Possible Effect: Access to disk 0a.06.10 via controller robin will be lost with a single hardware component failure (e.g. cable, HBA, or IOM failure).
Corrective Actions:
1. Reseat disk 0a.06.10 following the rules in the Installation and Service Guide.
2. Wait six minutes for the alert condition to clear.
3. If reseating disk 0a.06.10 fails to clear the alert condition, replace disk 0a.06.10.
4. Wait six minutes for the alert condition to clear.
5. Contact support personnel if the alert persists.
48 entries were displayed.
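The count is worth a second look: 48 alerts is exactly 2 new shelves times 24 bays (the sasadmin shelf output in this thread shows bays 0 to 23 per shelf). A rough Python sketch of how one could tally the alerts per shelf from saved alert text follows. The helper names `alerts_per_shelf` and `fully_affected` and the constant `BAYS_PER_SHELF` are invented for this example, the sample text is abbreviated to two alerts, and the disk-name pattern <channel>.<shelf>.<bay> is assumed from names such as 0a.06.11.

```python
import re
from collections import Counter

# Assumption: 24 bays per DS4243 shelf, as shown by bays 0-23 in sasadmin shelf.
BAYS_PER_SHELF = 24

# Disk names in the alerts look like <channel>.<shelf>.<bay>, e.g. 0a.06.11
DISK_RE = re.compile(r"Resource: Disk (\w+)\.(\d+)\.(\d+)")


def alerts_per_shelf(alert_text):
    """Count alert entries per (channel, shelf ID) found in the alert text."""
    counts = Counter()
    for channel, shelf, _bay in DISK_RE.findall(alert_text):
        counts[(channel, int(shelf))] += 1
    return counts


def fully_affected(counts):
    """Return the shelves where every bay raised an alert."""
    return sorted(key for key, n in counts.items() if n == BAYS_PER_SHELF)


# Abbreviated sample: only the two alerts quoted in this thread.
sample = (
    "Node: robin Resource: Disk 0a.06.11 Severity: Major\n"
    "Node: robin Resource: Disk 0a.06.10 Severity: Major\n"
)
print(alerts_per_shelf(sample))
print(fully_affected(alerts_per_shelf(sample)))
```

If the full output shows every bay of both new shelves alerting on one controller only, a shared component on that controller's path (cable, port, IOM) seems a more plausible suspect than 48 individual disks, which is the direction the replies in this thread take.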
Hey,
this is the cabling as it looked earlier (with just the 4 shelves attached):
[Attached image: image001.png]
In sasadmin shelf, I saw 1,2,3,4 and 4,3,2,1 for the two SAS controllers on each head.
Then we disconnected the two long cables (red and green, to the left of the diagram) and connected them to the shelf with ID 6. Once that was done, I saw 1,2,3,4 and 6 for the two SAS controllers. Then I daisy-chained my way down from 6 to 5, which gave me 1,2,3,4 and 6,5 on both heads. After I connected shelf 4 to shelf 5, all shelves were shown twice in the sasadmin shelf output, but the order had changed, as you can see in my other e-mail. Just FWIW: I added another DS4243 to a different HA pair on the same day and there were no issues at all. It now also shows a strange order of the shelves in the sasadmin output, which it didn't before, but on that system everything is OK.
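To make the order comparison concrete, here is a small Python sketch. It assumes, purely as a heuristic, that in a cleanly cabled stack the two channels list the shelves in mirrored order, the way the 1,2,3,4 and 4,3,2,1 listing looked before the hot-add. The function names are made up for this example, and the "after" lists are the shelf orders from the sasadmin shelf output posted for robin (0a: 5,4,3,2,6,1 and 0b: 1,2,3,4,5,6).

```python
def is_mirrored(order_a, order_b):
    """True if channel A lists the shelves in exactly the reverse order of channel B."""
    return list(order_a) == list(reversed(order_b))


def first_mismatch(order_a, order_b):
    """Return (position, shelf on A, shelf expected from B) for the first deviation, else None."""
    for pos, (a, b) in enumerate(zip(order_a, reversed(order_b))):
        if a != b:
            return pos, a, b
    return None


# Orders reported before the hot-add (4 shelves).
before_a, before_b = [1, 2, 3, 4], [4, 3, 2, 1]
# Orders from the sasadmin shelf output on robin after the hot-add.
after_0a, after_0b = [5, 4, 3, 2, 6, 1], [1, 2, 3, 4, 5, 6]

print(is_mirrored(before_a, before_b))
print(is_mirrored(after_0a, after_0b))
print(first_mismatch(after_0a, after_0b))
```

Since the other HA pair also shows an unusual order while reporting healthy, a failed mirror check is only a hint to re-trace the cables, not proof of a cabling fault.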
This is what the cabling now looks like, according to the technician who was in the datacenter and physically installed the shelves:
[Attached image: image002.png]
Anything wrong with that or the way we integrated the new shelves?
Bye,
Alexander Griesser System-Administrator
From: Borzenkov, Andrey [mailto:andrey.borzenkov@ts.fujitsu.com] Sent: Sunday, 04 August 2013 17:17 To: Alexander Griesser Cc: chaim.rieger@gmail.com; tmac; toasters-bounces@teaparty.net; Toasters@teaparty.net Subject: Re: AW: System status degraded after hot adding 2 DS4243 shelves
Check cabling - it looks like two channels are cabled in different order.
Sent from iPhone
http://support.netapp.com/eservice/toolchest?toolid=619
You know, it might be a hell of an idea to download Config Advisor (formerly known as WireGauge). Run that and see what happens.
It will tell you if everything is hooked up properly, detect bad disks, etc.
VERY useful tool. I believe it must run on a WINDOZE platform though.
--tmac
Tim McCarthy Principal Consultant
On Sun, Aug 4, 2013 at 4:19 PM, Alexander Griesser ag@anexia.at wrote:
Hey,
this is the cabling as it looked earlier (with just the 4 shelves attached):
In sasadmin shelf, I saw 1,2,3,4 and 4,3,2,1 for the two SAS controllers on each head.
Then we disconnected the two long cables (red and green, to the left of the diagram) and connected them to the shelf with ID 6. Once that was done, I saw 1,2,3,4 and 6 for the two SAS controllers.
Then I daisy-chained my way down from 6 to 5, which gave me 1,2,3,4 and 6,5 on both heads. After I connected shelf 4 to shelf 5, all shelves were shown twice in the sasadmin shelf output, but the order had changed, as you can see in my other e-mail.
Just FWIW: I added another DS4243 to a different HA pair on the same day and there were no issues at all. It now also shows a strange order of the shelves in the sasadmin output, which it didn't before, but on that system everything is OK.
This is what the cabling now looks like, according to the technician who was in the datacenter and physically installed the shelves:
Anything wrong with that or the way we integrated the new shelves?
Bye,
Alexander Griesser System-Administrator
*Von:* Borzenkov, Andrey [mailto:andrey.borzenkov@ts.fujitsu.com] *Gesendet:* Sonntag, 04. August 2013 17:17 *An:* Alexander Griesser *Cc:* chaim.rieger@gmail.com; tmac; toasters-bounces@teaparty.net; Toasters@teaparty.net *Betreff:* Re: AW: System status degraded after hot adding 2 DS4243 shelves****
Check cabling - it looks like two channels are cabled in different order.
Отправлено с iPhone****
04.08.2013, в 15:45, "Alexander Griesser" ag@anexia.at написал(а):****
Yes, we checked them and they are all unique. The existing shelves had the ids 1 to 4, the new ones got 5 and 6.****
This is what `sasadmin shelf` shows on the controller with the problem:***
robin> sasadmin shelf
Expanders on channel 0a:
     +-------------------+
  5  |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
     |  8 |  9 | 10 | 11 |
     | 12 | 13 | 14 | 15 |
     | 16 | 17 | 18 | 19 |
     | 20 | 21 | 22 | 23 |
     +-------------------+
     +-------------------+
  4  |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
     |  8 |  9 | 10 | 11 |
     | 12 | 13 | 14 | 15 |
     | 16 | 17 | 18 | 19 |
     | 20 | 21 | 22 | 23 |
     +-------------------+
     +-------------------+
  3  |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
     |  8 |  9 | 10 | 11 |
     | 12 | 13 | 14 | 15 |
     | 16 | 17 | 18 | 19 |
     | 20 | 21 | 22 | 23 |
     +-------------------+
     +-------------------+
  2  |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
     |  8 |  9 | 10 | 11 |
     | 12 | 13 | 14 | 15 |
     | 16 | 17 | 18 | 19 |
     | 20 | 21 | 22 | 23 |
     +-------------------+
     +-------------------+
  6  |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
     |  8 |  9 | 10 | 11 |
     | 12 | 13 | 14 | 15 |
     | 16 | 17 | 18 | 19 |
     | 20 | 21 | 22 | 23 |
     +-------------------+
     +-------------------+
  1  |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
     |  8 |  9 | 10 | 11 |
     | 12 | 13 | 14 | 15 |
     | 16 | 17 | 18 | 19 |
     | 20 | 21 | 22 | 23 |
     +-------------------+
Expanders on channel 0b:
     +-------------------+
  1  |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
     |  8 |  9 | 10 | 11 |
     | 12 | 13 | 14 | 15 |
     | 16 | 17 | 18 | 19 |
     | 20 | 21 | 22 | 23 |
     +-------------------+
     +-------------------+
  2  |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
     |  8 |  9 | 10 | 11 |
     | 12 | 13 | 14 | 15 |
     | 16 | 17 | 18 | 19 |
     | 20 | 21 | 22 | 23 |
     +-------------------+
     +-------------------+
  3  |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
     |  8 |  9 | 10 | 11 |
     | 12 | 13 | 14 | 15 |
     | 16 | 17 | 18 | 19 |
     | 20 | 21 | 22 | 23 |
     +-------------------+
     +-------------------+
  4  |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
     |  8 |  9 | 10 | 11 |
     | 12 | 13 | 14 | 15 |
     | 16 | 17 | 18 | 19 |
     | 20 | 21 | 22 | 23 |
     +-------------------+
     +-------------------+
  5  |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
     |  8 |  9 | 10 | 11 |
     | 12 | 13 | 14 | 15 |
     | 16 | 17 | 18 | 19 |
     | 20 | 21 | 22 | 23 |
     +-------------------+
     +-------------------+
  6  |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
     |  8 |  9 | 10 | 11 |
     | 12 | 13 | 14 | 15 |
     | 16 | 17 | 18 | 19 |
     | 20 | 21 | 22 | 23 |
     +-------------------+
ACP lists the shelves twice too:
robin> acpadmin list_all
IP             MAC                Reset  Last Contact   Protocol  Assigner    Shelf            Current  Inband   IOM
Address        Address            Cnt    (seconds ago)  Version   ACPA ID     S/N              State    ID       Type
---------------------------------------------------------------------------------------------------------------------
192.168.0.191  00:50:cc:76:ac:bf  000    362            1.1.1.31  2013884700  SHJHU000001030A  0x5      0b.06.B  IOM3
192.168.1.7    00:50:cc:76:ad:06  000    359            1.1.1.31  2013885201  SHJHU000001030A  0x5      0b.06.A  IOM3
192.168.1.8    00:50:cc:76:ad:07  000    370            1.1.1.31  2013885201  SHJHU000001030B  0x5      0b.05.A  IOM3
192.168.1.13   00:50:cc:76:ad:0d  000    358            1.1.1.31  2013884700  SHJHU000001030B  0x5      0b.05.B  IOM3
192.168.1.139  00:50:cc:75:15:8b  000    408            1.1.1.31  2013884700  SHX0971733H2BSA  0x5      0b.04.B  IOM3
192.168.1.145  00:50:cc:75:15:90  000    218            1.1.1.31  2013884700  SHX0971733H2BS8  0x5      0b.01.A  IOM3
192.168.1.147  00:50:cc:75:15:92  000    82             1.1.1.31  2013884700  SHX0971733H2BS8  0x5      0b.01.B  IOM3
192.168.1.151  00:50:cc:75:15:97  000    535            1.1.1.31  2013884700  SHX0971733H2BSA  0x5      0b.04.A  IOM3
192.168.1.155  00:50:cc:75:15:9a  000    600            1.1.1.31  2013884700  SHX0971733H2BSD  0x5      0b.02.B  IOM3
192.168.1.163  00:50:cc:75:15:a2  000    495            1.1.1.31  2013884700  SHX0971733H2BSD  0x5      0b.02.A  IOM3
192.168.3.171  00:50:cc:75:2f:ab  000    393            1.1.1.31  2013884700  SHX0971733H2BSF  0x5      0b.03.B  IOM3
192.168.3.189  00:50:cc:75:2f:bd  000    40             1.1.1.31  2013884700  SHX0971733H2BSF  0x5      0b.03.A  IOM3
The new shelves are 1030A and 1030B at the end.
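As a rough cross-check of "ACP lists the shelves twice", the rows can be grouped by shelf serial number to confirm that every shelf reports both an IOM A and an IOM B. This is only a sketch: the rows are copied from the output in this mail, the parsing assumes the 10-column whitespace-separated layout seen there, and `modules_per_shelf` is a hypothetical helper name, not a NetApp tool.

```python
from collections import defaultdict

# Data rows copied from the `acpadmin list_all` output quoted in this thread.
ACP_ROWS = """\
192.168.0.191 00:50:cc:76:ac:bf 000 362 1.1.1.31 2013884700 SHJHU000001030A 0x5 0b.06.B IOM3
192.168.1.7 00:50:cc:76:ad:06 000 359 1.1.1.31 2013885201 SHJHU000001030A 0x5 0b.06.A IOM3
192.168.1.8 00:50:cc:76:ad:07 000 370 1.1.1.31 2013885201 SHJHU000001030B 0x5 0b.05.A IOM3
192.168.1.13 00:50:cc:76:ad:0d 000 358 1.1.1.31 2013884700 SHJHU000001030B 0x5 0b.05.B IOM3
192.168.1.139 00:50:cc:75:15:8b 000 408 1.1.1.31 2013884700 SHX0971733H2BSA 0x5 0b.04.B IOM3
192.168.1.145 00:50:cc:75:15:90 000 218 1.1.1.31 2013884700 SHX0971733H2BS8 0x5 0b.01.A IOM3
192.168.1.147 00:50:cc:75:15:92 000 82 1.1.1.31 2013884700 SHX0971733H2BS8 0x5 0b.01.B IOM3
192.168.1.151 00:50:cc:75:15:97 000 535 1.1.1.31 2013884700 SHX0971733H2BSA 0x5 0b.04.A IOM3
192.168.1.155 00:50:cc:75:15:9a 000 600 1.1.1.31 2013884700 SHX0971733H2BSD 0x5 0b.02.B IOM3
192.168.1.163 00:50:cc:75:15:a2 000 495 1.1.1.31 2013884700 SHX0971733H2BSD 0x5 0b.02.A IOM3
192.168.3.171 00:50:cc:75:2f:ab 000 393 1.1.1.31 2013884700 SHX0971733H2BSF 0x5 0b.03.B IOM3
192.168.3.189 00:50:cc:75:2f:bd 000 40 1.1.1.31 2013884700 SHX0971733H2BSF 0x5 0b.03.A IOM3
"""


def modules_per_shelf(text):
    """Map each shelf serial number to the set of IOM slots (A/B) that ACP reports."""
    shelves = defaultdict(set)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 10:  # skip anything that is not a 10-column data row
            continue
        serial, inband_id = fields[6], fields[8]
        # The in-band ID looks like "0b.06.B"; the last component is the IOM slot.
        shelves[serial].add(inband_id.rsplit(".", 1)[1])
    return dict(shelves)


for serial, slots in sorted(modules_per_shelf(ACP_ROWS).items()):
    status = "ok" if slots == {"A", "B"} else "INCOMPLETE"
    print(serial, sorted(slots), status)
```

With the rows above, every serial number ends up with both slots, which matches the observation that each shelf simply appears once per IOM.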
Sysconfig lists the shelves on both controllers like this:
0a:
Shelf 1: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 2: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 3: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 4: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 5: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 6: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
0b:
Shelf 1: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 2: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 3: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 4: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 5: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 6: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Since I do not have any disks assigned to aggregates right now, is there some kind of "reboot shelf" command, as there was for DS14 shelves?
Bye,
Alexander Griesser
System-Administrator
From: chaim.rieger@gmail.com [mailto:chaim.rieger@gmail.com]
Sent: Sunday, 4 August 2013 04:38
To: tmac; Alexander Griesser
Cc: toasters-bounces@teaparty.net; Toasters@teaparty.net
Subject: Re: System status degraded after hot adding 2 DS4243 shelves
Are all the shelf IDs correct?
Sent via BlackBerry from T-Mobile
From: tmac tmacmd@gmail.com
Date: Sat, 3 Aug 2013 22:33:56 -0400
To: Alexander Griesser ag@anexia.at
Cc: chaim.rieger@gmail.com; toasters-bounces@teaparty.net; Toasters@teaparty.net
Subject: Re: System status degraded after hot adding 2 DS4243 shelves
If you can afford it, it might be worth a total system power cycle.
I.e. shut down both heads and all disks, wait 10-20 seconds.
Power on disks, then heads.
Sounds like something went screwy during the hot-add.
--tmac
Tim McCarthy
Principal Consultant
On Sat, Aug 3, 2013 at 8:43 PM, Alexander Griesser ag@anexia.at wrote:
Actually, it says to reseat all 48. I just pasted the message for one of them, sorry for not being precise about that.
[…]
Node: robin
Resource: Disk 0a.06.11
Severity: Major
Probable Cause: Disk 0a.06.11 does not have two paths to controller
                robin but the containing disk shelf 6 does have two
                paths. Disk 0a.06.11 may be faulty.
Possible Effect: Access to disk 0a.06.11 via controller robin will be
                 lost with a single hardware component failure (e.g.
                 cable, HBA, or IOM failure).
Corrective Actions: 1. Reseat disk 0a.06.11 following the rules in the
                       Installation and Service Guide.
                    2. Wait six minutes for the alert condition to clear.
                    3. If reseating disk 0a.06.11 fails to clear the
                       alert condition, replace disk 0a.06.11.
                    4. Wait six minutes for the alert condition to clear.
                    5. Contact support personnel if the alert persists.

Node: robin
Resource: Disk 0a.06.10
Severity: Major
Probable Cause: Disk 0a.06.10 does not have two paths to controller
                robin but the containing disk shelf 6 does have two
                paths. Disk 0a.06.10 may be faulty.
Possible Effect: Access to disk 0a.06.10 via controller robin will be
                 lost with a single hardware component failure (e.g.
                 cable, HBA, or IOM failure).
Corrective Actions: 1. Reseat disk 0a.06.10 following the rules in the
                       Installation and Service Guide.
                    2. Wait six minutes for the alert condition to clear.
                    3. If reseating disk 0a.06.10 fails to clear the
                       alert condition, replace disk 0a.06.10.
                    4. Wait six minutes for the alert condition to clear.
                    5. Contact support personnel if the alert persists.

48 entries were displayed.
The one I quoted was the last one. The last two are shown above, and the message at the bottom (marked in red) indicates 48 entries of this type. When I scroll up in the output of the filer, there is an alert message for every one of the 48 new disks, and that's what puzzles me.
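To see at a glance whether the alerts cover whole shelves rather than a few individual disks, the `Resource:` lines can be counted per shelf. This is a minimal sketch that assumes the alert text has been saved as plain text in the layout quoted above; `alerts_per_shelf` is a hypothetical helper, and the sample holds only the two alerts shown in this mail.

```python
import re
from collections import Counter

# Matches lines such as "Resource: Disk 0a.06.11" (channel.shelf.bay).
RESOURCE_RE = re.compile(r"Resource:\s+Disk\s+(\w+)\.(\d+)\.(\d+)")


def alerts_per_shelf(alert_text):
    """Count alerts per (channel, shelf ID) found in the alert text."""
    counts = Counter()
    for channel, shelf, _bay in RESOURCE_RE.findall(alert_text):
        counts[(channel, int(shelf))] += 1
    return counts


sample = """\
Node: robin
Resource: Disk 0a.06.11
Severity: Major
Node: robin
Resource: Disk 0a.06.10
Severity: Major
"""

for (channel, shelf), count in sorted(alerts_per_shelf(sample).items()):
    print(f"channel {channel}, shelf {shelf}: {count} alert(s)")
```

If the full output really shows 24 alerts for each of the two new shelves, a path or cabling problem looks more plausible than 48 simultaneously faulty disks.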
Bye,
Alexander Griesser
System-Administrator
From: tmac [mailto:tmacmd@gmail.com]
Sent: Sunday, 4 August 2013 02:40
To: Alexander Griesser
Cc: chaim.rieger@gmail.com; toasters-bounces@teaparty.net; Toasters@teaparty.net
Subject: Re: System status degraded after hot adding 2 DS4243 shelves
It did not say all 48 disks... just this:
Reseat disk 0a.06.10 following
--tmac
Tim McCarthy
Principal Consultant
On Sat, Aug 3, 2013 at 8:15 PM, Alexander Griesser ag@anexia.at wrote:
All LEDs are green and all SAS ports show link. Even the controller LEDs are both green, which puzzled me a bit, since they should light up orange, shouldn't they?
Alexander Griesser System-Administrator
-----Original Message-----
From: chaim.rieger@gmail.com [mailto:chaim.rieger@gmail.com]
Sent: Sunday, 4 August 2013 02:12
To: Alexander Griesser; toasters-bounces@teaparty.net; Toasters@teaparty.net
Subject: Re: System status degraded after hot adding 2 DS4243 shelves
The return cable from the last shelf to your degraded filer is either bad or not connected properly, or the slot is bad.
Are all the LEDs green?
Sent via BlackBerry from T-Mobile
-----Original Message-----
From: Alexander Griesser ag@anexia.at
Sender: toasters-bounces@teaparty.net
Date: Sat, 3 Aug 2013 23:44:20
To: Toasters@teaparty.net
Subject: System status degraded after hot adding 2 DS4243 shelves
Toasters mailing list Toasters@teaparty.net http://www.teaparty.net/mailman/listinfo/toasters
Thanks, great idea. I already had that one installed but didn't think about running it here, for whatever reason :-/
This is the output:
[image: image003.png, not preserved in the archive]
The only thing it doesn't like in the summary at the bottom is:
[image: image004.png, not preserved in the archive]
This is something I really didn't know :-/ So I guess I would have to do some recabling, if that is possible. Can I just unplug 0b, for example, and plug it into 0c? Would that work, or would that kill everything?
Besides that, it even shows the disks as having multipath connections on the two shelves for which controller robin itself says that it does not have multipath...
Bye,
Alexander Griesser System-Administrator
From: tmac [mailto:tmacmd@gmail.com]
Sent: Sunday, 4 August 2013 22:32
To: Alexander Griesser
Cc: Borzenkov, Andrey; chaim.rieger@gmail.com; toasters-bounces@teaparty.net; Toasters@teaparty.net
Subject: Re: AW: System status degraded after hot adding 2 DS4243 shelves
http://support.netapp.com/eservice/toolchest?toolid=619
You know, it might be a hell of an idea to download the Config Advisor (formerly known as WireGauge). Run that and see what happens.
It will tell you if everything is hooked up properly, detect bad disks, etc.
VERY useful tool. I believe it must run on a WINDOZE platform though.
--tmac
Tim McCarthy
Principal Consultant
On Sun, Aug 4, 2013 at 4:19 PM, Alexander Griesser <ag@anexia.at> wrote:
Hey,
this is the cabling as it looked earlier (with just the 4 shelves attached):
[image: image005.png, not preserved in the archive]
In `sasadmin shelf`, I saw 1,2,3,4 and 4,3,2,1 for the two SAS controllers on each head.
Then we disconnected the two long cables (red and green, on the left of the diagram) and connected them to the shelf with ID 6. Once that was done, I was seeing 1,2,3,4 and 6 for the two SAS controllers.
Then I daisy-chained my way down from 6 to 5, which gave me 1,2,3,4 and 6,5 on both heads. After I connected shelf 4 to shelf 5, all shelves were shown twice in the `sasadmin shelf` output, but the order had changed, as you can see in my other e-mail.
Just FWIW: I added another DS4243 to a different HA pair on the same day and there were no issues at all. That system now also shows a strange order of the shelves in the sasadmin output, which it didn't do before, but on that system everything is OK.
This is what the cabling looks like now, according to the technician who was in the datacenter and physically installed the shelves:
[image: image006.png, not preserved in the archive]
Anything wrong with that or the way we integrated the new shelves?
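The "different order" hint can be checked mechanically. The sketch below assumes that, on a stack cabled the way it was before the hot-add, one channel lists the shelves in exactly the reverse order of the other (the 1,2,3,4 versus 4,3,2,1 pattern described above). Which of the two sequences belonged to 0a and which to 0b before the hot-add is an assumption, and `is_mirrored` is a hypothetical helper; the "after" lists are the shelf IDs from the `sasadmin shelf` output in the other e-mail.

```python
def is_mirrored(order_a, order_b):
    """Return True if channel B lists the shelves in exactly the reverse order of channel A."""
    return list(order_a) == list(reversed(order_b))


# Shelf IDs in the order `sasadmin shelf` listed them on robin.
# The 0a/0b assignment of the "before" lists is assumed for illustration.
before_hot_add = {"0a": [1, 2, 3, 4], "0b": [4, 3, 2, 1]}
after_hot_add = {"0a": [5, 4, 3, 2, 6, 1], "0b": [1, 2, 3, 4, 5, 6]}

for label, channels in (("before", before_hot_add), ("after", after_hot_add)):
    mirrored = is_mirrored(channels["0a"], channels["0b"])
    print(f"{label}: 0a={channels['0a']} 0b={channels['0b']} mirrored={mirrored}")
```

The "before" lists mirror each other, while the "after" lists do not, which at least fits the suggestion to re-check the cabling of the new shelves. Whether the listing order always reflects the physical chain is not confirmed in this thread.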
Bye,
Alexander Griesser System-Administrator
Hard to say. I would open a case to find out for sure.
At one point (and it may *still be true*) there were a few bugs in ONTAP such that if you disconnected a shelf/stack from the system and plugged it into another FC/SAS port, some not-so-nice things *could* happen before your next reboot.
Might want to find out if that is still true.
At any rate, 0a should not be on the same stack as 0b unless that is your only choice, i.e. you have a system with exactly two SAS/FC ports and they happen to occupy the same ASIC.
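The ASIC rule can be written down as a tiny check. This is only an illustration: the port-to-ASIC mapping below is hypothetical (only "0a and 0b share an ASIC" is taken from the remark above), so the real mapping has to come from the platform documentation or from Config Advisor, and `stack_ports_share_asic` is a made-up helper name.

```python
def stack_ports_share_asic(stack_ports, port_to_asic):
    """Return True if at least two ports cabled to the same stack sit on one ASIC."""
    asics = [port_to_asic[port] for port in stack_ports]
    return len(set(asics)) < len(asics)


# HYPOTHETICAL mapping for illustration only; look up the real one for your platform.
port_to_asic = {"0a": "asic0", "0b": "asic0", "0c": "asic1", "0d": "asic1"}

for ports in (["0a", "0b"], ["0a", "0c"]):
    shared = stack_ports_share_asic(ports, port_to_asic)
    verdict = "same ASIC, avoid if possible" if shared else "different ASICs"
    print(ports, verdict)
```

The point of the rule is that a single ASIC failure should not take out both paths to a stack.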
--tmac
*Tim McCarthy* *Principal Consultant*
From: Borzenkov, Andrey [mailto:andrey.borzenkov@ts.fujitsu.com]
Sent: Sunday, 4 August 2013 17:17
To: Alexander Griesser
Cc: chaim.rieger@gmail.com; tmac; toasters-bounces@teaparty.net; Toasters@teaparty.net
Subject: Re: AW: System status degraded after hot adding 2 DS4243 shelves
Check cabling - it looks like two channels are cabled in different order.
Sent from iPhone
On 04.08.2013, at 15:45, "Alexander Griesser" ag@anexia.at wrote:
Yes, we checked them and they are all unique. The existing shelves had the IDs 1 to 4, the new ones got 5 and 6.
This is what `sasadmin shelf` shows on the controller with the problem:
robin> sasadmin shelf
Expanders on channel 0a:
     +-------------------+
  5  |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
     |  8 |  9 | 10 | 11 |
     | 12 | 13 | 14 | 15 |
     | 16 | 17 | 18 | 19 |
     | 20 | 21 | 22 | 23 |
     +-------------------+
     +-------------------+
  4  |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
     |  8 |  9 | 10 | 11 |
     | 12 | 13 | 14 | 15 |
     | 16 | 17 | 18 | 19 |
     | 20 | 21 | 22 | 23 |
     +-------------------+
     +-------------------+
  3  |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
     |  8 |  9 | 10 | 11 |
     | 12 | 13 | 14 | 15 |
     | 16 | 17 | 18 | 19 |
     | 20 | 21 | 22 | 23 |
     +-------------------+
     +-------------------+
  2  |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
     |  8 |  9 | 10 | 11 |
     | 12 | 13 | 14 | 15 |
     | 16 | 17 | 18 | 19 |
     | 20 | 21 | 22 | 23 |
     +-------------------+
     +-------------------+
  6  |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
     |  8 |  9 | 10 | 11 |
     | 12 | 13 | 14 | 15 |
     | 16 | 17 | 18 | 19 |
     | 20 | 21 | 22 | 23 |
     +-------------------+
     +-------------------+
  1  |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
     |  8 |  9 | 10 | 11 |
     | 12 | 13 | 14 | 15 |
     | 16 | 17 | 18 | 19 |
     | 20 | 21 | 22 | 23 |
     +-------------------+
Expanders on channel 0b:
     +-------------------+
  1  |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
     |  8 |  9 | 10 | 11 |
     | 12 | 13 | 14 | 15 |
     | 16 | 17 | 18 | 19 |
     | 20 | 21 | 22 | 23 |
     +-------------------+
     +-------------------+
  2  |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
     |  8 |  9 | 10 | 11 |
     | 12 | 13 | 14 | 15 |
     | 16 | 17 | 18 | 19 |
     | 20 | 21 | 22 | 23 |
     +-------------------+
     +-------------------+
  3  |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
     |  8 |  9 | 10 | 11 |
     | 12 | 13 | 14 | 15 |
     | 16 | 17 | 18 | 19 |
     | 20 | 21 | 22 | 23 |
     +-------------------+
     +-------------------+
  4  |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
     |  8 |  9 | 10 | 11 |
     | 12 | 13 | 14 | 15 |
     | 16 | 17 | 18 | 19 |
     | 20 | 21 | 22 | 23 |
     +-------------------+
     +-------------------+
  5  |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
     |  8 |  9 | 10 | 11 |
     | 12 | 13 | 14 | 15 |
     | 16 | 17 | 18 | 19 |
     | 20 | 21 | 22 | 23 |
     +-------------------+
     +-------------------+
  6  |  0 |  1 |  2 |  3 |
     |  4 |  5 |  6 |  7 |
     |  8 |  9 | 10 | 11 |
     | 12 | 13 | 14 | 15 |
     | 16 | 17 | 18 | 19 |
     | 20 | 21 | 22 | 23 |
     +-------------------+
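The order comparison under discussion can be sketched in a few lines of Python. This is only an illustration: the shelf IDs are typed in by hand in the order the listing above shows them, the function name is made up, and the "mirrored order" rule is an assumption based on the 1,2,3,4 / 4,3,2,1 pattern seen before the hot-add. The listing order may not reflect the physical cabling order at all, so a mismatch is a hint to check the cabling, not proof of a fault.

```python
def channels_mirrored(order_first, order_second):
    """Return True if one channel lists the shelves in exactly the
    reverse order of the other channel."""
    return list(order_first) == list(reversed(order_second))


# Order reported before the hot-add, with only four shelves attached.
before_first = [1, 2, 3, 4]
before_second = [4, 3, 2, 1]

# Order reported on controller robin after adding shelves 5 and 6.
order_0a = [5, 4, 3, 2, 6, 1]
order_0b = [1, 2, 3, 4, 5, 6]

print("before hot-add mirrored:", channels_mirrored(before_first, before_second))
print("after hot-add mirrored: ", channels_mirrored(order_0a, order_0b))
```

With the values above, the first check reports a mirrored pair and the second does not.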
ACP lists the shelves twice too:
robin> acpadmin list_all
IP             MAC                Reset  Last Contact   Protocol  Assigner    Shelf            Current  Inband   IOM
Address        Address            Cnt    (seconds ago)  Version   ACPA ID     S/N              State    ID       Type
192.168.0.191  00:50:cc:76:ac:bf  000    362            1.1.1.31  2013884700  SHJHU000001030A  0x5      0b.06.B  IOM3
192.168.1.7    00:50:cc:76:ad:06  000    359            1.1.1.31  2013885201  SHJHU000001030A  0x5      0b.06.A  IOM3
192.168.1.8    00:50:cc:76:ad:07  000    370            1.1.1.31  2013885201  SHJHU000001030B  0x5      0b.05.A  IOM3
192.168.1.13   00:50:cc:76:ad:0d  000    358            1.1.1.31  2013884700  SHJHU000001030B  0x5      0b.05.B  IOM3
192.168.1.139  00:50:cc:75:15:8b  000    408            1.1.1.31  2013884700  SHX0971733H2BSA  0x5      0b.04.B  IOM3
192.168.1.145  00:50:cc:75:15:90  000    218            1.1.1.31  2013884700  SHX0971733H2BS8  0x5      0b.01.A  IOM3
192.168.1.147  00:50:cc:75:15:92  000    82             1.1.1.31  2013884700  SHX0971733H2BS8  0x5      0b.01.B  IOM3
192.168.1.151  00:50:cc:75:15:97  000    535            1.1.1.31  2013884700  SHX0971733H2BSA  0x5      0b.04.A  IOM3
192.168.1.155  00:50:cc:75:15:9a  000    600            1.1.1.31  2013884700  SHX0971733H2BSD  0x5      0b.02.B  IOM3
192.168.1.163  00:50:cc:75:15:a2  000    495            1.1.1.31  2013884700  SHX0971733H2BSD  0x5      0b.02.A  IOM3
192.168.3.171  00:50:cc:75:2f:ab  000    393            1.1.1.31  2013884700  SHX0971733H2BSF  0x5      0b.03.B  IOM3
192.168.3.189  00:50:cc:75:2f:bd  000    40             1.1.1.31  2013884700  SHX0971733H2BSF  0x5      0b.03.A  IOM3
The new shelves are 1030A and 1030B at the end.
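To make the listing easier to read, here is a small Python sketch that groups the rows by shelf serial number. The tuples are copied by hand from the ACP listing above; treating the last part of the inband ID as the IOM module letter is an assumption about the format, and the script makes no claim about what a differing assigner ID means.

```python
from collections import defaultdict

# (shelf serial, inband ID, assigner ACPA ID), copied from the listing above.
rows = [
    ("SHJHU000001030A", "0b.06.B", "2013884700"),
    ("SHJHU000001030A", "0b.06.A", "2013885201"),
    ("SHJHU000001030B", "0b.05.A", "2013885201"),
    ("SHJHU000001030B", "0b.05.B", "2013884700"),
    ("SHX0971733H2BSA", "0b.04.B", "2013884700"),
    ("SHX0971733H2BS8", "0b.01.A", "2013884700"),
    ("SHX0971733H2BS8", "0b.01.B", "2013884700"),
    ("SHX0971733H2BSA", "0b.04.A", "2013884700"),
    ("SHX0971733H2BSD", "0b.02.B", "2013884700"),
    ("SHX0971733H2BSD", "0b.02.A", "2013884700"),
    ("SHX0971733H2BSF", "0b.03.B", "2013884700"),
    ("SHX0971733H2BSF", "0b.03.A", "2013884700"),
]

modules_by_shelf = defaultdict(set)
assigners_by_shelf = defaultdict(set)
for serial, inband_id, assigner in rows:
    # Assumed format: "<channel>.<shelf>.<module>", e.g. "0b.06.B" -> "B".
    modules_by_shelf[serial].add(inband_id.rsplit(".", 1)[1])
    assigners_by_shelf[serial].add(assigner)

for serial in sorted(modules_by_shelf):
    print(serial, sorted(modules_by_shelf[serial]), sorted(assigners_by_shelf[serial]))

# Shelves whose two modules report different assigner IDs.
mixed_assigners = sorted(s for s, a in assigners_by_shelf.items() if len(a) > 1)
print("Shelves with more than one assigner ID:", mixed_assigners)
```

Grouped this way, every shelf shows exactly one A and one B module, so each shelf appearing twice corresponds to its two IOMs. Only the two new shelves show two different assigner IDs.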
Sysconfig lists the shelves on both controllers like this:
0a:
  Shelf 1: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
  Shelf 2: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
  Shelf 3: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
  Shelf 4: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
  Shelf 5: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
  Shelf 6: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
0b:
  Shelf 1: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
  Shelf 2: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
  Shelf 3: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
  Shelf 4: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
  Shelf 5: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
  Shelf 6: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Since I do not have any disks assigned to aggregates right now, is there some kind of "reboot shelf" command, as there was for DS14 shelves?
Bye,
Alexander Griesser
From: chaim.rieger@gmail.com [mailto:chaim.rieger@gmail.com]
Sent: Sunday, 04 August 2013 04:38 To: tmac; Alexander Griesser Cc: toasters-bounces@teaparty.net; Toasters@teaparty.net Subject: Re: System status degraded after hot adding 2 DS4243 shelves
Are all the shelf IDs correct?
Sent via BlackBerry from T-Mobile
From: tmac tmacmd@gmail.com
Date: Sat, 3 Aug 2013 22:33:56 -0400
To: Alexander Griesser ag@anexia.at
Cc: chaim.rieger@gmail.com; toasters-bounces@teaparty.net; Toasters@teaparty.net
Subject: Re: System status degraded after hot adding 2 DS4243 shelves
If you can afford it, it might be worth a total system power cycle,
i.e. shut down both heads and all disks, wait 10-20 seconds,
then power on the disks, then the heads.
Sounds like something went screwy during the hot-add.
--tmac
Tim McCarthy
Principal Consultant
On Sat, Aug 3, 2013 at 8:43 PM, Alexander Griesser ag@anexia.at wrote:
Actually, it says to reseat all 48; I just pasted the message for one, sorry for not being precise about that.
[…]
Node: robin
Resource: Disk 0a.06.11
Severity: Major
Probable Cause: Disk 0a.06.11 does not have two paths to controller
                robin but the containing disk shelf 6 does have two
                paths. Disk 0a.06.11 may be faulty.
Possible Effect: Access to disk 0a.06.11 via controller robin will be
                 lost with a single hardware component failure (e.g.
                 cable, HBA, or IOM failure).
Corrective Actions: 1. Reseat disk 0a.06.11 following the rules in the
                       Installation and Service Guide.
                    2. Wait six minutes for the alert condition to clear.
                    3. If reseating disk 0a.06.11 fails to clear the
                       alert condition, replace disk 0a.06.11.
                    4. Wait six minutes for the alert condition to clear.
                    5. Contact support personnel if the alert persists.

Node: robin
Resource: Disk 0a.06.10
Severity: Major
Probable Cause: Disk 0a.06.10 does not have two paths to controller
                robin but the containing disk shelf 6 does have two
                paths. Disk 0a.06.10 may be faulty.
Possible Effect: Access to disk 0a.06.10 via controller robin will be
                 lost with a single hardware component failure (e.g.
                 cable, HBA, or IOM failure).
Corrective Actions: 1. Reseat disk 0a.06.10 following the rules in the
                       Installation and Service Guide.
                    2. Wait six minutes for the alert condition to clear.
                    3. If reseating disk 0a.06.10 fails to clear the
                       alert condition, replace disk 0a.06.10.
                    4. Wait six minutes for the alert condition to clear.
                    5. Contact support personnel if the alert persists.

48 entries were displayed.
The one I quoted was the last one; the last two are shown above. The message at the bottom, now in red, indicates 48 entries of this type, and when I scroll up in the filer's output, there is an alert message for every one of the 48 new disks. That’s what puzzles me.
Bye,
Alexander Griesser
From: tmac [mailto:tmacmd@gmail.com] Sent: Sunday, 04 August 2013 02:40 To: Alexander Griesser Cc: chaim.rieger@gmail.com; toasters-bounces@teaparty.net; Toasters@teaparty.net
Subject: Re: System status degraded after hot adding 2 DS4243 shelves
It did not say all 48 disks... just this:
Reseat disk 0a.06.10 following
--tmac
On Sat, Aug 3, 2013 at 8:15 PM, Alexander Griesser ag@anexia.at wrote:
All LEDs are green, all SAS ports show link, and even the controller LEDs are both green, which puzzled me a bit, since they should light up orange, shouldn't they?
Alexander Griesser System-Administrator
-----Original Message----- From: chaim.rieger@gmail.com [mailto:chaim.rieger@gmail.com] Sent: Sunday, 04 August 2013 02:12 To: Alexander Griesser; toasters-bounces@teaparty.net; Toasters@teaparty.net Subject: Re: System status degraded after hot adding 2 DS4243 shelves
The return cable from the last shelf to your degraded filer is either bad or not connected properly, or the slot is bad.
Are all the LEDs green?
Sent via BlackBerry from T-Mobile
-----Original Message----- From: Alexander Griesser ag@anexia.at Sender: toasters-bounces@teaparty.net Date: Sat, 3 Aug 2013 23:44:20 To: Toasters@teaparty.net Subject: System status degraded after hot adding 2 DS4243 shelves
Toasters mailing list Toasters@teaparty.net http://www.teaparty.net/mailman/listinfo/toasters
OK, thanks, I’ll definitely talk to NetApp and will fix the cabling here. But since the cabling has been like this for a while, it should not be the reason for the problem I’m seeing now.
Although I’d love to know what the real problem behind the degraded system status is, I think I will just go ahead and do an ONTAP upgrade and also upgrade the system firmware and service processor image (as Config Advisor reminded me); maybe the error will be gone once both controllers have been rebooted.
Thanks for all your help so far, bye,
Alexander Griesser System-Administrator
From: tmac [mailto:tmacmd@gmail.com] Sent: Sunday, 04 August 2013 23:05 To: Alexander Griesser Cc: Borzenkov, Andrey; chaim.rieger@gmail.com; toasters-bounces@teaparty.net; Toasters@teaparty.net Subject: Re: AW: System status degraded after hot adding 2 DS4243 shelves
Hard to say. I would open a case to find out for sure.
At one point (and it may *still be true*) there were a few bugs in ONTAP such that if you disconnected a shelf/stack from the system and tried plugging into another fc/sas port, there would be (not-nice) things that *could* happen before your next reboot.
Might want to find out if that is still true.
At any rate, 0a should not be on the same stack as 0b unless that is your only choice. i.e. you have a system with exactly two sas/fc ports and they happen to occupy the same ASIC.
--tmac
On Sun, Aug 4, 2013 at 4:44 PM, Alexander Griesser <ag@anexia.at> wrote:
Thanks, great idea. I already had that one installed but didn’t think about running it here, for whatever reason :-/
This is the output:
[cid:image001.png@01CE9168.68EDE310]
The only thing it doesn’t like in the summary at the bottom is:
[cid:image002.png@01CE9168.68EDE310]
This is something I really didn’t know :-/ So I guess I will have to do some recabling, if that is possible. Can I just unplug 0b, for example, and plug it into 0c? Would that work, or would that kill everything?
But besides that, it even shows the disks on the two new shelves as having multipath connections, while the controller robin itself says that they do not have multipath...
Bye,
Alexander Griesser System-Administrator
From: tmac [mailto:tmacmd@gmail.com] Sent: Sunday, 04 August 2013 22:32 To: Alexander Griesser Cc: Borzenkov, Andrey; chaim.rieger@gmail.com; toasters-bounces@teaparty.net; Toasters@teaparty.net
Subject: Re: AW: System status degraded after hot adding 2 DS4243 shelves
http://support.netapp.com/eservice/toolchest?toolid=619
You know, it might be a hell-of-an-idea to download the Config Advisor (formerly known as WireGauge). Run that and see what happens.
It will tell you if everything is hooked up properly, detect bad disks, etc.
VERY useful tool. I believe it must run on a WINDOZE platform, though.
--tmac
I do not see anything wrong in your description.
You can verify the cabling (just to be absolutely sure) using the “sysconfig -a”, “sasadmin expander_map” and “environment” commands. All of them give you the serial numbers of the SAS cables plugged into each port, so you can trace them and build the exact topology. A SAS cable reports the same serial number on both ends.
Oh, I just realized that there was a known problem with SAS cables having non-unique serial numbers, so it is worth verifying. KB 2011310 explains how to check SAS connectivity.
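The tracing idea can be sketched in Python. This is only an illustration: the port names and cable serial numbers below are made up, the `trace_cables` helper is hypothetical, and the `(port, serial)` pairs are assumed to have been copied by hand from the command output. It pairs up ports that report the same cable serial and flags any serial not seen on exactly two ports, which is how a non-unique serial number would show up.

```python
from collections import defaultdict


def trace_cables(observations):
    """Group ports by the cable serial number they report.

    observations: iterable of (port_name, cable_serial) pairs.
    Returns (links, suspicious):
      links      maps serial -> the two ports joined by that cable
      suspicious maps serial -> ports, for serials not seen on exactly two ports
    """
    by_serial = defaultdict(list)
    for port, serial in observations:
        by_serial[serial].append(port)

    links, suspicious = {}, {}
    for serial, ports in by_serial.items():
        if len(ports) == 2:
            links[serial] = tuple(sorted(ports))
        else:
            suspicious[serial] = sorted(ports)
    return links, suspicious


# Hypothetical data, for illustration only.
observations = [
    ("robin:0a", "CABLE-001"),
    ("shelf1:IOM-A:square", "CABLE-001"),
    ("shelf1:IOM-A:circle", "CABLE-002"),
    ("shelf2:IOM-A:square", "CABLE-002"),
    ("shelf2:IOM-A:circle", "CABLE-003"),
    ("shelf3:IOM-A:square", "CABLE-003"),
    ("shelf3:IOM-A:circle", "CABLE-003"),  # same serial on a third port
]

links, suspicious = trace_cables(observations)
for serial, (end_a, end_b) in sorted(links.items()):
    print(f"{serial}: {end_a} <-> {end_b}")
for serial, ports in sorted(suspicious.items()):
    print(f"{serial}: seen on {len(ports)} ports, check manually: {ports}")
```

Any serial that ends up in `suspicious` cannot be traced from the serial number alone and has to be followed physically.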
From: Alexander Griesser [mailto:ag@anexia.at] Sent: Monday, August 05, 2013 12:20 AM To: Borzenkov, Andrey Cc: chaim.rieger@gmail.com; tmac; toasters-bounces@teaparty.net; Toasters@teaparty.net Subject: AW: AW: System status degraded after hot adding 2 DS4243 shelves
Hey,
this is the cabling as it looked earlier (with just the 4 shelves attached):
[cid:image001.png@01CE91BC.4AC6BA20]
In `sasadmin shelf`, I saw 1,2,3,4 and 4,3,2,1 for the two SAS controllers on each head.
Then we disconnected the two long cables (red and green, to the left of the scheme) and connected them to the shelf with ID 6. Once that was done, I was seeing 1,2,3,4 and 6 for the two SAS controllers. Then I daisy-chained my way down from 6 to 5, which gave me 1,2,3,4 and 6,5 on both heads. After I connected shelf 4 to shelf 5, all shelves were shown twice in the `sasadmin shelf` output, but the order had changed, as you can see in my other e-mail. Just FWIW: I added another DS4243 to a different HA pair on the same day and there were no issues at all. It now also shows a strange order of the shelves in the sasadmin output, which it didn’t before, but on that system everything is OK.
This is what the cabling now looks like, according to the technician who was in the datacenter and physically installed the shelves:
[cid:image002.png@01CE91BC.4AC6BA20]
Anything wrong with that or the way we integrated the new shelves?
Bye,
Alexander Griesser System-Administrator
From: Borzenkov, Andrey [mailto:andrey.borzenkov@ts.fujitsu.com] Sent: Sunday, 04 August 2013 17:17 To: Alexander Griesser Cc: chaim.rieger@gmail.com; tmac; toasters-bounces@teaparty.net; Toasters@teaparty.net Subject: Re: AW: System status degraded after hot adding 2 DS4243 shelves
Check the cabling - it looks like the two channels are cabled in a different order.
Sent from iPhone
On 04.08.2013, at 15:45, "Alexander Griesser" <ag@anexia.at> wrote:
Yes, we checked them and they are all unique. The existing shelves had the IDs 1 to 4, the new ones got 5 and 6.
This is what `sasadmin shelf` shows on the controller with the problem:
robin> sasadmin shelf

Expanders on channel 0a:

       +-------------------+
   5   |  0 |  1 |  2 |  3 |
       |  4 |  5 |  6 |  7 |
       |  8 |  9 | 10 | 11 |
       | 12 | 13 | 14 | 15 |
       | 16 | 17 | 18 | 19 |
       | 20 | 21 | 22 | 23 |
       +-------------------+

       +-------------------+
   4   |  0 |  1 |  2 |  3 |
       |  4 |  5 |  6 |  7 |
       |  8 |  9 | 10 | 11 |
       | 12 | 13 | 14 | 15 |
       | 16 | 17 | 18 | 19 |
       | 20 | 21 | 22 | 23 |
       +-------------------+

       +-------------------+
   3   |  0 |  1 |  2 |  3 |
       |  4 |  5 |  6 |  7 |
       |  8 |  9 | 10 | 11 |
       | 12 | 13 | 14 | 15 |
       | 16 | 17 | 18 | 19 |
       | 20 | 21 | 22 | 23 |
       +-------------------+

       +-------------------+
   2   |  0 |  1 |  2 |  3 |
       |  4 |  5 |  6 |  7 |
       |  8 |  9 | 10 | 11 |
       | 12 | 13 | 14 | 15 |
       | 16 | 17 | 18 | 19 |
       | 20 | 21 | 22 | 23 |
       +-------------------+

       +-------------------+
   6   |  0 |  1 |  2 |  3 |
       |  4 |  5 |  6 |  7 |
       |  8 |  9 | 10 | 11 |
       | 12 | 13 | 14 | 15 |
       | 16 | 17 | 18 | 19 |
       | 20 | 21 | 22 | 23 |
       +-------------------+

       +-------------------+
   1   |  0 |  1 |  2 |  3 |
       |  4 |  5 |  6 |  7 |
       |  8 |  9 | 10 | 11 |
       | 12 | 13 | 14 | 15 |
       | 16 | 17 | 18 | 19 |
       | 20 | 21 | 22 | 23 |
       +-------------------+

Expanders on channel 0b:

       +-------------------+
   1   |  0 |  1 |  2 |  3 |
       |  4 |  5 |  6 |  7 |
       |  8 |  9 | 10 | 11 |
       | 12 | 13 | 14 | 15 |
       | 16 | 17 | 18 | 19 |
       | 20 | 21 | 22 | 23 |
       +-------------------+

       +-------------------+
   2   |  0 |  1 |  2 |  3 |
       |  4 |  5 |  6 |  7 |
       |  8 |  9 | 10 | 11 |
       | 12 | 13 | 14 | 15 |
       | 16 | 17 | 18 | 19 |
       | 20 | 21 | 22 | 23 |
       +-------------------+

       +-------------------+
   3   |  0 |  1 |  2 |  3 |
       |  4 |  5 |  6 |  7 |
       |  8 |  9 | 10 | 11 |
       | 12 | 13 | 14 | 15 |
       | 16 | 17 | 18 | 19 |
       | 20 | 21 | 22 | 23 |
       +-------------------+

       +-------------------+
   4   |  0 |  1 |  2 |  3 |
       |  4 |  5 |  6 |  7 |
       |  8 |  9 | 10 | 11 |
       | 12 | 13 | 14 | 15 |
       | 16 | 17 | 18 | 19 |
       | 20 | 21 | 22 | 23 |
       +-------------------+

       +-------------------+
   5   |  0 |  1 |  2 |  3 |
       |  4 |  5 |  6 |  7 |
       |  8 |  9 | 10 | 11 |
       | 12 | 13 | 14 | 15 |
       | 16 | 17 | 18 | 19 |
       | 20 | 21 | 22 | 23 |
       +-------------------+

       +-------------------+
   6   |  0 |  1 |  2 |  3 |
       |  4 |  5 |  6 |  7 |
       |  8 |  9 | 10 | 11 |
       | 12 | 13 | 14 | 15 |
       | 16 | 17 | 18 | 19 |
       | 20 | 21 | 22 | 23 |
       +-------------------+
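The odd ordering in this output can be made explicit with a small comparison. This is only a sketch in Python and rests on one assumption: for a stack cabled from both ends, one channel should list the shelves in the reverse order of the other, as the 1,2,3,4 and 4,3,2,1 listing reported elsewhere in this thread suggests. `order_mismatches` is a hypothetical helper, and the shelf IDs are simply typed in from the output above.

```python
def order_mismatches(first_channel, second_channel):
    """Compare one channel's shelf order with the reverse of the other's.

    Returns a list of (position, seen, expected) tuples; an empty list means
    the two channels mirror each other. Lists of unequal length are only
    compared up to the shorter one.
    """
    expected = list(reversed(second_channel))
    return [
        (pos, seen, exp)
        for pos, (seen, exp) in enumerate(zip(first_channel, expected))
        if seen != exp
    ]


# Shelf IDs read top to bottom from the `sasadmin shelf` output quoted above.
chan_0a = [5, 4, 3, 2, 6, 1]
chan_0b = [1, 2, 3, 4, 5, 6]

for pos, seen, exp in order_mismatches(chan_0a, chan_0b):
    print(f"position {pos}: saw shelf {seen}, expected shelf {exp}")
```

A mismatch here only says that the displayed order is unusual; it does not by itself prove that a cable is plugged into the wrong port.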
ACP lists the shelves twice too:
robin> acpadmin list_all
IP             MAC                Reset  Last Contact   Protocol  Assigner    Shelf            Current  Inband   IOM
Address        Address            Cnt    (seconds ago)  Version   ACPA ID     S/N              State    ID       Type
---------------------------------------------------------------------------------------------------------------------
192.168.0.191  00:50:cc:76:ac:bf  000    362            1.1.1.31  2013884700  SHJHU000001030A  0x5      0b.06.B  IOM3
192.168.1.7    00:50:cc:76:ad:06  000    359            1.1.1.31  2013885201  SHJHU000001030A  0x5      0b.06.A  IOM3
192.168.1.8    00:50:cc:76:ad:07  000    370            1.1.1.31  2013885201  SHJHU000001030B  0x5      0b.05.A  IOM3
192.168.1.13   00:50:cc:76:ad:0d  000    358            1.1.1.31  2013884700  SHJHU000001030B  0x5      0b.05.B  IOM3
192.168.1.139  00:50:cc:75:15:8b  000    408            1.1.1.31  2013884700  SHX0971733H2BSA  0x5      0b.04.B  IOM3
192.168.1.145  00:50:cc:75:15:90  000    218            1.1.1.31  2013884700  SHX0971733H2BS8  0x5      0b.01.A  IOM3
192.168.1.147  00:50:cc:75:15:92  000    82             1.1.1.31  2013884700  SHX0971733H2BS8  0x5      0b.01.B  IOM3
192.168.1.151  00:50:cc:75:15:97  000    535            1.1.1.31  2013884700  SHX0971733H2BSA  0x5      0b.04.A  IOM3
192.168.1.155  00:50:cc:75:15:9a  000    600            1.1.1.31  2013884700  SHX0971733H2BSD  0x5      0b.02.B  IOM3
192.168.1.163  00:50:cc:75:15:a2  000    495            1.1.1.31  2013884700  SHX0971733H2BSD  0x5      0b.02.A  IOM3
192.168.3.171  00:50:cc:75:2f:ab  000    393            1.1.1.31  2013884700  SHX0971733H2BSF  0x5      0b.03.B  IOM3
192.168.3.189  00:50:cc:75:2f:bd  000    40             1.1.1.31  2013884700  SHX0971733H2BSF  0x5      0b.03.A  IOM3
The new shelves are 1030A and 1030B at the end.
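To check that each shelf really appears once per IOM, the rows can be grouped by shelf serial number. The following is a rough Python sketch, not an official tool: it assumes each data row has exactly ten whitespace-separated fields in the order of the header above, with the shelf serial in the seventh field and the inband ID (channel.shelf.module) in the ninth. Only the four rows for the new shelves are included, copied from the output above.

```python
from collections import defaultdict

# Rows for the two new shelves, copied from the acpadmin output above.
ROWS = """\
192.168.0.191 00:50:cc:76:ac:bf 000 362 1.1.1.31 2013884700 SHJHU000001030A 0x5 0b.06.B IOM3
192.168.1.7 00:50:cc:76:ad:06 000 359 1.1.1.31 2013885201 SHJHU000001030A 0x5 0b.06.A IOM3
192.168.1.8 00:50:cc:76:ad:07 000 370 1.1.1.31 2013885201 SHJHU000001030B 0x5 0b.05.A IOM3
192.168.1.13 00:50:cc:76:ad:0d 000 358 1.1.1.31 2013884700 SHJHU000001030B 0x5 0b.05.B IOM3
"""


def modules_by_shelf(text):
    """Map shelf serial number -> set of IOM letters seen for that shelf."""
    shelves = defaultdict(set)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 10:  # skip headers, separators and blank lines
            continue
        serial, inband_id = fields[6], fields[8]
        shelves[serial].add(inband_id.rsplit(".", 1)[1])
    return dict(shelves)


result = modules_by_shelf(ROWS)
for serial, modules in sorted(result.items()):
    status = "ok" if modules == {"A", "B"} else "check"
    print(serial, sorted(modules), status)
```

Seeing both A and B for every serial would be consistent with each shelf being listed twice simply because it has two IOMs.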
Sysconfig lists the shelves on both controllers like this:

0a:
Shelf 1: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 2: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 3: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 4: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 5: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 6: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152

0b:
Shelf 1: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 2: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 3: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 4: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 5: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 6: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Since I do not have any disks assigned to aggregates right now, is there some kind of a „reboot shelf“ command as there was for DS14 shelves?
Bye,
Alexander Griesser System-Administrator
From: chaim.rieger@gmail.com [mailto:chaim.rieger@gmail.com] Sent: Sunday, 04 August 2013 04:38 To: tmac; Alexander Griesser Cc: toasters-bounces@teaparty.net; Toasters@teaparty.net Subject: Re: System status degraded after hot adding 2 DS4243 shelves
Are all the shelf IDs correct?
Sent via BlackBerry from T-Mobile
________________________________
From: tmac <tmacmd@gmail.com> Date: Sat, 3 Aug 2013 22:33:56 -0400 To: Alexander Griesser <ag@anexia.at> Cc: chaim.rieger@gmail.com; toasters-bounces@teaparty.net; Toasters@teaparty.net Subject: Re: System status degraded after hot adding 2 DS4243 shelves
If you can afford it, a total system power cycle might be worth it, i.e. shut down both heads and all disks, wait 10-20 seconds, then power on the disks and after that the heads.
Sounds like something went screwy during the hot-add.
--tmac
Tim McCarthy Principal Consultant
On Sat, Aug 3, 2013 at 8:43 PM, Alexander Griesser <ag@anexia.at> wrote:
Actually, it says to reseat all 48; I just pasted the message for one, sorry for not being precise about that.
[…]
Node: robin
Resource: Disk 0a.06.11
Severity: Major
Probable Cause: Disk 0a.06.11 does not have two paths to controller robin but the containing disk shelf 6 does have two paths. Disk 0a.06.11 may be faulty.
Possible Effect: Access to disk 0a.06.11 via controller robin will be lost with a single hardware component failure (e.g. cable, HBA, or IOM failure).
Corrective Actions:
1. Reseat disk 0a.06.11 following the rules in the Installation and Service Guide.
2. Wait six minutes for the alert condition to clear.
3. If reseating disk 0a.06.11 fails to clear the alert condition, replace disk 0a.06.11.
4. Wait six minutes for the alert condition to clear.
5. Contact support personnel if the alert persists.

Node: robin
Resource: Disk 0a.06.10
Severity: Major
Probable Cause: Disk 0a.06.10 does not have two paths to controller robin but the containing disk shelf 6 does have two paths. Disk 0a.06.10 may be faulty.
Possible Effect: Access to disk 0a.06.10 via controller robin will be lost with a single hardware component failure (e.g. cable, HBA, or IOM failure).
Corrective Actions:
1. Reseat disk 0a.06.10 following the rules in the Installation and Service Guide.
2. Wait six minutes for the alert condition to clear.
3. If reseating disk 0a.06.10 fails to clear the alert condition, replace disk 0a.06.10.
4. Wait six minutes for the alert condition to clear.
5. Contact support personnel if the alert persists.
48 entries were displayed.
The one I quoted was the last one. The last two are shown above, and the message at the bottom indicates 48 entries of this type. When I scroll up in the filer's output, there is an alert message for each of the 48 new disks; that’s what puzzles me.
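One way to see whether the alerts cluster per shelf is to count them by the shelf part of the disk name. This is only a sketch in Python, assuming the disk names follow an adapter.shelf.bay pattern such as 0a.06.10; `alerts_per_shelf` is a hypothetical helper, and the sample holds just the two alert headers pasted above, not the full list of 48.

```python
import re
from collections import Counter

# Assumed pattern: "Resource: Disk <adapter>.<shelf>.<bay>", e.g. 0a.06.10
DISK_RE = re.compile(r"Resource: Disk (\w+)\.(\d+)\.(\d+)")


def alerts_per_shelf(text):
    """Count alert entries per (adapter, shelf ID) pair."""
    counts = Counter()
    for adapter, shelf, _bay in DISK_RE.findall(text):
        counts[(adapter, int(shelf))] += 1
    return counts


# Only the two alert headers quoted in this mail.
sample = (
    "Node: robin Resource: Disk 0a.06.11 Severity: Major\n"
    "Node: robin Resource: Disk 0a.06.10 Severity: Major\n"
)

for (adapter, shelf), count in sorted(alerts_per_shelf(sample).items()):
    print(f"adapter {adapter}, shelf {shelf}: {count} alert(s)")
```

If the full output showed an alert for every bay of both new shelves, that would point more towards something path-related on this controller than towards 48 individually faulty disks, although the count alone cannot prove it.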
Bye,
Alexander Griesser System-Administrator
From: tmac [mailto:tmacmd@gmail.com] Sent: Sunday, 04 August 2013 02:40 To: Alexander Griesser Cc: chaim.rieger@gmail.com; toasters-bounces@teaparty.net; Toasters@teaparty.net
Subject: Re: System status degraded after hot adding 2 DS4243 shelves
It did not say all 48 disks...just this:
Reseat disk 0a.06.10 following
--tmac
Tim McCarthy Principal Consultant
On Sat, Aug 3, 2013 at 8:15 PM, Alexander Griesser <ag@anexia.at> wrote:
All LEDs are green and all SAS ports show link. Even the controller LEDs are both green, which puzzled me a bit, since they should light up in orange, shouldn't they?
Alexander Griesser System-Administrator
-----Original Message----- From: chaim.rieger@gmail.com [mailto:chaim.rieger@gmail.com] Sent: Sunday, 04 August 2013 02:12 To: Alexander Griesser; toasters-bounces@teaparty.net; Toasters@teaparty.net Subject: Re: System status degraded after hot adding 2 DS4243 shelves
The return cable from the last shelf to your degraded filer is either bad or not connected properly, or the slot is bad.
Are all the LEDs green?
Sent via BlackBerry from T-Mobile
-----Original Message----- From: Alexander Griesser <ag@anexia.at> Sender: toasters-bounces@teaparty.net Date: Sat, 3 Aug 2013 23:44:20 To: Toasters@teaparty.net Subject: System status degraded after hot adding 2 DS4243 shelves
Hi Andrey,
thanks, sasadmin expander_map matches the cabling scheme I posted earlier, that is:
0a: Level 1, Shelf ID 6 down to Level 6, Shelf ID 1
0b: Level 1, Shelf ID 1 up to Level 6, Shelf ID 6
So that’s correct for both controllers. I’ll try the reboot in the next few days, combined with a system firmware and ONTAP upgrade, to see if the issue clears then.
Bye,
Alexander Griesser System-Administrator
From: Borzenkov, Andrey [mailto:andrey.borzenkov@ts.fujitsu.com] Sent: Monday, 05 August 2013 07:18 To: Alexander Griesser Cc: chaim.rieger@gmail.com; tmac; Toasters@teaparty.net Subject: RE: AW: System status degraded after hot adding 2 DS4243 shelves
I do not see anything wrong in your description.
You can verify the cabling (just to be absolutely sure) using the “sysconfig -a”, “sasadmin expander_map” and “environment” commands. All of them report the serial numbers of the SAS cables plugged into each port, so you can trace the cables and build the exact topology. A SAS cable reports the same serial number on both ends.
Oh, I just realized that there was a known problem with SAS cables having non-unique serial numbers, so it is worth verifying. KB 2011310 explains how to check SAS connectivity.
Hi all,
just wanted to let you know that after a non-disruptive upgrade to 8.1.3P1, the system status is OK on both controllers again, and the shelf IDs are in sequential order again in `sasadmin shelf`.
So problem resolved after a reboot, I guess.
Bye,
Alexander Griesser System-Administrator
From: Borzenkov, Andrey [mailto:andrey.borzenkov@ts.fujitsu.com] Sent: Monday, 05 August 2013 07:18 To: Alexander Griesser Cc: chaim.rieger@gmail.com; tmac; Toasters@teaparty.net Subject: RE: AW: System status degraded after hot adding 2 DS4243 shelves
I do not see anything wrong in your description.
You can verify the cabling (just to be absolutely sure) using the “sysconfig -a”, “sasadmin expander_map” and “environment” commands. All of them report the serial numbers of the SAS cables plugged into each port, so you can trace the cables and build the exact topology. A SAS cable reports the same serial number on both ends.
Oh, I just realized that there was a known problem with SAS cables having non-unique serial numbers, so it is worth verifying. KB 2011310 explains how to check SAS connectivity.
From: Alexander Griesser [mailto:ag@anexia.at] Sent: Monday, August 05, 2013 12:20 AM To: Borzenkov, Andrey Cc: chaim.rieger@gmail.commailto:chaim.rieger@gmail.com; tmac; toasters-bounces@teaparty.netmailto:toasters-bounces@teaparty.net; Toasters@teaparty.netmailto:Toasters@teaparty.net Subject: AW: AW: System status degraded after hot adding 2 DS4243 shelves
Hey,
this is the cabling as it looked earlier (with just the 4 shelves attached):
[cid:image001.png@01CE9D20.D5759410]
In sasadmin shelf, I saw 1,2,3,4 and 4,3,2,1 for the two SAS controllers on each head.
Then, we disconnected the two long cables (red and green to the left of the scheme) and added them to the shelf with ID 6, once that was done, I was seeing 1,2,3,4 and 6 for the two SAS controllers. Then I daisychained my way down from 6 to 5, which gave me 1,2,3,4 and 6,5 on both heads and after I connected shelf 4 with shelf 5, all shelves were shown in the sasadmin shelf output twice, but the order has changed then as you can see in my other e-mail. Just FWIW: I did add another DS4243 to a different HA pair on the same day and there were not issues at all, it also does now show a strange order of the shelfs in the sasadmin output, which it didn’t do before, but on this system, everything is OK.
This is what the cabling now looks like, according to the technician who was in the datacenter and physically installed the shelves:
[cid:image002.png@01CE9D20.D5759410]
Anything wrong with that or the way we integrated the new shelves?
Bye,
Alexander Griesser System-Administrator
ANEXIA Internetdienstleistungs GmbH
Telefon: +43-463-208501-320 Telefax: +43-463-208501-500
E-Mail: ag@anexia.at Web: http://www.anexia.at
Anschrift Hauptsitz Klagenfurt: Feldkirchnerstraße 140, 9020 Klagenfurt Geschäftsführer: Alexander Windbichler Firmenbuch: FN 289918a | Gerichtsstand: Klagenfurt | UID-Nummer: AT U63216601
From: Borzenkov, Andrey [mailto:andrey.borzenkov@ts.fujitsu.com] Sent: Sunday, 04 August 2013 17:17 To: Alexander Griesser Cc: chaim.rieger@gmail.com; tmac; toasters-bounces@teaparty.net; Toasters@teaparty.net Subject: Re: AW: System status degraded after hot adding 2 DS4243 shelves
Check cabling - it looks like two channels are cabled in different order.
Sent from iPhone
On 04.08.2013 at 15:45, "Alexander Griesser" <ag@anexia.at> wrote:
Yes, we checked them and they are all unique. The existing shelves had the IDs 1 to 4, the new ones got 5 and 6.
This is what `sasadmin shelf` shows on the controller with the problem:
robin> sasadmin shelf
Expanders on channel 0a:
   +-------------------+
 5 |  0 |  1 |  2 |  3 |
   |  4 |  5 |  6 |  7 |
   |  8 |  9 | 10 | 11 |
   | 12 | 13 | 14 | 15 |
   | 16 | 17 | 18 | 19 |
   | 20 | 21 | 22 | 23 |
   +-------------------+

   +-------------------+
 4 |  0 |  1 |  2 |  3 |
   |  4 |  5 |  6 |  7 |
   |  8 |  9 | 10 | 11 |
   | 12 | 13 | 14 | 15 |
   | 16 | 17 | 18 | 19 |
   | 20 | 21 | 22 | 23 |
   +-------------------+

   +-------------------+
 3 |  0 |  1 |  2 |  3 |
   |  4 |  5 |  6 |  7 |
   |  8 |  9 | 10 | 11 |
   | 12 | 13 | 14 | 15 |
   | 16 | 17 | 18 | 19 |
   | 20 | 21 | 22 | 23 |
   +-------------------+

   +-------------------+
 2 |  0 |  1 |  2 |  3 |
   |  4 |  5 |  6 |  7 |
   |  8 |  9 | 10 | 11 |
   | 12 | 13 | 14 | 15 |
   | 16 | 17 | 18 | 19 |
   | 20 | 21 | 22 | 23 |
   +-------------------+

   +-------------------+
 6 |  0 |  1 |  2 |  3 |
   |  4 |  5 |  6 |  7 |
   |  8 |  9 | 10 | 11 |
   | 12 | 13 | 14 | 15 |
   | 16 | 17 | 18 | 19 |
   | 20 | 21 | 22 | 23 |
   +-------------------+

   +-------------------+
 1 |  0 |  1 |  2 |  3 |
   |  4 |  5 |  6 |  7 |
   |  8 |  9 | 10 | 11 |
   | 12 | 13 | 14 | 15 |
   | 16 | 17 | 18 | 19 |
   | 20 | 21 | 22 | 23 |
   +-------------------+

Expanders on channel 0b:
   +-------------------+
 1 |  0 |  1 |  2 |  3 |
   |  4 |  5 |  6 |  7 |
   |  8 |  9 | 10 | 11 |
   | 12 | 13 | 14 | 15 |
   | 16 | 17 | 18 | 19 |
   | 20 | 21 | 22 | 23 |
   +-------------------+

   +-------------------+
 2 |  0 |  1 |  2 |  3 |
   |  4 |  5 |  6 |  7 |
   |  8 |  9 | 10 | 11 |
   | 12 | 13 | 14 | 15 |
   | 16 | 17 | 18 | 19 |
   | 20 | 21 | 22 | 23 |
   +-------------------+

   +-------------------+
 3 |  0 |  1 |  2 |  3 |
   |  4 |  5 |  6 |  7 |
   |  8 |  9 | 10 | 11 |
   | 12 | 13 | 14 | 15 |
   | 16 | 17 | 18 | 19 |
   | 20 | 21 | 22 | 23 |
   +-------------------+

   +-------------------+
 4 |  0 |  1 |  2 |  3 |
   |  4 |  5 |  6 |  7 |
   |  8 |  9 | 10 | 11 |
   | 12 | 13 | 14 | 15 |
   | 16 | 17 | 18 | 19 |
   | 20 | 21 | 22 | 23 |
   +-------------------+

   +-------------------+
 5 |  0 |  1 |  2 |  3 |
   |  4 |  5 |  6 |  7 |
   |  8 |  9 | 10 | 11 |
   | 12 | 13 | 14 | 15 |
   | 16 | 17 | 18 | 19 |
   | 20 | 21 | 22 | 23 |
   +-------------------+

   +-------------------+
 6 |  0 |  1 |  2 |  3 |
   |  4 |  5 |  6 |  7 |
   |  8 |  9 | 10 | 11 |
   | 12 | 13 | 14 | 15 |
   | 16 | 17 | 18 | 19 |
   | 20 | 21 | 22 | 23 |
   +-------------------+
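As a rough way to compare the two listings, the shelf order per channel can be checked in a few lines of Python. This is only a sketch: it assumes that a consistently daisy-chained stack appears on the second channel in exactly the reverse order of the first (as with the 1,2,3,4 and 4,3,2,1 listing reported for the original four shelves in this thread). That is an assumption taken from the thread, not a documented rule, and the function name is made up.

```python
def compare_channel_order(chan_a, chan_b):
    """Return (position, seen, expected) for every position where chan_b
    differs from the reverse of chan_a. An empty list means the two
    channels mirror each other."""
    if sorted(chan_a) != sorted(chan_b):
        raise ValueError("channels do not list the same set of shelves")
    expected_b = list(reversed(chan_a))
    return [
        (pos, seen, expected)
        for pos, (seen, expected) in enumerate(zip(chan_b, expected_b))
        if seen != expected
    ]


# Shelf IDs in the order they appear in the sasadmin shelf output above.
order_0a = [5, 4, 3, 2, 6, 1]
order_0b = [1, 2, 3, 4, 5, 6]

mismatches = compare_channel_order(order_0a, order_0b)
for pos, seen, expected in mismatches:
    print(f"position {pos}: 0b shows shelf {seen}, mirror of 0a would be shelf {expected}")
```

Under that assumption, the original four-shelf layout produces no mismatches, while the current listing does not mirror cleanly.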
ACP lists the shelves twice too:
robin> acpadmin list_all
IP             MAC                Reset  Last Contact   Protocol  Assigner    Shelf            Current  Inband   IOM
Address        Address            Cnt    (seconds ago)  Version   ACPA ID     S/N              State    ID       Type
---------------------------------------------------------------------------------------------------------------------
192.168.0.191  00:50:cc:76:ac:bf  000    362            1.1.1.31  2013884700  SHJHU000001030A  0x5      0b.06.B  IOM3
192.168.1.7    00:50:cc:76:ad:06  000    359            1.1.1.31  2013885201  SHJHU000001030A  0x5      0b.06.A  IOM3
192.168.1.8    00:50:cc:76:ad:07  000    370            1.1.1.31  2013885201  SHJHU000001030B  0x5      0b.05.A  IOM3
192.168.1.13   00:50:cc:76:ad:0d  000    358            1.1.1.31  2013884700  SHJHU000001030B  0x5      0b.05.B  IOM3
192.168.1.139  00:50:cc:75:15:8b  000    408            1.1.1.31  2013884700  SHX0971733H2BSA  0x5      0b.04.B  IOM3
192.168.1.145  00:50:cc:75:15:90  000    218            1.1.1.31  2013884700  SHX0971733H2BS8  0x5      0b.01.A  IOM3
192.168.1.147  00:50:cc:75:15:92  000    82             1.1.1.31  2013884700  SHX0971733H2BS8  0x5      0b.01.B  IOM3
192.168.1.151  00:50:cc:75:15:97  000    535            1.1.1.31  2013884700  SHX0971733H2BSA  0x5      0b.04.A  IOM3
192.168.1.155  00:50:cc:75:15:9a  000    600            1.1.1.31  2013884700  SHX0971733H2BSD  0x5      0b.02.B  IOM3
192.168.1.163  00:50:cc:75:15:a2  000    495            1.1.1.31  2013884700  SHX0971733H2BSD  0x5      0b.02.A  IOM3
192.168.3.171  00:50:cc:75:2f:ab  000    393            1.1.1.31  2013884700  SHX0971733H2BSF  0x5      0b.03.B  IOM3
192.168.3.189  00:50:cc:75:2f:bd  000    40             1.1.1.31  2013884700  SHX0971733H2BSF  0x5      0b.03.A  IOM3
The new shelves are 1030A and 1030B at the end.
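The "listed twice" observation can be checked mechanically: each shelf serial should appear once per IOM (A and B). The Python sketch below assumes the pasted acpadmin output is whitespace-separated with ten columns per row, with the shelf serial in column 7 and the inband ID in column 9; that layout is inferred from the output in this mail, not from documentation, and the function name is made up. Only the four rows for the new shelves are included.

```python
from collections import defaultdict

# Rows for the two new shelves, copied from the acpadmin list_all output above.
ACP_ROWS = """\
192.168.0.191 00:50:cc:76:ac:bf 000 362 1.1.1.31 2013884700 SHJHU000001030A 0x5 0b.06.B IOM3
192.168.1.7 00:50:cc:76:ad:06 000 359 1.1.1.31 2013885201 SHJHU000001030A 0x5 0b.06.A IOM3
192.168.1.8 00:50:cc:76:ad:07 000 370 1.1.1.31 2013885201 SHJHU000001030B 0x5 0b.05.A IOM3
192.168.1.13 00:50:cc:76:ad:0d 000 358 1.1.1.31 2013884700 SHJHU000001030B 0x5 0b.05.B IOM3
"""


def modules_by_shelf(text):
    """Map each shelf serial number to the set of IOM letters seen for it."""
    shelves = defaultdict(set)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 10:
            continue  # skip headers, separators and blank lines
        serial, inband_id = fields[6], fields[8]
        # The inband ID looks like "0b.06.B"; the last part is the IOM letter.
        shelves[serial].add(inband_id.rsplit(".", 1)[-1])
    return dict(shelves)


shelves = modules_by_shelf(ACP_ROWS)
incomplete = {s: m for s, m in shelves.items() if m != {"A", "B"}}
print(shelves)
print("shelves missing an IOM:", incomplete)
```

For the rows shown, both new shelves report an A and a B module, so nothing is flagged by this particular check.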
Sysconfig lists the shelves on both controllers like this:
0a:
Shelf 1: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 2: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 3: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 4: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 5: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 6: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
0b:
Shelf 1: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 2: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 3: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 4: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 5: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Shelf 6: IOM3 Firmware rev. IOM3 A: 0152 IOM3 B: 0152
Since I do not have any disks assigned to aggregates right now, is there some kind of a „reboot shelf“ command as there was for DS14 shelves?
Bye,
Alexander Griesser System-Administrator
From: chaim.rieger@gmail.com Sent: Sunday, 04 August 2013 04:38 To: tmac; Alexander Griesser Cc: toasters-bounces@teaparty.net; Toasters@teaparty.net Subject: Re: System status degraded after hot adding 2 DS4243 shelves
Are all the shelf IDs correct? Sent via BlackBerry from T-Mobile
________________________________
From: tmac <tmacmd@gmail.com> Date: Sat, 3 Aug 2013 22:33:56 -0400 To: Alexander Griesser <ag@anexia.at> Cc: chaim.rieger@gmail.com; toasters-bounces@teaparty.net; Toasters@teaparty.net Subject: Re: System status degraded after hot adding 2 DS4243 shelves
If you can afford it, it might be worth a total system power cycle, i.e. shut down both heads and all disks, wait 10-20 seconds, then power on the disks first and the heads after that.
Sounds like something went screwy during the hot-add.
--tmac
Tim McCarthy Principal Consultant
On Sat, Aug 3, 2013 at 8:43 PM, Alexander Griesser <ag@anexia.at> wrote:
Actually, it says to reseat all 48; I just pasted the message for one, sorry for not being precise about that.
[…]
Node: robin
Resource: Disk 0a.06.11
Severity: Major
Probable Cause: Disk 0a.06.11 does not have two paths to controller robin but the containing disk shelf 6 does have two paths. Disk 0a.06.11 may be faulty.
Possible Effect: Access to disk 0a.06.11 via controller robin will be lost with a single hardware component failure (e.g. cable, HBA, or IOM failure).
Corrective Actions:
1. Reseat disk 0a.06.11 following the rules in the Installation and Service Guide.
2. Wait six minutes for the alert condition to clear.
3. If reseating disk 0a.06.11 fails to clear the alert condition, replace disk 0a.06.11.
4. Wait six minutes for the alert condition to clear.
5. Contact support personnel if the alert persists.

Node: robin
Resource: Disk 0a.06.10
Severity: Major
Probable Cause: Disk 0a.06.10 does not have two paths to controller robin but the containing disk shelf 6 does have two paths. Disk 0a.06.10 may be faulty.
Possible Effect: Access to disk 0a.06.10 via controller robin will be lost with a single hardware component failure (e.g. cable, HBA, or IOM failure).
Corrective Actions:
1. Reseat disk 0a.06.10 following the rules in the Installation and Service Guide.
2. Wait six minutes for the alert condition to clear.
3. If reseating disk 0a.06.10 fails to clear the alert condition, replace disk 0a.06.10.
4. Wait six minutes for the alert condition to clear.
5. Contact support personnel if the alert persists.
48 entries were displayed.
The one I quoted was the last one, the last two are shown above and the message at the bottom, now in red, indicates 48 entries of this type and when I scroll up in the output of the filer, it has an alert message for all of the 48 new disks, that’s what puzzles me.
Bye,
Alexander Griesser System-Administrator
From: tmac [mailto:tmacmd@gmail.com] Sent: Sunday, 04 August 2013 02:40 To: Alexander Griesser Cc: chaim.rieger@gmail.com; toasters-bounces@teaparty.net; Toasters@teaparty.net Subject: Re: System status degraded after hot adding 2 DS4243 shelves
It did not say all 48 disks...just this:
Reseat disk 0a.06.10 following
--tmac
On Sat, Aug 3, 2013 at 8:15 PM, Alexander Griesser <ag@anexia.at> wrote:
All LEDs green, all SAS ports show link, even the controller LEDs are both green which is what puzzled me a bit, since it should light up in orange, shouldn't it?
Alexander Griesser System-Administrator
-----Original Message----- From: chaim.rieger@gmail.com Sent: Sunday, 04 August 2013 02:12 To: Alexander Griesser; toasters-bounces@teaparty.net; Toasters@teaparty.net Subject: Re: System status degraded after hot adding 2 DS4243 shelves
The return cable from the last shelf to your degraded filer is either bad or not connected properly, or the slot is bad.
Are all the LEDs green? Sent via BlackBerry from T-Mobile
-----Original Message----- From: Alexander Griesser <ag@anexia.at> Sender: toasters-bounces@teaparty.net Date: Sat, 3 Aug 2013 23:44:20 To: Toasters@teaparty.net Subject: System status degraded after hot adding 2 DS4243 shelves
_______________________________________________ Toasters mailing list Toasters@teaparty.net http://www.teaparty.net/mailman/listinfo/toasters
That's the really last resort, which I hope I can work around...
Alexander Griesser System-Administrator
From: tmac [mailto:tmacmd@gmail.com] Sent: Sunday, 04 August 2013 04:34 To: Alexander Griesser Cc: chaim.rieger@gmail.com; toasters-bounces@teaparty.net; Toasters@teaparty.net Subject: Re: System status degraded after hot adding 2 DS4243 shelves
If you can afford it, it might be worth a total system power cycle, i.e. shut down both heads and all disks, wait 10-20 seconds, then power on the disks first and the heads after that.
Sounds like something went screwy during the hot-add.
--tmac
Did you do as the system asked? Reseat the disk, etc.?
--tmac
On Sat, Aug 3, 2013 at 7:44 PM, Alexander Griesser ag@anexia.at wrote:
Hey there,
the second question today, sorry for that :)
I did attach two DS4243 shelves to a FAS3240HA system today and followed all the instructions, the shelf firmware has been updated in this process and everything looks good as far as I can tell, I can see the 6 shelves now on both controllers, all shelves are multi path ready, disks are visible, etc.
One controller shows its health status as OK but the other one shows degraded:
robin> system health status show
Status
---------------
Degraded
The reason for that seems to be:
Node: robin
Resource: Disk 0a.06.10
Severity: Major
Probable Cause: Disk 0a.06.10 does not have two paths to controller robin but the containing disk shelf 6 does have two paths. Disk 0a.06.10 may be faulty.
Possible Effect: Access to disk 0a.06.10 via controller robin will be lost with a single hardware component failure (e.g. cable, HBA, or IOM failure).
Corrective Actions:
1. Reseat disk 0a.06.10 following the rules in the Installation and Service Guide.
2. Wait six minutes for the alert condition to clear.
3. If reseating disk 0a.06.10 fails to clear the alert condition, replace disk 0a.06.10.
4. Wait six minutes for the alert condition to clear.
5. Contact support personnel if the alert persists.
48 entries were displayed.
There are exactly 48 entries in the log for all the new 48 disks I added to the systems. If it would really be a disk or shelf fault, I guess I should see these errors on the second controller too, but I cannot see them there.
Any ideas what I could to now to diagnose this issue further? I think the usual problems like broken cables, etc. can be ruled out since the second controller is not showing these issues, right?
Thanks in advance,
bye,
Alexander Griesser
System-Administrator
ANEXIA Internetdienstleistungs GmbH
Telefon: +43-463-208501-320
Telefax: +43-463-208501-500
E-Mail: ag@anexia.at
Anschrift Hauptsitz Klagenfurt: Feldkirchnerstraße 140, 9020 Klagenfurt
Geschäftsführer: Alexander Windbichler
Firmenbuch: FN 289918a | Gerichtsstand: Klagenfurt | UID-Nummer: AT U63216601
No, I have not reseated all 48 disks yet. If the message had been for just one disk, I would of course have reseated it by now, but since the alert summary lists all 48 disks, I highly doubt that the problem is with the disks, or am I wrong about that?
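The reasoning above (every disk in a shelf flagged at once points away from the individual disks) can be sketched in Python. The sketch assumes disk names of the form port.shelf.bay, as in 0a.06.10, and 24 bays per shelf, as in the sasadmin shelf output quoted in this thread. The alert list is hypothetical: only 0a.06.10 and 0a.06.11 were actually pasted, and the shelf 5 names and the exact bay formatting are assumed for illustration.

```python
from collections import defaultdict

BAYS_PER_SHELF = 24  # bays 0-23 per shelf, as in the sasadmin shelf output


def alerted_bays_by_shelf(disk_names):
    """Map shelf ID -> set of bay numbers that have an alert."""
    shelves = defaultdict(set)
    for name in disk_names:
        _port, shelf, bay = name.split(".")
        shelves[int(shelf)].add(int(bay))
    return shelves


def fully_flagged_shelves(disk_names):
    """Shelves where every bay is flagged, which suggests a shared cause
    (cabling, IOM, path) rather than 24 individually faulty disks."""
    by_shelf = alerted_bays_by_shelf(disk_names)
    return sorted(s for s, bays in by_shelf.items() if len(bays) == BAYS_PER_SHELF)


# Hypothetical alert list covering both new shelves (names are assumed).
alerted = [f"0a.{shelf:02d}.{bay}" for shelf in (5, 6) for bay in range(BAYS_PER_SHELF)]

print(len(alerted), "alerts")
print("fully flagged shelves:", fully_flagged_shelves(alerted))
```

With only a handful of disks flagged in a shelf the list comes back empty, and reseating those disks would be the more plausible first step.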
Alexander Griesser System-Administrator
From: tmac [mailto:tmacmd@gmail.com] Sent: Sunday, 04 August 2013 02:13 To: Alexander Griesser Cc: Toasters@teaparty.net Subject: Re: System status degraded after hot adding 2 DS4243 shelves
Did you do as the system asked? Reseat the disk, etc.?
--tmac