Greetings,
My apologies for the long email but I'm trying to put in as many details as possible.
I have some DS14 shelves attached to some 6290s running 8.1.2P4 7-mode. I've run into an odd problem that I think will require a takeover/giveback, but wanted to see if I've missed anything. I'm pretty sure the problem is not the shelf IDs :-)
Controller A sees all disks and shelves through the loop just fine. Controller B sees both paths, but it shows only question marks for the shelf number and bay. All connections are on PCI HBAs; nothing is using the onboard ports.
Here is an example of the storage show disk -p command from Controller B:
9d.82 B 8d.82 A ? ?
Here is the same path on controller B using a sysconfig -a:
82 : 0.0GB 0B/sect (Startup failed.)
As you can see, it sees device 82 down both paths, but it can't identify the shelf or the drive bay. When I look on controller A, it has valid information. If I look at a disk from controller B, it won't show any information on the disk such as serial number, disk size, firmware, etc. Controller A sees everything just fine and does not have any "Startup failed" messages.
If I do a storage show disk -n, controller A sees 84 drives, but controller B only sees 4. Those 4 don't show correctly either. If I manually pull a drive and reseat it, it will show up as an unowned drive with no valid shelf or bay slot. It will show up correctly in the sysconfig -a output.
Here are things I've tried with no luck.
1. New FC ports on the filer side.
2. Reseating the shelf interface cards on both shelves 1 and 6 where the FC connections come in.
3. New fibre cable between shelf and filer.
4. Swapped controller A & B connection in shelf #1. Controller A sees everything correctly down the same path that controller B doesn't. Controller B doesn't see anything correctly down the path controller A was using and seeing everything.
5. Manually pulled a drive from a shelf and plugged it back in.
6. Verified I am not at any maximum number of drives or shelf count per cluster limit.
7. Verified both heads have the same shelf/drive firmware and qual_devices packages.
The only thing I can think of is that the registry or something similar on controller B has somehow become corrupted. Since I have a head in an odd condition for this loop of disks only, I'm a little reluctant to do a takeover and giveback. What if it comes back and doesn't recognize any of its drives :-)
Thanks,
Jeff
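The symptom described above (a disk visible down both paths but with "?" for shelf and bay) can be checked across a whole loop with a small script. This is only a minimal sketch, assuming the six-column row layout of the storage show disk -p sample quoted above (primary, port, secondary, port, shelf, bay). The names DiskPath, parse_path_line and unidentified are hypothetical helpers, and the second sample row is made up for illustration, not taken from the filer.

```python
from typing import Iterable, List, NamedTuple, Optional


class DiskPath(NamedTuple):
    primary: str
    primary_port: str
    secondary: str
    secondary_port: str
    shelf: Optional[int]  # None when the filer printed "?"
    bay: Optional[int]    # None when the filer printed "?"


def _to_int(field: str) -> Optional[int]:
    """Return the integer value, or None for "?" or anything non-numeric."""
    return int(field) if field.isdigit() else None


def parse_path_line(line: str) -> Optional[DiskPath]:
    """Parse one six-column data row; return None for headers or other lines."""
    fields = line.split()
    # Data rows start with an adapter.device name such as "9d.82".
    if len(fields) != 6 or "." not in fields[0]:
        return None
    return DiskPath(fields[0], fields[1], fields[2], fields[3],
                    _to_int(fields[4]), _to_int(fields[5]))


def unidentified(lines: Iterable[str]) -> List[DiskPath]:
    """Return the rows whose shelf or bay could not be identified."""
    rows = (parse_path_line(line) for line in lines)
    return [r for r in rows if r is not None and (r.shelf is None or r.bay is None)]


if __name__ == "__main__":
    sample = [
        "9d.82 B 8d.82 A ? ?",   # row quoted in the message above
        "8d.83 A 9d.83 B 5 3",   # hypothetical healthy row, for contrast only
    ]
    for row in unidentified(sample):
        print(f"{row.primary}: shelf/bay unknown (secondary path {row.secondary})")
```

Running the same check against the output captured from each controller would show at a glance which devices only one head fails to identify.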
Something else you could try,
storage disable adapter 8d (wait), then storage enable adapter 8d (wait about 10 seconds)
Check your disk output list again. If it works, do the other paths one at a time.
Sounds odd.
--tmac
*Tim McCarthy* *Principal Consultant*
On Mon, Mar 9, 2015 at 7:41 PM, Jeff Cleverley <jeff.cleverley@avagotech.com> wrote:
-- Jeff Cleverley IT Engineer 4380 Ziegler Road Fort Collins, Colorado 80525 970-288-4611
Toasters mailing list Toasters@teaparty.net http://www.teaparty.net/mailman/listinfo/toasters
Tim,
I just tried it with no change in status. I had not thought of trying this since I plugged them into other HBA ports. I waited 30 seconds between the disable/enable and checking.
I forgot to mention there are errors in the messages log that don't help either. Here is an example from controller B:
Mon Mar 9 16:00:00 MDT [outlaw:monitor.shelf.accessError:CRITICAL]: Enclosure services has detected an error in access to shelves on channel 9d.
These also follow all the other HBA connections as I've moved them around for testing. Controller A has no problems or errors with these, including when I swapped the A/B connections on shelf 1.
It keeps leading me back to controller B not liking these shelves for some reason.
Thanks,
Jeff
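Since the error follows whichever HBA port the loop is plugged into, it can help to tally these messages per channel. The following is a minimal sketch, assuming the messages-log layout of the single line quoted above (timestamp, then [host:event:SEVERITY]:, then free text). The function name parse_ems_line and the returned dictionary keys are hypothetical.

```python
import re
from typing import Dict, Optional

# [host:event.name:SEVERITY]: message text
EMS_LINE = re.compile(
    r"\[(?P<host>[^:\]]+):(?P<event>[^\]]+):(?P<severity>[A-Za-z]+)\]:\s*(?P<message>.*)"
)
# Channel name as it appears in the shelf access error text, e.g. "on channel 9d"
CHANNEL = re.compile(r"on channel (?P<channel>\w+)")


def parse_ems_line(line: str) -> Optional[Dict[str, Optional[str]]]:
    """Extract host, event, severity, message and channel from one log line."""
    match = EMS_LINE.search(line)
    if match is None:
        return None
    record: Dict[str, Optional[str]] = dict(match.groupdict())
    channel = CHANNEL.search(record["message"] or "")
    record["channel"] = channel.group("channel") if channel else None
    return record


if __name__ == "__main__":
    line = (
        "Mon Mar 9 16:00:00 MDT [outlaw:monitor.shelf.accessError:CRITICAL]: "
        "Enclosure services has detected an error in access to shelves on channel 9d."
    )
    print(parse_ems_line(line))
```

Counting the channel values over a day of logs would confirm whether the errors really move with the cabling rather than staying on one adapter.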
Yeah, I'd try the takeover/giveback, but I might even go so far as to actually power-cycle the head that is misbehaving.
--tmac
*Tim McCarthy* *Principal Consultant*
It could be that the disks have SCSI reservations set on them that prevent the partner from accessing them. At least, the symptoms match exactly. I believe there are diagnostic commands to display reservations, but I do not have them handy.
I would try to
a) Make sure you have up to date qual_devices on both nodes; update if needed
b) halt -f to prevent takeover - halt both partners
c) boot in maintenance mode on both
d) perform “storage release disks” on both nodes
e) reboot both nodes
What does NetApp support say?
From: toasters-bounces@teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of Jeff Cleverley Sent: Tuesday, March 10, 2015 2:42 AM To: Toasters@teaparty.net Subject: Shelf recognition problem with DS14s
On Mon, Mar 9, 2015 at 11:25 PM, Borzenkov, Andrei <andrei.borzenkov@ts.fujitsu.com> wrote:
> It could be that the disks have SCSI reservations set on them that prevent the partner from accessing them. At least, the symptoms match exactly. I believe there are diagnostic commands to display reservations, but I do not have them handy.
I'll look into it, but I doubt it. All of these shelves were on a different cluster until ~9 months ago. They were taken down and ownership removed in maintenance mode. There were 7 other loops with 6 shelves each and all of those have been fine. This one is fine for one filer but the other refuses to see them.
> I would try to
> a) Make sure you have up to date qual_devices on both nodes; update if needed
Already done.
> b) halt -f to prevent takeover - halt both partners
Not going to happen anytime soon. No downtime will be scheduled for several months at a minimum.
> c) boot in maintenance mode on both
> d) perform “storage release disks” on both nodes
> e) reboot both nodes
> What does NetApp support say?
Similar type of thing. They were focused more on the shelves. They wanted me to pull every controller on every shelf in the loop. I pulled shelf 1 and shelf 6 with no luck. Since the other controller sees the same shelves through the same shelf interconnect cables, it tells me those are not the problem.
In the meantime I've removed all disk ownership for every disk in that loop. It will have to wait until I can get a downtime.
Thanks,
Jeff
Sounds like you may be hitting http://mysupport.netapp.com/NOW/cgi-bin/bol?Type=Detail&Display=471753 (after mentioning “removing ownership in maintenance mode” …)
I don't believe this to be the case. Generally you will be able to see the drives but not be able to own them or assign ownership to them. In my case the disk show -v -z command looking for that controller only sees 5 disks in the entire loop.
Thanks,
Jeff