RAID not reconstructing on FAS3050c - toasters

9 May 2013


Hello,

We've got a couple of FAS3050c filers with four DS14mk2 shelves full of
disks (each filer is connected to two shelves). They've mostly been
trouble-free, but this week a disk appears to have failed, and our main
storage aggregate went into "degraded" mode. Despite spare disks being
available, it is not reconstructing as I would expect.

The software running on these filers is Data ONTAP GX 10.0.1P2. From
previous discussions with the community, I've learned that GX has a
completely different command set, so many of the resources I've found via
Google aren't relevant. Adding to the difficulty, we don't have a support
contract on these filers (but they are properly licensed and what have you).

Here is the output of 'storage aggregate show -aggregate engdata1' (engdata1
is the degraded aggregate):

toast1a::> storage aggregate show -aggregate engdata1
          Aggregate: engdata1
          Size (MB): 0
     Used Size (MB): 0
    Used Percentage: -
Available Size (MB): 0
              State: restricted
              Nodes: toast1a
    Number Of Disks: 37
              Disks: toast1a:0a.16, toast1a:0b.32, toast1a:0c.48,
                     toast1a:0a.17, toast1a:0b.33, toast1a:0c.49,
                     toast1a:0a.18, toast1a:0b.34, toast1a:0c.50,
                     toast1a:0a.19, toast1a:0b.35, toast1a:0c.51,
                     toast1a:0a.20, toast1a:0d.64, toast1a:0b.37,
                     toast1a:0a.21, toast1a:0c.52, toast1a:0b.38,
                     toast1a:0a.22, toast1a:0c.61, toast1a:0b.39,
                     toast1a:0d.69, toast1a:0c.54, toast1a:0b.40,
                     toast1a:0a.24, toast1a:0c.55, toast1a:0d.65,
                     toast1a:0a.25, toast1a:0a.26, toast1a:0b.42,
                     toast1a:0c.59, toast1a:0a.27, toast1a:0b.43,
                     toast1a:0a.28, toast1a:0b.45, toast1a:0d.68,
                     toast1a:0d.71
  Number Of Volumes: 0
             Plexes: /engdata1/plex0(online)
        RAID Groups: /engdata1/plex0/rg0, /engdata1/plex0/rg1,
                     /engdata1/plex0/rg2
          Raid Type: raid_dp
      Max RAID Size: 14
        RAID Status: raid_dp,degraded
   Checksum Enabled: true
    Checksum Status: active
     Checksum Style: block
       Inconsistent: true
       Volume Types: flex

There are spare disks available now, but there were none when the failure
occurred. I moved two spare disks to the right filer afterward, thinking
that would cause the aggregate to start reconstructing. Here is the output
of 'storage disk show -state spare':

toast1a::> storage disk show -state spare
Disk             UsedSize(MB) Shelf Bay State     RAID Type  Aggregate Owner
---------------- ------------ ----- --- --------- ---------- --------- --------
toast1a:0d.72    423090           4   8 spare     pending    -         toast1a
toast1a:0d.73    423090           4   9 spare     pending    -         toast1a
toast1b:0d.74    423090           4  10 spare     pending    -         toast1b
toast1b:0d.75    423090           4  11 spare     pending    -         toast1b
toast1b:0d.76    423090           4  12 spare     pending    -         toast1b
toast1b:0d.77    423090           4  13 spare     pending    -         toast1b
6 entries were displayed.
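
As a quick cross-check of the spare listing above, here is a small Python
sketch (not a NetApp tool) that tallies spares per owning node. The function
name spares_by_owner is made up for illustration, the rows are copied from
the output above with one disk per line, and the script assumes the owner is
the last column of each row.

```python
from collections import Counter

# Rows copied from the 'storage disk show -state spare' output,
# one disk per line (assumption: the owner is the last column).
SPARE_OUTPUT = """\
toast1a:0d.72    423090           4   8 spare     pending    -         toast1a
toast1a:0d.73    423090           4   9 spare     pending    -         toast1a
toast1b:0d.74    423090           4  10 spare     pending    -         toast1b
toast1b:0d.75    423090           4  11 spare     pending    -         toast1b
toast1b:0d.76    423090           4  12 spare     pending    -         toast1b
toast1b:0d.77    423090           4  13 spare     pending    -         toast1b
"""


def spares_by_owner(text):
    """Count spare disks per owning node (hypothetical helper)."""
    counts = Counter()
    for line in text.splitlines():
        fields = line.split()
        # A data row has 8 columns, a node:disk name first, and state 'spare'.
        if len(fields) == 8 and ":" in fields[0] and fields[4] == "spare":
            counts[fields[-1]] += 1
    return counts


if __name__ == "__main__":
    for owner, count in sorted(spares_by_owner(SPARE_OUTPUT).items()):
        print(f"{owner}: {count} spare(s)")
```

Against the listing above, this counts two spares owned by toast1a (the two
disks moved over after the failure) and four owned by toast1b.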

Can anyone provide insight on this problem? Why is the aggregate not
reconstructing when there are spares available? NetApp stuff is not my
specialty, but I'm the one who gets to deal with it, and I am pretty
stumped. Thank you in advance!

--
Chris Daniel