The NOW website recommends a raid group size of 12 when using dual parity on an R100 with 4 shelves of 12 disks each, presumably for performance reasons. However, a 12-disk group spread across 4 shelves must put at least three of its disks on one shelf, so even with dual parity, losing that shelf loses the entire raid group. In fact, if you build 4 groups of size 12 (well, OK, one of these would have 10, if you set aside two hot spares), then losing one shelf could mean losing all 4 raid groups!
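To make the counting explicit, here is a minimal sketch of the worst-case arithmetic. The function names are my own, and it assumes the only thing that matters is how many disks of one group share a shelf:

```python
import math

def min_worst_shelf_disks(group_size, shelves=4):
    """Best case for the most heavily loaded shelf: however the group is
    spread, some shelf holds at least ceil(group_size / shelves) disks."""
    return math.ceil(group_size / shelves)

def can_survive_shelf_loss(group_size, shelves=4, parity_disks=2):
    """True if some layout keeps every shelf within the parity budget."""
    return min_worst_shelf_disks(group_size, shelves) <= parity_disks

for size in (12, 8):
    print(size, min_worst_shelf_disks(size), can_survive_shelf_loss(size))
```

With 4 shelves, a group of 12 cannot get below three disks on its fullest shelf, while a group of 8 can be held to two.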
We've had to replace a shelf on the same R100 twice in the last two years. Thankfully, we did so prior to a complete failure. My point is that such a failure is not necessarily rare.
To guard against a shelf failure, it seems prudent to use a raid group size of 8, with DP, and lay out the disks so that no more than two disks of any one raid group are on the same shelf.
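As a sanity check that such a layout actually fits in 4 shelves of 12, here is a rough sketch that assigns disks greedily. It is only a counting model (the `layout` function and its parameters are hypothetical, not anything from Data ONTAP), and it assumes the last group simply takes whatever disks are left after the two spares:

```python
def layout(shelves=4, slots=12, group_size=8, spares=2, per_shelf_cap=2):
    """Return (groups, free): per-group disk counts on each shelf, and the
    slots left on each shelf for hot spares."""
    total = shelves * slots - spares
    sizes = [group_size] * (total // group_size)
    if total % group_size:
        sizes.append(total % group_size)  # smaller final group
    free = [slots] * shelves
    groups = []
    for size in sizes:
        counts = [0] * shelves
        for _ in range(size):
            # Shelves that still have room and are under the per-group cap.
            candidates = [s for s in range(shelves)
                          if counts[s] < per_shelf_cap and free[s] > 0]
            if not candidates:
                raise ValueError("no layout within the per-shelf cap")
            s = max(candidates, key=lambda i: free[i])
            counts[s] += 1
            free[s] -= 1
        groups.append(counts)
    return groups, free

groups, free = layout()
for g in groups:
    print(g)
print("spare slots per shelf:", free)
```

Under these assumptions you end up with five groups of 8 and one of 6, none with more than two disks on any shelf, and two slots left over for the spares. Asking for a group size of 12 with the same cap raises the error, which is the problem described above.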
But I'm hesitant to go against NetApp's recommendation, and concerned the performance hit will be too big. Currently, we're only serving home dirs via CIFS and NFS, with CPU usage floating between 25 and 50% most of the time. But we're considering configuring a LUN for use by an Exchange server. Plus, I'd like to move to 7G (currently at 6.4.5), and use a large aggregate, with flexvols, which will be yet another performance hit (due to the extra layer of software). Our R100 isn't disk bound right now, so I don't anticipate any performance wins from the extra spindles per volume. The raid group size of 8 leaves 4 fewer disks for data (in a maximum disk usage configuration, with 2 hot spares) than the layout with a raid group size of 12, which is palatable for our situation. And I like the idea of flexibly sized volumes.
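For anyone who wants to check the 4-disk figure, here is the arithmetic as a small sketch. It assumes 48 disks, 2 hot spares, 2 parity disks per group, and a short final group; `data_disks` is just a name I made up:

```python
def data_disks(total_disks=48, spares=2, group_size=12, parity_per_group=2):
    """Disks left for data after spares and per-group parity are removed."""
    usable = total_disks - spares
    groups = -(-usable // group_size)  # ceiling division: last group may be short
    return usable - groups * parity_per_group

print(data_disks(group_size=12))  # 4 groups
print(data_disks(group_size=8))   # 6 groups
print(data_disks(group_size=12) - data_disks(group_size=8))
```

That works out to 38 data disks with groups of 12 versus 34 with groups of 8.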
Comments?