Interesting. I got a different answer from my EMC engineering people. The following is a quote:
"Stripes can exist either within the Sym ("meta-volume") or outside the Sym using OS (or volume management) tools.
Within the Sym, if a stripe is built across mirrored volumes, the stripe remains available provided that one does not lose both halves of the same mirror. For example, take four mirrored pairs (A/a, B/b, C/c, and D/d) with a stripe built across the four: up to four disks could be lost as long as one of each pair is still good. We will not build a meta-volume such that two or more parts reside on the same physical disk - this violates our configuration rules.
Outside the Sym the same holds true, provided that one does not build stripes with multiple pieces coming from the same physical disk.
The net is - as long as the underlying parity (mirror or otherwise) maintains the availability of the parts of a stripe, the whole stripe will remain available."
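To make the quoted rule concrete, here is a minimal sketch in Python (my own, with made-up names; nothing EMC-specific) of the availability test it describes: a stripe across mirrored pairs stays up as long as no pair loses both of its members.

    # Hypothetical model of the four mirrored pairs from the example above.
    pairs = {"A": {"A", "a"}, "B": {"B", "b"}, "C": {"C", "c"}, "D": {"D", "d"}}

    def stripe_available(failed):
        # The stripe survives if every pair still has at least one good member.
        return all(pair - failed for pair in pairs.values())

    print(stripe_available({"a", "B", "c", "D"}))  # True: four disks lost, one of each pair left
    print(stripe_available({"A", "a"}))            # False: both halves of the same mirror lost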
--srs
-----Original Message----- From: Steve Gremban [mailto:gremban@msp.sc.ti.com] Sent: Tuesday, April 04, 2000 11:04 AM To: toasters@mathworks.com Subject: Re: EMC Celerra vs NetApp Filer
On an EMC, if a disk goes out on one side of a mirror (we'll call it side A), all of side A goes offline, not just the bad drive. Any new writes will be seen on side B but not on the drives of side A, so you can't use the side A drives to recover from subsequent side B drive failures. My EMC SE said that only if both sides of the mirror have simultaneous drive failures, taking them both down at the same time, is there a chance of recovering, and that it would take a lot of work. (This assumes that the bad drives weren't mirrors of each other.)
Brian Tao wrote:
On Mon, 3 Apr 2000, Bruce Sterling Woodcock wrote:
I don't see how? When one mirror loses a disk, the whole thing is lost; you switch over to the other mirror, which has n-1 disks as the RAID 4 array. So your initial chance of failure is 2n-1 and then n-1 for a second failure, whereas RAID 4 is just n followed by n-1 for the second failure.
To be somewhat anal about it, the initial chance of failure of mirrored partitions is 2(n-1). :)
But you're right, you have a higher probability of being put at risk using a mirror (it just takes 1 disk going bad out of 2n-2 disks) than with RAID4 (1 disk bad out of n disks). Once that first disk is lost, the probability of a second failure is the same for both (n-1 disks exposed).
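For the curious, here is a rough Monte Carlo sketch (mine; the layout names are made up, and "side offline" is just my model of the EMC behavior described above) of the conditional odds of data loss given exactly two sequential disk failures:

    import random

    # 7 mirrored pairs, so 14 disks, each identified as (side, pair).
    n_pairs = 7
    disks = [(side, pair) for side in "AB" for pair in range(n_pairs)]

    def loses_data(scheme, first, second):
        if scheme == "side_offline":
            # First failure takes its whole side offline, so data is lost
            # unless the second failure lands on the already-offline side.
            return first[0] != second[0]
        if scheme == "per_pair":
            # RAID 1+0 style: data is lost only when both halves of the
            # same mirrored pair fail.
            return first[1] == second[1]
        raise ValueError(scheme)

    trials = 100_000
    for scheme in ("side_offline", "per_pair"):
        lost = sum(loses_data(scheme, *random.sample(disks, 2))
                   for _ in range(trials))
        print(f"{scheme}: P(loss | 2 failures) ~ {lost / trials:.3f}")
    # Plain RAID 4 always loses data on the second failure, so its
    # conditional probability is 1.0 by definition.

With 7 pairs this comes out near 49/91 (about 0.54) for the side-offline model and 7/91 (about 0.08) per-pair, which is roughly the gap being argued about here.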
If EMC had actually implemented their mirroring the way Brian Tao describes below, their mirroring would be much more reliable than RAID4.
I think you're talking about RAID 0+1 (taking a RAID-0 set and mirroring it on another RAID-0 set, which is just silly). RAID 1+0 does the mirroring first, then the striping/concatenation:
+----+----+     +--------------+     +------------+
| 1A | 1B |     | RAID-1 set 1 |     |            |
+----+----+     +--------------+     |            |
| 2A | 2B |     | RAID-1 set 2 |     |            |
+----+----+     +--------------+     |            |
| 3A | 3B |     | RAID-1 set 3 |     |            |
+----+----+     +--------------+     |            |
| 4A | 4B | --> | RAID-1 set 4 | --> |   RAID 0   |
+----+----+     +--------------+     |(7 "drives")|
| 5A | 5B |     | RAID-1 set 5 |     |            |
+----+----+     +--------------+     |            |
| 6A | 6B |     | RAID-1 set 6 |     |            |
+----+----+     +--------------+     |            |
| 7A | 7B |     | RAID-1 set 7 |     |            |
+----+----+     +--------------+     +------------+
 [etc...]
The A's and B's are drives of a mirrored pair. You could lose, say, drives 1A, 2A, 4B, 5B and 7A and still have a functional RAID 0, because no single mirror is completely broken. You could lose the entire shelf containing the A drives and not have an outage (something from which today's Netapps cannot automatically recover, clustering or no clustering). Having mirrored pairs in RAID 4 (i.e., having mirrored data and parity drives) and the ability to concurrently rebuild multiple broken mirrors to hot spares would really give Netapp a leg up on other NAS vendors, IMHO.
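As a quick sanity check, here is the same survival rule as the sketch near the top of the thread, applied to these examples (again my own toy code, not anything vendor-specific):

    # Seven hypothetical mirrored pairs: ("1A", "1B") ... ("7A", "7B").
    pairs = [(f"{i}A", f"{i}B") for i in range(1, 8)]

    def raid10_alive(failed):
        # Alive as long as no pair has lost both of its drives.
        return all(not (a in failed and b in failed) for a, b in pairs)

    print(raid10_alive({"1A", "2A", "4B", "5B", "7A"}))  # True: five drives down, every mirror still has one half
    print(raid10_alive({f"{i}A" for i in range(1, 8)}))  # True: the entire A shelf is gone
    print(raid10_alive({"4A", "4B"}))                    # False: mirror 4 is completely broken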
Oh, are the drives mirror images of each other too? I didn't realize that. So if Drive 1 in Mirror A fails, breaking the mirror, and then Drive 2 in Mirror B fails, Mirror B will be smart enough to switch over to Drive 2 in Mirror A?
As far as the RAID 0 is concerned, single drive failures within each mirrored pair do not result in that stripe being down, because the remaining half of the mirror is still online. If you lose, say, drives 4A and 4B in a RAID 1+0, then you're toast. In RAID 1+4 (or 1+5), you would still have the parity drive.
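A toy illustration of that last point (RAID 4 parity really is a plain XOR across the stripe, though the values and layout here are invented): if an entire mirrored pair disappears in RAID 1+4, the parity unit can still reconstruct the missing stripe unit.

    # Four data "drives" holding one 4-bit stripe unit each (values invented).
    data = [0b1010, 0b0110, 0b1111, 0b0001]

    parity = 0
    for unit in data:
        parity ^= unit            # RAID 4 parity is the XOR of the data units

    lost = 3                      # pretend both halves of drive 4's mirror died
    rebuilt = parity
    for i, unit in enumerate(data):
        if i != lost:
            rebuilt ^= unit       # XOR of parity and survivors recovers the unit

    assert rebuilt == data[lost]
    print(f"rebuilt {rebuilt:04b} == lost {data[lost]:04b}")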
-- Brian Tao (BT300, taob@risc.org) "Though this be madness, yet there is method in't"