On an EMC, if a disk goes out on one side of a mirror (we'll call it side A), all of side A goes offline, not just the bad drive. New writes then land on side B but not on the side A drives, so you can't use the side A drives to recover from a subsequent side B drive failure. My EMC SE said there is a chance of recovering only if drives on both sides of the mirror fail simultaneously, taking both sides down at the same time, and that it would take a lot of work. (This assumes the bad drives weren't mirrors of each other.)
Brian Tao wrote:
On Mon, 3 Apr 2000, Bruce Sterling Woodcock wrote:
I don't see how. When one mirror loses a disk, the whole thing is lost; you switch over to the other mirror, which has n-1 disks as the RAID 4 array. So your initial chance of failure is 2n-1 and then n-1 for a second failure, whereas RAID 4 is just n followed by n-1 for the second failure.
To be somewhat anal about it, the initial chance of failure of mirrored partitions is 2(n-1). :)
But you're right, you have a higher probability of being put at risk with a mirror (it just takes 1 disk going bad out of 2n-2 disks) than with RAID 4 (1 disk going bad out of n disks). Once that first disk is lost, the exposure to a second failure is the same for both (n-1 disks).
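The disk counts in this exchange can be tabulated with a short sketch. It assumes n is the size of a RAID 4 group (n-1 data disks plus one parity disk), that each side of the mirror holds n-1 disks with no parity, and that disks fail independently with the same probability p over some window. The function names and the values n = 8 and p = 0.01 are made up purely for illustration.

```python
def disks_at_risk(n):
    """Count the disks whose failure matters, for a RAID 4 group of n disks."""
    return {
        "mirror_first": 2 * (n - 1),  # a disk on either side of the mirror can fail
        "raid4_first": n,             # n-1 data disks plus 1 parity disk
        "mirror_second": n - 1,       # only the surviving side is left
        "raid4_second": n - 1,        # the remaining disks of the degraded group
    }


def p_any_failure(k, p):
    """Probability that at least one of k independent disks fails (assumed model)."""
    return 1 - (1 - p) ** k


counts = disks_at_risk(8)  # n = 8 is an arbitrary example size
for name, k in counts.items():
    print(name, k, round(p_any_failure(k, 0.01), 4))
```

Under those assumptions the mirror has more disks exposed to the first failure (2n-2 against n), and the two layouts are even once the first disk is gone.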
If EMC had actually implemented their mirroring the way Brian Tao describes below, it would be much more reliable than RAID 4.
I think you're talking about RAID 0+1 (taking a RAID-0 set and mirroring it on another RAID-0 set, which is just silly). RAID 1+0 does the mirroring first, then the striping/concatenation:
+----+----+     +--------------+     +------------+
| 1A | 1B |     | RAID-1 set 1 |     |            |
+----+----+     +--------------+     |            |
| 2A | 2B |     | RAID-1 set 2 |     |            |
+----+----+     +--------------+     |            |
| 3A | 3B |     | RAID-1 set 3 |     |            |
+----+----+     +--------------+     |            |
| 4A | 4B | --> | RAID-1 set 4 | --> |   RAID 0   |
+----+----+     +--------------+     |(7 "drives")|
| 5A | 5B |     | RAID-1 set 5 |     |            |
+----+----+     +--------------+     |            |
| 6A | 6B |     | RAID-1 set 6 |     |            |
+----+----+     +--------------+     |            |
| 7A | 7B |     | RAID-1 set 7 |     |            |
+----+----+     +--------------+     +------------+
[etc...]
The A's and B's are drives of a mirrored pair. You could lose, say, drives 1A, 2A, 4B, 5B and 7A and still have a functional RAID 0, because no single mirror is completely broken. You could lose the entire shelf containing the A drives and not have an outage (something from which today's Netapps cannot automatically recover, clustering or no clustering). Having mirrored pairs in RAID 4 (i.e., having mirrored data and parity drives) and the ability to concurrently rebuild multiple broken mirrors to hot spares would really give Netapp a leg up on other NAS vendors, IMHO.
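The survival rule described above (the RAID 0 stays up as long as no mirrored pair has lost both drives) can be sketched as a small check. The drive labels follow the diagram; the function name and the default of seven pairs are illustrative assumptions, not anything a real controller exposes.

```python
def raid10_online(failed, pairs=7):
    """Return True if every mirrored pair still has at least one good drive."""
    for i in range(1, pairs + 1):
        if f"{i}A" in failed and f"{i}B" in failed:
            return False  # both halves of this mirror are gone
    return True


# The five-drive loss from the text: no pair is completely broken.
print(raid10_online({"1A", "2A", "4B", "5B", "7A"}))  # True

# Losing the entire shelf of A drives.
print(raid10_online({f"{i}A" for i in range(1, 8)}))  # True

# Losing both halves of one pair.
print(raid10_online({"4A", "4B"}))  # False
```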
Oh, are the drives mirror images of each other too? I didn't realize that. So if Drive 1 in Mirror A fails, breaking the mirror, and then Drive 2 in Mirror B fails, Mirror B will be smart enough to switch over to Drive 2 in Mirror A?
As far as the RAID 0 is concerned, a single drive failure within a mirrored pair does not take that stripe down, because the remaining half of the mirror is still online. If you lose, say, drives 4A and 4B in a RAID 1+0, then you're toast. In RAID 1+4 (or 1+5), you would still have the parity drive.
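To put rough numbers on the difference, here is a sketch that enumerates every possible two-drive failure across the 14 drives in the diagram and counts how many take the array down under three simplified models: RAID 0+1 where any bad drive takes its whole side offline (the behaviour described for EMC at the top of this message), RAID 1+0, and RAID 1+4 where parity covers one completely broken pair. The models and function names are assumptions for illustration; they ignore rebuilds, hot spares and timing, and the 1+4 case simply treats the seven pairs as members of one RAID 4 group.

```python
from itertools import combinations

PAIRS = 7
DRIVES = [f"{i}{side}" for i in range(1, PAIRS + 1) for side in "AB"]


def broken_pairs(failed):
    """Number of mirrored pairs that have lost both drives."""
    return sum(1 for i in range(1, PAIRS + 1)
               if f"{i}A" in failed and f"{i}B" in failed)


def raid01_online(failed):
    """RAID 0+1: one bad drive takes its whole side offline."""
    side_a_down = any(d.endswith("A") for d in failed)
    side_b_down = any(d.endswith("B") for d in failed)
    return not (side_a_down and side_b_down)


def raid10_online(failed):
    """RAID 1+0: no pair may be completely broken."""
    return broken_pairs(failed) == 0


def raid14_online(failed):
    """RAID 1+4: parity covers one completely broken pair."""
    return broken_pairs(failed) <= 1


def fatal_double_failures(online):
    """Count the two-drive failures that take the array down."""
    return sum(1 for combo in combinations(DRIVES, 2)
               if not online(set(combo)))


total = len(list(combinations(DRIVES, 2)))
for name, check in [("0+1", raid01_online),
                    ("1+0", raid10_online),
                    ("1+4", raid14_online)]:
    print(name, fatal_double_failures(check), "of", total)
```

With seven pairs there are 91 possible double failures. Under these models 49 of them are fatal for 0+1 (any A drive plus any B drive), 7 for 1+0 (both halves of the same pair), and none for 1+4.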
-- Brian Tao (BT300, taob@risc.org) "Though this be madness, yet there is method in't"