kevin graham wrote:
On Wed, May 16, 2001 at 12:49:40PM -0700, Fox, Adam wrote:
It is suggested, if you are hooking up a dual-path loop, that the B loop start at the opposite end of the chain from the A loop. In other words, the B loop starts on the last shelf of the A loop and works its way back.
Why is that loop configuration suggested?
Since the FC9 shelves are self-terminating, it would make sense to run the loops in opposite directions -- that way if there is a failure in a shelf, you'd maintain connectivity around the broken loop.
The loops are independent of each other; therefore, if there were a break in loop A, filer B would take over on the B channel. This is independent of the A channel. If the LRC/EDM fails on one channel, a failover must take place. If you lose an entire shelf due to a power supply or other catastrophic failure, other bigger, more complex problems arise. Clustering is not likely to be the saving grace.
A shelf does not necessarily constitute a volume, especially after a filer has been in service for some time. If a shelf were lost, it may affect more than one volume, and a race to rebuild on the spares pool begins.
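To put the "more than one volume" point in concrete terms, here is a rough Python sketch. It is not how the filer actually decides anything. The disk layout, the volume names, and the function name volumes_hit are all made up for illustration, and it assumes each volume tolerates the loss of exactly one disk (single parity, one RAID group per volume).

```python
from collections import Counter

def volumes_hit(layout, failed_shelf, tolerated_per_volume=1):
    """Return (affected, failed) sets of volumes after losing one shelf.

    layout: iterable of (shelf, volume) pairs, one pair per disk.
    tolerated_per_volume: disks a volume can lose and still serve data
    (assumed to be 1 here, i.e. single parity; an assumption, not a spec).
    """
    # Count how many disks each volume had on the failed shelf.
    lost = Counter(vol for shelf, vol in layout if shelf == failed_shelf)
    affected = set(lost)
    failed = {vol for vol, n in lost.items() if n > tolerated_per_volume}
    return affected, failed

# Hypothetical layout: volumes have grown across shelves over time.
layout = [
    (1, "vol0"), (1, "vol0"), (1, "vol1"),
    (2, "vol0"), (2, "vol1"), (2, "vol1"),
    (3, "vol1"), (3, "vol2"), (3, "vol2"),
]

affected, failed = volumes_hit(layout, failed_shelf=2)
print(sorted(affected))  # → ['vol0', 'vol1']
print(sorted(failed))    # → ['vol1']
```

In this made-up layout, losing shelf 2 degrades vol0 (one disk gone, rebuild onto a spare) but takes two disks out of vol1, which is more than single parity can cover.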
IE:
    Filer
  A       B
  |       +-------+
  |               |
  +Shelf 1+       |
  |       |       |
  +Shelf 2+       |
  |       |       |
  +Shelf 3+       |
          |       |
          +-------+
If shelf 2 flips out, s3 is still visible on loop B, and s1 is visible on loop A. (Continue the example for n shelves, and it still holds.)
As nice as this is, I'm not sure how much it really accomplishes, since if you lose a shelf's worth of disks, it's probably going to equate to a double disk failure in at least one volume, taking the filer offline anyway.
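The "it still holds for n shelves" claim can be checked with a toy model in Python. This is only a sketch under one assumption that is mine and not from any vendor documentation: each loop is a simple daisy chain, and a dead shelf hides everything behind it on that chain. The name visible_shelves and its parameters are hypothetical.

```python
def visible_shelves(num_shelves, failed_shelf, b_reversed=True):
    """Toy model: return the shelves still reachable on loop A or loop B.

    Assumption: each loop is a daisy chain, and a failed shelf cuts the
    chain for every shelf that sits behind it on that loop.
    """
    loop_a = list(range(1, num_shelves + 1))         # A enters at shelf 1
    loop_b = loop_a[::-1] if b_reversed else loop_a  # B enters at the last shelf

    def reachable(chain):
        seen = set()
        for shelf in chain:
            if shelf == failed_shelf:
                break  # nothing past the failure is visible on this loop
            seen.add(shelf)
        return seen

    return reachable(loop_a) | reachable(loop_b)

# Three shelves, shelf 2 fails.
print(sorted(visible_shelves(3, 2)))                    # → [1, 3]
print(sorted(visible_shelves(3, 2, b_reversed=False)))  # → [1]
```

With the loops run in opposite directions, every surviving shelf stays visible on one loop or the other; with both loops entering at shelf 1, everything past the failed shelf drops off both.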
Has this behavior changed in the last year or so? Can the filer mark a 'dead' volume as such and continue serving the surviving volumes? (at worst with perhaps a reboot after one dies to reset hardware?) Obviously this would change in the case of a cluster, where it'd be better to go offline altogether and let the partner give a shot at using the disks..
Just a thought...
..kg..