I don’t know that I would try to do it deliberately, but ONTAP does it automatically when you have enough shelves.
And I have seen this save a panic when a whole shelf failed. Facilities were testing a power rail without checking that the power supplies in all devices were operational. The engineer was opening the rack, power supply in hand, when the shelf went down.
The aggregates took a while to rebuild, but the system didn’t go down.
From: toasters-bounces@teaparty.net [mailto:toasters-bounces@teaparty.net] On Behalf Of jordan slingerland
Sent: Wednesday, 22 June 2016 1:12 AM
To: toasters@teaparty.net
Subject: Raid disk layout - the ability to lose a shelf.
Hello, I am deploying a new 8040 and it was requested that the aggregates / raid groups be laid out in such a way that no more than 2 disks in any raid group are within the same shelf. At first this sounds like it reduces single points of failure and could protect availability from the failure of a full disk shelf. I argue against this strategy and was wondering if anyone on this list had any feedback.
My thought is that this configuration marginally increases availability at the cost of additional risk to data integrity. With this strategy, each time a disk failed we would endure not only the initial rebuild to a spare, but also a second rebuild when a disk replace is executed to put the original shelf/slot/disk back into the active raid group.
Additionally, if a shelf failure were encountered, I question whether it would even be possible to limp along. In an example configuration, we would be down 24 disks, and 4 or 5 would rebuild to the remaining spares available. Those rebuilds alone would require significant CPU to run concurrently, and I expect they would impact data services significantly. Additionally, at least 10 other raid groups would be either single or double degraded. I expect the performance degradation at this point would be so great that the most practical course of action would be to shut down the system until the failed shelf could be replaced.
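The arithmetic in the shelf-failure scenario above can be checked with a rough sketch. This is only an illustration under stated assumptions, not how ONTAP actually assigns spares: the function name and its parameters are hypothetical, the 24-disk shelf and the 4 or 5 spares are taken from the example, and it assumes every raid group has exactly 2 disks in the failed shelf and that spares finish repairing one raid group before starting on the next.

```python
def shelf_loss_summary(disks_per_shelf=24, per_group_per_shelf=2, spares=4):
    """Count raid group states after one whole shelf fails (hypothetical model)."""
    # Each raid group with disks in the failed shelf loses them all at once.
    affected_groups = disks_per_shelf // per_group_per_shelf
    missing = [per_group_per_shelf] * affected_groups

    # Assumption: spares are consumed group by group, so a group is fully
    # repaired before the next one receives any spare.
    for i in range(affected_groups):
        used = min(spares, missing[i])
        missing[i] -= used
        spares -= used

    return {
        "restored_after_rebuild": sum(1 for m in missing if m == 0),
        "single_degraded": sum(1 for m in missing if m == 1),
        "double_degraded": sum(1 for m in missing if m == 2),
    }


if __name__ == "__main__":
    for spare_count in (4, 5):
        print(spare_count, "spares:", shelf_loss_summary(spares=spare_count))
```

Under these assumptions, 12 raid groups are hit, and with either 4 or 5 spares only 2 of them return to full redundancy once the rebuilds finish, leaving 10 raid groups single or double degraded.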
Thanks for any input. I would like to know if anyone has any experience thinking through this type of scenario. Is considering this configuration interesting or perhaps silly? Are any best practice recommendations being violated?
Thanks in advance. --Jordan
Duncan Cummings NetApp Specialist Interactive Pty Ltd Telephone +61 7 3323 0800 Facsimile +61 7 3323 0899 Mobile +61 403 383 050 www.interactive.com.au