Nicholas,
You are not being too paranoid. I removed clustering from all of our systems after we had repeated events (over a couple of years, each one different and each one "rare") that for one reason or another brought down two filers instead of just one. I must add here that this was shortly after clustering was first offered (F630s using FC), although it continued up to the F760s. I am not saying that clustering is bad; in fact, when it worked, it worked great. It does add another level of complexity to managing your systems, and previously there were a lot of ways you could misconfigure yourself into a real problem. I have not had clustered systems for almost 2 years now and I do not miss them, but I am sure clustering has improved, as most everything else has with the systems being produced by NetApp.
It was a hard decision to make, but I reasoned that if you never have to fail over and everything works, who cares; it's when you need to fail over and can't, or worse, bring down the partner as well, that you are better off without it. After all, if you have a spare filer head with the right chip (Intel or Alpha), you could just put it in and go without worrying about any other dependencies.
But to answer your specific question about shelves: I have never seen a shelf go bad, and I am guessing it would take some external event to cause one to, such as excessive heat/cold or banging on it with a hammer. I have, though, seen a disk that was going bad chatter enough on a loop (based on the RCA done by NetApp) to bring down both system loops and hence both filers.
Jason
-----Original Message-----
From: Nicholas Chua [mailto:nicholas_chuah@yahoo.com]
Sent: Thursday, April 18, 2002 12:36 AM
To: toasters@mathworks.com
Subject: Dead Disk Shelf for One of The Clustered Filer
Hi,
Is it possible for the disk shelf to completely fail? By this I don't mean the power supply or fans, because those are redundant and replaceable. I'm thinking more of a board/IC failure or maybe a shelf firmware bug that can completely knock out a disk shelf. If there is even a remote possibility of this happening, then I figure that, in a cluster environment, the partner would not be able to take over the failed filer through channel B either. Am I being too paranoid?
I'm looking for something like the Mean Time Between Failure (MTBF) for the Disk Shelf if available, be it FC9 or DS14.
Any input would be greatly appreciated.
Nic