New subject: Disk Controller Failures

26 Jun 1999


      On 06/25/99 12:50:17 you wrote:
...
Hi there,
We recently encountered a disk controller failure on one of our data
servers (not on our Netapp).  The failure was not a complete failure of
the drive itself; rather, the controller started to die slowly (i.e. it
handled some requests some of the time, and other times, who knows?).
As a result, our entire filesystem became corrupt and we lost some of
our data.  Although this filesystem was mirrored, that did not help us
at all, as it was considered a logical error in the filesystem and not
a hardware problem.
My question is: should this occur on a Netapp (this may even apply to
any other enterprise server), would it cause the entire filesystem to
go corrupt and cause partial or complete data loss, as in this case?

I would have to say yes, it's *possible*, but the controller would have
to fail in a very odd way: not simply failing to respond to some requests
(that would be caught), but executing some and not others (while claiming
it did), or misordering commands, or something like that.  I would think
this kind of error would be very rare.  Perhaps the controller you had
is particularly prone to those sorts of errors; one advantage of Netapp
is that you're using controllers they themselves have partially designed
and tested.
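
To make the "claims it did, but didn't" failure mode concrete, here is a
minimal Python sketch.  It is purely illustrative: FlakyController,
verified_write and demo are hypothetical names, the drop pattern is
invented, and this is not a description of how Netapp or any real
controller behaves.  It only shows why an acknowledged-but-dropped write
goes unnoticed unless something reads the block back and compares it.

```python
import zlib


class FlakyController:
    """Hypothetical controller: acknowledges every write, silently drops some."""

    def __init__(self, drop_every=3):
        self.blocks = {}            # block number -> stored bytes
        self.drop_every = drop_every
        self.count = 0              # number of write requests seen so far

    def write(self, block_no, data):
        self.count += 1
        # Every drop_every-th write is lost, but success is still reported.
        if self.count % self.drop_every != 0:
            self.blocks[block_no] = data
        return True

    def read(self, block_no):
        # Blocks that were never stored read back as zeros.
        return self.blocks.get(block_no, b"\x00" * 4)


def verified_write(ctrl, block_no, data):
    """Write, then read back and compare checksums to catch a lost write."""
    acked = ctrl.write(block_no, data)
    return acked and zlib.crc32(ctrl.read(block_no)) == zlib.crc32(data)


def demo():
    ctrl = FlakyController(drop_every=3)
    # Six writes of non-zero data; the 3rd and 6th are silently dropped.
    return [verified_write(ctrl, i, bytes([i + 1]) * 4) for i in range(6)]


if __name__ == "__main__":
    print(demo())
```

In this sketch every write is acknowledged, so only the read-back
comparison exposes the two dropped blocks.  A mirror fed by the same
stream of bad writes would not notice anything wrong.
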
Bruce
