This sounds more like a general Fibre Channel error, perhaps from a bad LRC, cable, or card. You should open a ticket with Network Appliance, or at the very least boot into maintenance mode and run some of the detailed Fibre Channel tests from the 1-5 menu.
On Wed, 24 Jul 2002, Geoff Hardin wrote:
| I have recently seen a surge in the number of disk failures on an F760
| cluster in our datacenter. We have four F760's purchased a few months
| apart; one cluster has seen 12 disks fail in the past three months while
| the other cluster has seen two disks fail (I can only actually remember
| one, but I'm allowing for my failing memory as well).
|
| The clusters sit only a few feet apart so I am discounting
| environmental problems. Both clusters are running NetApp Release
| 6.1R1P1. The "bad" cluster is using mostly Seagate ST318203FC 18 GB
| disks, while the "good" cluster is mostly the Seagate ST118202FC drives
| (the infamous spin-up problem disks). The good cluster is unbalanced;
| one head has 52 disks and the other has 32. The bad cluster is evenly
| balanced (or it was before we started losing disks en masse) with 42 on
| each side. Both clusters are running disk firmware NA10 for the
| ST318203FC disks (I know, I just discovered that it's one rev out of
| date) and NA27 for the ST118202FC disks.
|
| The strangest part of this whole situation is that the disks rarely
| fail; they disappear and the partner complains that there is a cluster
| mismatch, breaks clustering, and sends out an email. The filer with the
| missing disk starts to rebuild (if it was a data disk) or merrily goes
| on its way (if it was a spare), but nothing ever shows up as broken.
| The disk just disappears.
|
| Short of going through and replacing every piece of hardware in the
| "bad" filers, I am at a loss of how to proceed. I've spent the morning
| searching NOW without luck. [Someone just pointed out to me that we
| have a few X221_ST318304FC disks with NA06 firmware in several of our
| filers, not just the good and bad clusters I've been describing, opening
| us up to bug 27068 (we're trying to schedule downtime to upgrade the
| firmware on all our filers now).] I am going to try upgrading the disk
| firmware on the filers as a first step, but if anyone else has seen this
| problem, or something similar, I would appreciate any input.
|
| Geoff Hardin
| geoff.hardin@dalsemi.com
| If it's glowing, don't eat it...