Hello Jerry
I too am astonished, but it was definitely the reconstruct. The split second it was done rebuilding, the problem went away. I was sitting there going back through my command history, and my co-worker was running various ls commands.
Things that come into my head:
- Unfixed bug 79418: when the option raid.reconstruct.perf_impact is low, the FilerView "RAID Reconstruct Speed" is high; and when raid.reconstruct.perf_impact is high, the FilerView "RAID Reconstruct Speed" is low.
- Reconstructing a parity disk should be much faster than a data disk reconstruct, because the user data can be read in parallel and no out-of-band reconstruct is needed; only the reconstruct itself is done sequentially.
- Maybe you hit another, additional reconstructable disk failure, which forced a more CPU-consuming WAFL ironing / filesystem check?
- Did you perhaps have physical problems on the disk or FC layer? Command: fcadmin (see the example commands after this list).
- I will try to reproduce your problem next week.
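If it helps, roughly what I have in mind for checking those last two points; this is a sketch from 7-mode-era ONTAP, so verify the exact commands against your release:

    filer> options raid.reconstruct.perf_impact
    raid.reconstruct.perf_impact        medium
    filer> options raid.reconstruct.perf_impact low
    filer> sysconfig -r
    filer> fcadmin device_map

The first two lines show and then change the reconstruct priority (remembering the bug above, the FilerView label is inverted). sysconfig -r shows per-RAID-group reconstruct progress, and fcadmin device_map shows what the filer sees on each FC loop, which is a quick sanity check for a sick disk or loop.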
I did not run any diagnostic commands such as statit. By the time we tracked it down to the NetApp in the first place, we were busy failing over apps to another site and scurrying around. I think we are running 6.4.1 (not connected to work right now). Other volumes were not affected. Have you ever yanked a parity disk?
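For next time, statit only needs a few seconds of collection during the slow period; from memory it goes roughly like this (it lives in advanced privilege, so double-check on 6.4.1):

    filer> priv set advanced
    filer*> statit -b
    ... let the workload run for 10-30 seconds ...
    filer*> statit -e
    filer*> priv set

statit -b begins collection and statit -e ends it and dumps per-disk utilization, which would have shown whether the reconstruct was saturating the disks in that RAID group.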
We yank everything. :-) We have 16 training filers (8 per class) and I give approximately two classes per month. We usually kill all kinds of disks on all filers. Some students kill single disks (data or parity), some force multiple disk errors by using "disk fail" or by pulling disks out physically. So we see an average of two parity failures per week. And yes, we use "hammer", "sio", and other load generation tools. ;-)
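For the curious, the software-forced version of the drill is just this (the disk name 8a.16 is made up; substitute one from your own shelf):

    filer> disk fail 8a.16
    filer> vol status -r

disk fail marks the disk as failed so a spare kicks in, and vol status -r lets you watch the reconstruct progress, which is gentler on the hardware than pulling drives, though less fun.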
So I can tell you that pulling out the two mailbox disks of the root volume at the same time will panic your filer even if you have RAID-DP. The filer needs a delay of about 10 seconds to activate a replacement disk and stamp it as a mailbox disk before the second mailbox disk is allowed to fail. If not, the filesystem can still be reconstructed, but the cluster mailbox information is lost. => Spread your root volume over multiple FC loops and use SyncMirror to get four-mailbox-disk redundancy for 99.99...% high availability.
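Roughly, on a 6.x/7-mode filer with the syncmirror_local license and spares available in both disk pools, that looks like this (the license code is a placeholder, and the exact vol mirror syntax may differ on your release):

    filer> license add <syncmirror_local code>
    filer> vol mirror vol0
    filer> sysconfig -r

After the mirror levels, sysconfig -r should show both plexes of vol0, and the mailbox disks are then spread across the two pools.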
Best regards,
Dirk