A week later we shut it down to replace an FDDI card, and afterwards it wouldn't boot. The failed drive was apparently working well enough that the Netapp thought it had a RAID drive that wasn't a valid member of the array (inconsistent disk labels). Once we removed the problem drive, the Netapp booted just fine.
I'd have to say the above is not "normal" behavior. But from what I've seen, what makes the difference is how a drive fails.
Normally, if a drive fails and reconstruction completes as expected, the drive will look bad to the system and will be unusable. This gives you time to swap out the drive for a new one.
If the system reboots and the drive has failed badly enough, the drive will fail initialization on boot (the system will say disk so-and-so is broken), so it won't appear to the system as a spare. The system will boot fine, other than the fact that it's probably in degraded mode since it lost a disk. I think this is pretty much how it's "supposed" to work.
Often, though, if the drive had only a minor failure, the disk will look fine upon reboot and will get marked as a spare. This is known bad behavior that should be fixed.
If the system fails in an unusual way, you can get the "inconsistent label" or a similar problem. I've usually only seen this when the system crashes immediately after it tries to fail a drive, while it is still in the process of switching over to reconstruction or degraded mode. My understanding as a customer is that incidents like this are also bugs that should be fixed.
Also, there are times when a SCSI problem can cause a drive to look bad, and the system attempts to fail the drive but winds up rebooting shortly thereafter due to bus problems, and the drive comes back fine. Although the drive "failed", the system never got far enough to actually fail it, and WAFL replay succeeds, so no data is lost.
Clearly an issue here is to make sure that when a drive fails, it's actually marked as BAD in some way. However, if the drive has failed in a most spectacular way, writing a "BAD" label onto the drive may be impossible. Despite what the online description of bug 961 says, this issue is included within that bug, along with proactively failing a drive that looks like it "needs" to be replaced given its frequency of errors.
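To make the "frequency of errors" idea concrete, here is a rough sketch of how a proactive-failure check could work. This is purely my own illustration, not how the filer actually does it: the class name, the thresholds, and the sliding-window approach are all assumptions.

```python
from collections import deque


class DriveErrorMonitor:
    """Hypothetical sliding-window error counter for a single drive.

    Not NetApp's implementation; max_errors and window_seconds are
    made-up thresholds chosen only for illustration.
    """

    def __init__(self, max_errors=5, window_seconds=3600.0):
        self.max_errors = max_errors
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record_error(self, now):
        """Record an error at time `now` (in seconds).

        Returns True if the drive has logged at least max_errors errors
        within the last window_seconds, i.e. it looks like it "needs"
        to be replaced and could be failed proactively.
        """
        self._timestamps.append(now)
        # Forget errors that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] > self.window_seconds:
            self._timestamps.popleft()
        return len(self._timestamps) >= self.max_errors


# Example: 3 errors within 10 seconds trips the check.
monitor = DriveErrorMonitor(max_errors=3, window_seconds=10.0)
results = [monitor.record_error(t) for t in (0.0, 5.0, 20.0, 25.0, 29.0)]
```

The point of the window is that a handful of errors spread over months shouldn't fail a drive, while the same number in a short burst probably should. Where to set the thresholds is a judgment call I can't speak to.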
Bruce