I received an update from Network Appliance Services Engineering on this issue. It contains two links I found helpful. Anissa gave permission for me to post this to toasters.
Also, I should mention that the "unrecovered read error" message I've been getting is a bit misleading. I failed to mention earlier that each of these was always followed by a "Scrub rewriting bad data block" message. Since the data was successfully reconstructed, there was no data loss. In that sense, the errors were actually recoverable.
------------- Begin Forwarded Message -------------
From: "Mohler, Anissa" Anissa.Mohler@netapp.com
To: "'caron@sig.com'" caron@sig.com
Cc: "Mohler, Anissa" Anissa.Mohler@netapp.com
Subject: Medium Errors
Date: Thu, 15 Aug 2002 09:38:30 -0700
MIME-Version: 1.0

Hi Paul,
The information given to you was in error. Medium errors at the rate you describe (e.g. 1 per week) are perfectly OK... there is a 'bug' you can look up on the NOW site that describes this in a bit more detail:
bug 68517 http://now.netapp.com/NOW/cgi-bin/bol?Type=Detail&Display=68517
We're also working on publishing more information about understanding and handling medium errors to the NOW site.
Our apologies for the confusion.
Data ONTAP will warn you if the medium errors are an issue... the function called Storage Health Monitor monitors the error frequency and will recommend failing the drive if needed. If you want to read up on SHM check out the following: http://now.netapp.com/NOW/knowledge/docs/ontap/rel621/html/sag/appendi4.htm#1154205
Hope this helps :)
Anissa Mohler Network Appliance Services Engineering
-------------------------------------------
--- Anissa Mohler - 408.822.6404 - NetApp Services Engineering - RAS ---
------------- End Forwarded Message -------------