Several people have asked for my script, so I've put it on a web page. You can find it here: http://www.cuddletech.com/netapp/ Everything you need to know is on that page. Let me know if anyone has trouble with it; it should be very easy to use.
Ben Rockwood
-----Original Message----- From: Chris Blackmor [mailto:chris.blackmor@amd.com] Sent: Tue 4/23/2002 1:04 PM To: Ozzie Sabina Cc: Sam Schorr; Chris Blackmor; Server Team; toasters Subject: Re: disappearing disks
This happens all the time. Older disks are more prone to this. I have a script that runs every Sunday and checks for errors in the messages files (any disk with more than 5 errors in a week gets manually failed), and then it checks the "fcadmin device_map" output for "XXX". If that shows up, the script flags it. Nothing special, but it is something to let me know if a disk has spun down. This, for obvious reasons, doesn't work on SCSI shelves (but I only have 8 of those left and I am working on removing them from service now).
You could do what Sam mentioned with the disk counts too. Either way will work. C-
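The weekly check described above might be sketched roughly as follows. This is only an illustration, not Chris's actual script: the function names, the "disk <adapter>.<id>" log pattern, and the assumption that error lines contain the word "error" are all hypothetical. Only the 5-error threshold and the "XXX" marker in the "fcadmin device_map" output come from the message.

```python
import re
from collections import Counter

# Hypothetical pattern: assumes error lines name the disk as "disk 8.16".
# Adjust it to whatever your filer's messages file actually logs.
DISK_PATTERN = re.compile(r"disk (\d+\.\d+)", re.IGNORECASE)


def disks_over_threshold(lines, threshold=5, pattern=DISK_PATTERN):
    """Return disks that appear in more than `threshold` error lines."""
    counts = Counter()
    for line in lines:
        # Assumption: an error line contains the word "error".
        if "error" not in line.lower():
            continue
        match = pattern.search(line)
        if match:
            counts[match.group(1)] += 1
    return sorted(disk for disk, n in counts.items() if n > threshold)


def missing_slots(device_map_text):
    """Return device_map lines that contain a standalone XXX token."""
    return [
        line.strip()
        for line in device_map_text.splitlines()
        if "XXX" in line.split()
    ]
```

Feeding it a week of messages lines and the captured device_map text gives two short lists to mail out or review before failing a disk by hand.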
On Tue, Apr 23, 2002 at 03:56:43PM -0400, Ozzie Sabina wrote:
+-- "Sam Schorr" sschorr@homestead-inc.com once said:
| Thanks Chris. Actually, we have seen this on our F840's as well, which have
| 36 Gb disks, but mostly on the 18's. We have a script that gets the disk
| count from the MIB's and compares that to what the total should be - if
| there is a discrepancy, we know what's happened. There won't be a
| discrepancy if the disk had failed as it should because it will still be
| counted in the total.
We actually had this happen with a 9GB drive on a 740 when it was powered down and back up once (months ago). I just figured it was a one-time thing (and this thing was running an ancient version of the OS), but hearing this from you all now...well, I'm a little concerned about the 760s and 840s we have in production.
Any chance someone wants to share any scripts they have written to look for this? It'd be helpful as a starting point at least.
Thanks,
Oz
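As a starting point for the disk-count approach Sam mentions, here is a minimal sketch of the comparison step. It is not Sam's script: the function names are hypothetical, and it assumes the count has already been fetched with net-snmp's snmpget, whose default output ends in "= INTEGER: <n>". The OID for the filer's total disk count is not given in the thread and has to be looked up in the NetApp MIB.

```python
import re


def parse_snmp_integer(output):
    """Extract the integer from snmpget output such as '... = INTEGER: 28'."""
    match = re.search(r"=\s*INTEGER:\s*(-?\d+)", output)
    if match is None:
        raise ValueError(f"no INTEGER value found in: {output!r}")
    return int(match.group(1))


def missing_disks(reported_total, expected_total):
    """Return how many disks have vanished from the reported total.

    A properly failed disk is still counted, so any shortfall means a
    disk has disappeared rather than failed.
    """
    return max(expected_total - reported_total, 0)
```

For example, if a filer should have 28 disks and the parsed total is 27, missing_disks(27, 28) returns 1 and the script can raise an alert.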
Hi,
I get an error message when trying this: snmpget - no community name. What version of snmpget are you using? The one I just downloaded is 5.0.pre3 from net-snmp.
It expects to see: USAGE: snmpget [OPTIONS] AGENT OID [OID].... I didn't see where the script supplies the OID, and the community name is not preceded by the -c flag.
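For reference, a call in the form net-snmp 5.x expects could be built as in the sketch below. The helper names, the host "filer1", and the community "public" are hypothetical. The likely cause of the error, though I have not checked Ben's script, is that it was written for an older ucd-snmp release that accepted the community as a positional argument after the host.

```python
import subprocess


def snmpget_command(agent, community, oid, version="1"):
    """Build an argument list in the net-snmp 5.x form:
    snmpget [OPTIONS] AGENT OID, with the community passed via -c."""
    return ["snmpget", "-v", version, "-c", community, agent, oid]


def run_snmpget(agent, community, oid, version="1", timeout=10):
    """Run snmpget and return its output; requires net-snmp to be installed."""
    result = subprocess.run(
        snmpget_command(agent, community, oid, version),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return result.stdout.strip()


# Example with placeholder values; 1.3.6.1.2.1.1.1.0 is the standard sysDescr.0.
example = snmpget_command("filer1", "public", "1.3.6.1.2.1.1.1.0")
```

Passing the arguments as a list avoids shell quoting problems if the community string contains unusual characters.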
-----Original Message----- From: owner-toasters@mathworks.com [mailto:owner-toasters@mathworks.com]On Behalf Of Ben Rockwood Sent: Tuesday, April 23, 2002 11:43 PM To: Chris Blackmor; Ozzie Sabina Cc: Sam Schorr; Chris Blackmor; Server Team; toasters Subject: Scripts (was RE: disappearing disks)
--
Chris Blackmor
Advanced Micro Devices
Phone: (512) 602-1608
Fax: (512) 602-5155
Email: chris.blackmor@amd.com

"Good judgment comes from experience. And a lot of that comes from bad judgment!" (Author Unknown)

My comments are mine, and mine alone.