Cool, that's great to hear. Thanks for the response.
Respectfully,
David N. Blank-Edelman
Director of Technology
College of Computer Science
Northeastern University
John Denholm johnd@ThePLAnet.net writes:
Yeah, that's better now. We used to get that a lot - part of the problem can be the interaction of Cisco switches and Netapps - the Cisco can take a short while to figure out that the link is back up to a Netapp and the mail falls on the floor while waiting for that.
Some time ago, presumably at the insistence of ourselves and others, Netapp inserted a 30 second wait and retry loop into bootup, and now I think it gives autosupport 4 or 5 tries over a couple of minutes. We always get autosupport off them now. Not that, I admit, we've had a crash message in a while, but we never used to get them on reboots either :<
As for dealing with network or DNS or whatever not coming back, one thing you can do is specify your mailhost wholly by IP - while fractionally more work to maintain if the boxes on your network change frequently, it means fewer things have to go right before the mail can get out.
If you are worried about missing a crash, use SNMP monitoring and grab system.sysUptime.0, which is the uptime in hundredths of seconds. I run a large number of Netapps of different breeds, and I just run a Perl script which grabs the uptime off every one every 10 minutes. Any crashes are immediately apparent. It could also write to file and compare current uptime to previous uptime - if uptime drops, it can SMS you, mail you, sound sirens, flash lights, whatever :>
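For what it's worth, the write-to-file-and-compare idea might look roughly like the sketch below. It is in Python rather than Perl and is not the script described above. The function name check_uptimes, the JSON state file and the host names are all made up for illustration, and it assumes you already have the uptime values in hand (hundredths of a second per host) from whatever SNMP tool you use.

```python
import json
from pathlib import Path


def check_uptimes(current, state_path):
    """Compare current uptime readings with those saved by the previous run.

    current    -- dict mapping host name to uptime in hundredths of a second
                  (hypothetical input; fetch it with your own SNMP tooling)
    state_path -- file where the readings are kept between runs

    Returns a sorted list of hosts whose uptime went down since the last run.
    """
    path = Path(state_path)
    # First run: no state file yet, so there is nothing to compare against.
    previous = json.loads(path.read_text()) if path.exists() else {}

    # A host whose uptime is lower than last time has restarted in between.
    rebooted = sorted(
        host for host, ticks in current.items()
        if host in previous and ticks < previous[host]
    )

    # Save the new readings for the next run.
    path.write_text(json.dumps(current))
    return rebooted
```

Run it from cron every 10 minutes and hang your mail/SMS/siren of choice off a non-empty result. One caveat: the uptime value is a 32-bit counter, so it wraps back to zero after roughly 497 days, and a box that has been up that long will look like a reboot once.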
Now I just wish they'd get it right with the caches. They still only try mailing autosupport once :p
J
# John Denholm johnd@theplanet.net #
# Webcache & Filer Administrator, Planet Online +44 113 207 6357 #
Error 404: There is no spoon