Hi all,
A watchdog card works on a simple principle: it runs a counter that the unit
has to reset at regular intervals. If the counter reaches its maximum value
because ONTAP has not reset it, the watchdog reboots the unit.
I'm writing this off the top of my head, so I'm sure someone could point out
what the exact behaviour of the watchdog card is.
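To make the counter idea concrete, here is a minimal sketch in Python. It is only a toy model of the general principle, not how the F840 card or ONTAP actually implements it. The names (WatchdogTimer, kick, tick, simulate) and the tick values are all made up for illustration.

```python
class WatchdogTimer:
    """Toy model of a watchdog counter (hypothetical, not the real card)."""

    def __init__(self, max_count):
        self.max_count = max_count  # count at which the watchdog fires
        self.count = 0

    def kick(self):
        """A healthy OS calls this regularly to reset the counter."""
        self.count = 0

    def tick(self):
        """Advance the counter by one; return True if the watchdog fires."""
        self.count += 1
        return self.count >= self.max_count


def simulate(ticks, max_count, kick_every, hang_at=None):
    """Return the tick at which the watchdog forces a reboot, or None.

    The simulated OS resets the counter every `kick_every` ticks until it
    hangs at tick `hang_at` (None means it never hangs).
    """
    wd = WatchdogTimer(max_count)
    for t in range(1, ticks + 1):
        os_alive = hang_at is None or t < hang_at
        if os_alive and t % kick_every == 0:
            wd.kick()
        if wd.tick():
            return t  # counter reached its maximum: reboot
    return None


if __name__ == "__main__":
    print(simulate(20, max_count=5, kick_every=3))             # healthy OS
    print(simulate(20, max_count=5, kick_every=3, hang_at=7))  # OS hangs
```

As long as the simulated OS keeps resetting the counter, the watchdog never fires. Once the OS hangs, the counter runs up to its maximum and the reboot is triggered.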
In other words, the unit crashed and became unresponsive, and the watchdog
rebooted it. With the watchdog disabled, the unit will not be rebooted after
a crash; the filer head will simply stay hung, so the filer will stop serving
files until someone intervenes manually (power cycle).
In my opinion, it would be better to keep the watchdog enabled and
investigate why the unit crashed. If the watchdog reboots the unit several
times, there could be a bug in that particular version of ONTAP (check NOW
for possible bugs).
To investigate the failure, look at the log messages from just before the
crash in the autosupport; they may help in troubleshooting the issue.
Regards,
Oscar
-----Original Message-----
From: On behalf of Tavis Gustafson
Sent: 01 June 2005 19:40
To: orhernandez(a)gmail.com
Subject: F840 reboot from watchdog reset
Using ONTAP 6.4.5 with one DS14 over copper FC.
Last night one of our F840s rebooted itself twice via "watchdog reset".
Either before or after the second reboot notification mail was sent, the
filer froze up with the front LCD panel stuck at some NFS ops/sec number.
After a hard reboot it came up fine. I disabled the watchdog timer and the
machine stayed up.
Has anyone experienced multiple watchdog timer resets, or does anyone know
what type of hardware failure the watchdog looks for? Also, how bad is it to
keep the watchdog turned off?
Thanks,
Tavis