This is the second time this has happened in the past week. I'm thinking there might actually be a hardware issue somewhere along the FC-AL loop, but does anyone else have any theories?
from the message log:

. . .
Mon May 19 00:00:00 EDT [statd]: 12:00am up 2 days, 13:25 18869073 NFS ops, 4573028 CIFS ops, 0 HTTP ops
Mon May 19 00:24:25 EDT [isp2100_timeout]: 1.1 (0xfffffc0000d83560,0x2a:00289370:0008,0/0,23228/0/0,58593/0): command timeout, quiescing drive.
Mon May 19 00:24:29 EDT [isp2100_timeout]: 1.1: global device timer timeout, initiating device recovery.
Mon May 19 00:24:29 EDT [isp2100_timeout]: Resetting device 1.1
. . .
^ what happened here?
. . .
Tue May 20 02:46:55 EDT [isp2100_timeout]: 1.42 (0xfffffc0000d6ed20,0x28:00598770:0010,0/0,8363/0/0,56085/0): command timeout, quiescing drive.
Tue May 20 02:46:58 EDT [isp2100_timeout]: 1.42: global device timer timeout, initiating device recovery.
Tue May 20 02:46:58 EDT [isp2100_timeout]: Resetting device 1.42
Tue May 20 02:47:02 EDT [isp2100_timeout]: Resetting ISP2100 in slot 1
Tue May 20 02:47:22 EDT last message repeated 2 times
Tue May 20 02:47:28 EDT [isp2100_timeout]: Loop recovery event generated by device 1.0.
Tue May 20 02:47:33 EDT [isp2100_timeout]: Resetting ISP2100 in slot 1
Tue May 20 14:50:21 GMT [rc]: NIS: Group Caching has been enabled
Tue May 20 10:50:22 EDT [rc]: e2a: Link up.
Tue May 20 10:50:23 EDT [ses_admin]: No SCSI-3 Enclosure Services on host adapter 1 shelf 0.
Tue May 20 10:50:23 EDT [ses_admin]: Check drive placement.
. . .
Tue May 20 10:50:28 EDT [rc]: relog syslog Tue May 20 02:47:53 EDT [isp2100_timeout]: Offlining loop attached to HBA in slot 1. Will try to reco
Tue May 20 10:50:28 EDT [rc]: relog syslog Tue May 20 02:47:53 EDT [isp2100_timeout]: The fibre channel loop attached to adapter 1 has gone down
So I power off the netapp and power off all the disk shelves, turn them back on, and boot the netapp up, and it comes up fine; both times, same scenario. The previous time, a disk had its red LED lit, so it was easy to tell it needed replacing. That did not happen this time. I pulled disk 1.42 because I thought it might have been the one causing problems, and it's been replaced with a spare. Should I also pull disk 1.1?
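Before pulling 1.1 too, it might be worth seeing what the filer itself reports about the loop and that drive from the console. A minimal sketch, assuming a Data ONTAP 7-mode console (command availability and output format vary by ONTAP release; this is not a transcript from your filer):

```
sysconfig -r          # RAID layout: shows failed disks, spares, and reconstruction state
storage show disk -a  # per-disk state, so you can see how 1.1 and the new spare look now
fcstat link_stats     # per-device FC-AL link error counters on the loop; a marginal
                      # disk, cable, or shelf module often shows climbing errors at
                      # one loop position rather than spread across the whole loop
environment status    # shelf/enclosure status (relates to the ses_admin messages above)
```

If the link-error counters point at one spot on the loop rather than at disk 1.1 itself, that would support the cable/shelf-hardware theory over a second bad disk.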
Thanks Dan
One other strange thing: I can no longer log into FilerView. It prompts me for the password to the netapp, then once I enter the password it comes back with the following error: "Error communicating with host na4m-be msg=Connection refused: connect". Is the password for FilerView the same as the root password for the netapp? I am still able to telnet in. I tried disabling and re-enabling the httpd server, and that didn't seem to help.
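For what it's worth, FilerView logs in with the filer's admin credentials (root works), so since the prompt accepted your password, the "Connection refused" likely means the admin HTTP server itself isn't listening rather than that the password is wrong. In 7-mode, FilerView is served by the admin HTTP server, which is controlled separately from the plain `httpd` option, so toggling `httpd` alone may not touch it. A minimal sketch, assuming 7-mode option names (check `options httpd` on your release to confirm):

```
options httpd.admin.enable        # show whether the admin (FilerView) server is on
options httpd.admin.enable off
options httpd.admin.enable on     # toggle it off and on to restart the admin server
```

If it still refuses connections after that, a reboot (which you've already had) normally restarts it as well, which would point toward something else blocking the port.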