There don't appear to be any invalid CRCs at this point. The unit is booted in read-only mode. The only link failure counts showing up are for drive 8b.93. Does that indicate a bad drive?
-thanks so much.
Loop       Link  Underrun   Loss of   Invalid  Frame In Frame Out
ID      Failure     count      sync       CRC     count     count
          count               count     count
8b.93         4         0         9         0     34700    264468
8b.92         0         0         0         0     62193    618089
8b.91         0         0         1         0     99539    925291
8b.90         0         0         0         0     93475    868841
8b.89         0         0         0         0     93726    868906
8b.88         0         0         3         0     92456    863458
8b.87         0         0         0         0      2162      1020
8b.86         0         0         2         0     93614    867250
8b.85         0         0         0         0       131       167
8b.84         0         0         0         0     62299    616393
8b.83         0         0         0         0     61890    617720
8b.82         0         0         0         0     61987    616789
8b.81         0         0         6         0     61901    617655
8b.80         0         0         3         0     25407    338325
8b.ha         0         0         1         0   8084399    845489
On Fri, Feb 04, 2005 at 05:55:25AM -0800, McCarthy, Tim wrote:
While the system is up and running, try "fcadmin link_stats 8b" and look at the CRC column.
It could be a bad controller on a disk or a bad ESH/LRC. Look for the first device to show frame errors. If it is at the beginning of the shelf, it may be the ESH/LRC. Swap it out.
If it is all of them, it could be a bad FC card, in which case, you need to swap it out.
As always, please open a case --tmac
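For anyone who wants to script the "look at the columns" step, here is a minimal Python sketch. It assumes the link_stats output has been saved as text with one device per line and six counters per row, in the column order shown in the output posted in this thread. The function and field names are made up for illustration and are not part of any NetApp tool.

```python
# Hypothetical helper: flag devices with nonzero error counters in
# saved "fcadmin link_stats" output. Column order is assumed to match
# the output posted in this thread.
FIELDS = ("link_failure", "underrun", "loss_of_sync",
          "invalid_crc", "frame_in", "frame_out")


def parse_link_stats(text):
    """Return a list of (device, counters) for each data row found."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # A data row is a device ID followed by exactly six integers;
        # header lines and blanks are skipped.
        if len(parts) == 7 and all(p.isdigit() for p in parts[1:]):
            counters = dict(zip(FIELDS, (int(p) for p in parts[1:])))
            rows.append((parts[0], counters))
    return rows


def suspects(rows, fields=("link_failure", "invalid_crc")):
    """Devices with a nonzero value in any of the chosen counters,
    in the order they appear in the output."""
    return [dev for dev, c in rows if any(c[f] for f in fields)]


sample = """\
8b.93 4 0 9 0 34700 264468
8b.92 0 0 0 0 62193 618089
8b.91 0 0 1 0 99539 925291
8b.ha 0 0 1 0 8084399 845489
"""
print(suspects(parse_link_stats(sample)))
```

This only lists which devices have nonzero counters. Deciding whether that points at a disk, the ESH/LRC, or the FC card still takes the judgment described above.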
-----Original Message-----
From: Tavis Gustafson [mailto:tavis@hq.newdream.net]
Sent: Friday, February 04, 2005 8:30 AM
To: toasters@mathworks.com
Subject: Fibre Channel Woes
I'm running an 840 with a DS14 (144GB disks), connected to each other with a dual-port optical fibre channel card. Early this morning the filer went down, and upon reboot I started seeing lots of fibre channel frame errors. I swapped optical cables and tried the second port on the card, but the errors came back when rebuilding.
I am trying to determine whether the problem is with the fibre channel card or with the LRC in the disk shelf. Does anybody know if this is indicated in these error messages?
Thanks for any help -Tavis
Volume State Status Options
boot online reconstruct root, raidsize=7
amplifier> Fri Feb 4 11:55:07 GMT [FastEnet-10/100/e3c:notice]: uid 30358 tid 1: disk quota exceeded on volume boot. onal warnings will be suppressed for approximately 60 minutes or until either a 'quota resize' is performed.
Fri Feb 4 11:55:34 GMT [download.updateDone:info]: Bootblock update completed
Fri Feb 4 11:56:35 GMT [FastEnet-10/100/e3c:notice]: uid 51154 tid 1: disk quota exceeded on volume boot. Additiona gs will be suppressed for approximately 60 minutes or until either a 'quota resize' is performed.
amplifier> Fri Feb 4 12:00:00 GMT [kern.uptime.filer:info]: 12:00pm up 7 mins, 935006 NFS ops, 0 CIFS ops, 0 HTTP o FS ops, 0 FCP ops, 0 iSCSI ops
amplifier> Fri Feb 4 12:08:09 GMT [wafl_hipri:notice]: uid 24271 tid 1: disk quota exceeded on volume boot. Additio ings will be suppressed for approximately 60 minutes or until either a 'quota resize' is performed.
Fri Feb 4 12:13:59 GMT [scsi.cmd.pastTimeToLive:error]: Device 8b.80: request failed after try #1: cdb 0x1c.
Fri Feb 4 12:14:23 GMT [scsi.cmd.checkCondition:error]: Device 8b.82: Check Condition: CDB 0x2a:09b24e98:0080: Sense SI:aborted command - Fibre Channel frame CRC error (0xb - 0x47 0x0 0x3)(57649).
Fri Feb 4 12:14:23 GMT [scsi.cmd.checkCondition:error]: Device 8b.82: Check Condition: CDB 0x2a:09b24e18:0080: Sense SI:aborted command - Fibre Channel frame CRC error (0xb - 0x47 0x0 0x3)(57657).
Fri Feb 4 12:14:31 GMT [scsi.cmd.retrySuccess:info]: Device 8b.82: request successful after retry #1: cdb 0x2a:09b24 .
Fri Feb 4 12:14:31 GMT [scsi.cmd.retrySuccess:info]: Device 8b.82: request successful after retry #1: cdb 0x2a:09b24 .
Fri Feb 4 12:14:31 GMT [wafl_lopri:warning]: NFS response to client 10.3.38.28 was slow, op was v3 read, 63 > 60 (in )
Fri Feb 4 12:14:48 GMT [scsi.cmd.checkCondition:error]: Device 8b.84: Check Condition: CDB 0x2a:09b25418:0080: Sense SI:aborted command - Fibre Channel frame CRC error (0xb - 0x47 0x0 0x3)(16996).
Fri Feb 4 12:14:51 GMT [scsi.cmd.checkCondition:error]: Device 8b.84: Check Condition: CDB 0x2a:09b25418:0080: Sense SI:aborted command - Fibre Channel frame CRC error (0xb - 0x47 0x0 0x3)(20148).
Fri Feb 4 12:14:53 GMT [ispfc_timeout_1:warning]: 8b.81 (0x01000051) (0x034f47b0,0x2a:09b25298:0080,0/0,20150/0/0,80 ommand timeout, quiescing drive to allow outstanding I/O to complete.
Fri Feb 4 12:15:33 GMT [scsi.cmd.pastTimeToLive:error]: Device 8b.80: request failed after try #1: cdb 0x1c.
Fri Feb 4 12:15:40 GMT [telnet_0:info]: root logged in from host: 10.3.67.21
vol status
Volume State Status Options
Fri Feb 4 12:16:19 GMT [scsi.cmd.pastTimeToLive:error]: Device 8b.80: request failed after try #1: cdb 0x1c.
Fri Feb 4 12:16:37 GMT [scsi.cmd.pastTimeToLive:error]: Device 8b.84: request failed after try #3: cdb 0x2a:09b25418
Fri Feb 4 12:16:37 GMT [scsi.cmd.pastTimeToLive:error]: Device 8b.81: request failed after try #1: cdb 0x2a:09b25298
Fri Feb 4 12:16:49 GMT [ispfc_timeout_1:warning]: 8b.92 (0x0100005c) (0x034f4d00,0x2a:09b25698:0080,0/0,19912/0/0,87 ommand timeout, quiescing drive to allow outstanding I/O to complete.
Fri Feb 4 12:16:49 GMT [ispfc_timeout_1:warning]: 8b.85 (0x01000055) (0x034f5690,0x2a:09b25218:0080,0/0,11587/0/0,87 ommand timeout, quiescing drive to allow outstanding I/O to complete.
Fri Feb 4 12:16:49 GMT [ispfc_timeout_1:warning]: 8b.84 (0x01000054) (0x034f6020,0x2f:0025a800:0400,0/0,20143/0/0,87 ommand timeout, quiescing drive to allow outstanding I/O to complete.
Fri Feb 4 12:16:49 GMT [ispfc_timeout_1:warning]: 8b.83 (0x01000053) (0x034f5be0,0x2a:09b25298:0080,0/0,20153/0/0,87 ommand timeout, quiescing drive to allow outstanding I/O to complete.
Fri Feb 4 12:16:49 GMT [ispfc_timeout_1:warning]: 8b.82 (0x01000052) (0x034f6350,0x2a:09b25418:0080,0/0,20104/0/0,87 ommand timeout, quiescing drive to allow outstanding I/O to complete.
Fri Feb 4 12:16:49 GMT [ispfc_timeout_1:warning]: 8b.81 (0x01000051) (0x034f6ac0,0x2f:0025ac00:0400,0/0,20153/0/0,87 ommand timeout, quiescing drive to allow outstanding I/O to complete.
Fri Feb 4 12:16:52 GMT [ispfc_timeout_1:error]: 8b.85 (0x01000055): global device timer timeout, initiating device r
Fri Feb 4 12:16:52 GMT [scsi.cmd.abortedByHost:error]: Device 8b.92: Command aborted by host adapter: HA status 0x4: a:09b25698:0080.
Fri Feb 4 12:16:52 GMT [scsi.cmd.abortedByHost:error]: Device 8b.92: Command aborted by host adapter: HA status 0x4: f:0025ac00:0400.
Fri Feb 4 12:16:52 GMT [scsi.cmd.pastTimeToLive:error]: Device 8b.92: request failed after try #1: cdb 0x2a:09b25618
Fri Feb 4 12:16:52 GMT [scsi.cmd.pastTimeToLive:error]: Device 8b.92: request failed after try #1: cdb 0x2a:09b25518
Fri Feb 4 12:16:52 GMT [ispfc_timeout_1:error]: 8b.84 (0x01000054): global device timer timeout, initiating device r
Fri Feb 4 12:16:52 GMT [scsi.cmd.abortedByHost:error]: Device 8b.85: Command aborted by host adapter: HA status 0x4: f:0025a800:0400.
Fri Feb 4 12:16:52 GMT [scsi.cmd.pastTimeToLive:error]: Device 8b.85: request failed after try #1: cdb 0x2a:09b25098
Fri Feb 4 12:16:52 GMT [scsi.cmd.abortedByHost:error]: Device 8b.85: Command aborted by host adapter: HA status 0x4: 8:109ed500:0040.
Fri Feb 4 12:16:52 GMT [scsi.cmd.abortedByHost:error]: Device 8b.85: Command aborted by host adapter: HA status 0x4: a:09b25218:0080.
Fri Feb 4 12:16:52 GMT [ispfc_timeout_1:error]: 8b.83 (0x01000053): global device timer timeout, initiating device r
Fri Feb 4 12:16:52 GMT [scsi.cmd.pastTimeToLive:error]: Device 8b.84: request failed after try #1: cdb 0x2f:0025a800
Fri Feb 4 12:16:52 GMT [ispfc_timeout_1:error]: 8b.82 (0x01000052): global device timer timeout, initiating device r
Fri Feb 4 12:16:52 GMT [scsi.cmd.abortedByHost:error]: Device 8b.83: Command aborted by host adapter: HA status 0x4: a:09b25298:0080.
Fri Feb 4 12:16:52 GMT [scsi.cmd.pastTimeToLive:error]: Device 8b.83: request failed after try #1: cdb 0x2a:09b25198
Fri Feb 4 12:16:52 GMT [scsi.cmd.abortedByHost:error]: Device 8b.83: Command aborted by host adapter: HA status 0x4: f:0025ac00:0400.
Fri Feb 4 12:16:52 GMT [ispfc_timeout_1:error]: 8b.81 (0x01000051): global device timer timeout, initiating device r
Fri Feb 4 12:16:52 GMT [scsi.cmd.abortedByHost:error]: Device 8b.82: Command aborted by host adapter: HA status 0x4: f:0025ac00:0400.
Fri Feb 4 12:16:52 GMT [scsi.cmd.abortedByHost:error]: Device 8b.82: Command aborted by host adapter: HA status 0x4: a:09b25398:0080.
Fri Feb 4 12:16:52 GMT [scsi.cmd.abortedByHost:error]: Device 8b.82: Command aborted by host adapter: HA status 0x4: a:09b25418:0080.
Fri Feb 4 12:16:52 GMT [scsi.cmd.abortedByHost:error]: Device 8b.81: Command aborted by host adapter: HA status 0x4: f:0025ac00:0400.
Fri Feb 4 12:16:54 GMT [ispfc_timeout_1:warning]: 8b.80 (0x01000050) (0x034f58b0,0x1c,0/0,56193/0/0,8745/0): command , quiescing drive to allow outstanding I/O to complete.
Fri Feb 4 12:17:00 GMT [scsi.cmd.pastTimeToLive:error]: Device 8b.80: request failed after try #1: cdb 0x1c.
Fri Feb 4 12:17:08 GMT [scsi.cmd.retrySuccess:info]: Device 8b.85: request successful after retry #1: cdb 0x28:109ed .
Fri Feb 4 12:17:25 GMT [scsi.cmd.retrySuccess:info]: Device 8b.82: request successful after retry #1: cdb 0x2a:09b25 .
Fri Feb 4 12:17:25 GMT [scsi.cmd.pastTimeToLive:error]: Device 8b.89: request failed after try #1: cdb 0x28:10bc5c38
Fri Feb 4 12:17:25 GMT [scsi.cmd.checkCondition:error]: Device 8b.83: Check Condition: CDB 0x2a:09b25298:0080: Sense SI:aborted command - Fibre Channel frame CRC error (0xb - 0x47 0x0 0x3)(42100).
F
PANIC: raid volfsm: vol boot: fatal multi-disk error. in process config_thread on release NetApp Release 6.4.5 on Fri 12:17:25 GMT 2005
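
To see which devices the errors cluster on, console output like the above can be tallied with a short script. This is only a sketch in Python: the regular expressions are assumptions based on the message formats visible in this log, the function names are made up, and only the scsi.cmd.* messages that carry a "Device" field are counted (the ispfc_timeout lines are ignored).

```python
import re
from collections import Counter

# Patterns inferred from the messages in this thread, not from any
# official log format description.
EVENT = re.compile(r"\[scsi\.cmd\.(\w+):\w+\]: Device (\w+\.\w+):")
# [^\[] keeps a match from running into the next "[...]" log entry when
# the console output has been pasted as one long line.
CRC = re.compile(r"Device (\w+\.\w+): Check Condition:[^\[]*?frame CRC error")


def tally_scsi_events(log_text):
    """Count scsi.cmd.* messages per (device, event name)."""
    return Counter((dev, event) for event, dev in EVENT.findall(log_text))


def crc_errors_by_device(log_text):
    """Count 'frame CRC error' check conditions per device."""
    return Counter(CRC.findall(log_text))


sample = (
    "Fri Feb 4 12:13:59 GMT [scsi.cmd.pastTimeToLive:error]: Device 8b.80: "
    "request failed after try #1: cdb 0x1c. "
    "Fri Feb 4 12:14:23 GMT [scsi.cmd.checkCondition:error]: Device 8b.82: "
    "Check Condition: CDB 0x2a:09b24e98:0080: Sense SI:aborted command - "
    "Fibre Channel frame CRC error (0xb - 0x47 0x0 0x3)(57649). "
    "Fri Feb 4 12:14:23 GMT [scsi.cmd.checkCondition:error]: Device 8b.82: "
    "Check Condition: CDB 0x2a:09b24e18:0080: Sense SI:aborted command - "
    "Fibre Channel frame CRC error (0xb - 0x47 0x0 0x3)(57657). "
    "Fri Feb 4 12:14:48 GMT [scsi.cmd.checkCondition:error]: Device 8b.84: "
    "Check Condition: CDB 0x2a:09b25418:0080: Sense SI:aborted command - "
    "Fibre Channel frame CRC error (0xb - 0x47 0x0 0x3)(16996)."
)

for dev, n in sorted(crc_errors_by_device(sample).items()):
    print(dev, n)
```

If the CRC errors are spread across many devices rather than concentrated on one, that is consistent with the earlier suggestion to suspect something shared on the loop instead of a single disk.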