There don't appear to be any invalid CRCs at this point. The unit
is booted in read-only mode. The only link failure counts showing up
are for drive .93. Does that indicate a bad drive?
-Thanks so much.
Loop   Link     Underrun  Loss of  Invalid  Frame In  Frame Out
ID     Failure  count     sync     CRC      count     count
       count              count    count
8b.93  4        0         9        0        34700     264468
8b.92  0        0         0        0        62193     618089
8b.91  0        0         1        0        99539     925291
8b.90  0        0         0        0        93475     868841
8b.89  0        0         0        0        93726     868906
8b.88  0        0         3        0        92456     863458
8b.87  0        0         0        0        2162      1020
8b.86  0        0         2        0        93614     867250
8b.85  0        0         0        0        131       167
8b.84  0        0         0        0        62299     616393
8b.83  0        0         0        0        61890     617720
8b.82  0        0         0        0        61987     616789
8b.81  0        0         6        0        61901     617655
8b.80  0        0         3        0        25407     338325
8b.ha  0        0         1        0        8084399   845489
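
For anyone who wants to scan this kind of output without reading every
row, here is a rough Python sketch that lists only the devices with a
nonzero error counter. It assumes the column order given in the header
above (loop ID, link failure, underrun, loss of sync, invalid CRC,
frame in, frame out). The names parse_link_stats and nonzero_errors are
made up for this example and are not part of any NetApp tool.

```python
# Sketch only: the column order is assumed from the table header above.
COLUMNS = ("link_failure", "underrun", "loss_of_sync", "invalid_crc",
           "frame_in", "frame_out")
ERROR_COLUMNS = COLUMNS[:4]  # the frame in/out columns are traffic, not errors

# A few rows copied from the output above.
SAMPLE = """\
8b.93 4 0 9 0 34700 264468
8b.92 0 0 0 0 62193 618089
8b.91 0 0 1 0 99539 925291
8b.81 0 0 6 0 61901 617655
8b.ha 0 0 1 0 8084399 845489
"""


def parse_link_stats(text):
    """Return {loop_id: {column: count}} for every well-formed data row."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != len(COLUMNS) + 1:
            continue  # header fragments and blank lines
        loop_id, *counts = fields
        if not all(c.isdigit() for c in counts):
            continue  # header line that happens to have seven words
        stats[loop_id] = dict(zip(COLUMNS, map(int, counts)))
    return stats


def nonzero_errors(stats):
    """Keep only the devices that have at least one nonzero error counter."""
    flagged = {}
    for loop_id, row in stats.items():
        errors = {name: row[name] for name in ERROR_COLUMNS if row[name]}
        if errors:
            flagged[loop_id] = errors
    return flagged


if __name__ == "__main__":
    for loop_id, errors in nonzero_errors(parse_link_stats(SAMPLE)).items():
        print(loop_id, errors)
```

A nonzero counter is only a hint about where to look; the sketch makes
no judgment about which counts are high enough to matter.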
On Fri, Feb 04, 2005 at 05:55:25AM -0800, McCarthy, Tim wrote:
> While the system is up and running, try a "fcadmin link_stats 8b"
> Look at the CRC column.
>
> It could be a bad controller on a disk or an ESH/LRC.
> Look for the first device to show frame errors.
> If it is the beginning of the shelf, it may be the ESH/LRC.
> Swap it out.
>
> If it is all of them, it could be a bad FC card, in which case, you need
> to swap it out.
>
> As always, please open a case
> --tmac
>
> -----Original Message-----
> From: Tavis Gustafson [mailto:tavis@hq.newdream.net]
> Sent: Friday, February 04, 2005 8:30 AM
> To: toasters(a)mathworks.com
> Subject: Fibre Channel Woes
>
> I'm running an 840 with a DS14 (144GB disks), connected to each other
> with a dual-port optical fibre channel card. Early this morning the
> filer went down, and upon reboot I started seeing lots of fibre channel
> frame errors. I swapped optical cables and tried the second port on the
> card, but the errors came back when rebuilding.
>
> I am trying to determine whether the problem is with the fibre channel
> card or with the LRC in the disk shelf. Does anybody know if this is
> indicated in these error messages?
>
> Thanks for any help
> -Tavis
>
>
> Volume State Status Options
> boot online reconstruct root, raidsize=7
> amplifier> Fri Feb 4 11:55:07 GMT [FastEnet-10/100/e3c:notice]: uid
> 30358 tid 1: disk quota exceeded on volume boot.
> onal warnings will be suppressed for approximately 60 minutes or until
> either a 'quota resize' is performed.
> Fri Feb 4 11:55:34 GMT [download.updateDone:info]: Bootblock update
> completed
> Fri Feb 4 11:56:35 GMT [FastEnet-10/100/e3c:notice]: uid 51154 tid 1:
> disk quota exceeded on volume boot. Additiona
> gs will be suppressed for approximately 60 minutes or until either a
> 'quota resize' is performed.
>
> amplifier> Fri Feb 4 12:00:00 GMT [kern.uptime.filer:info]: 12:00pm up
> 7 mins, 935006 NFS ops, 0 CIFS ops, 0 HTTP o
> FS ops, 0 FCP ops, 0 iSCSI ops
>
> amplifier> Fri Feb 4 12:08:09 GMT [wafl_hipri:notice]: uid 24271 tid 1:
> disk quota exceeded on volume boot. Additio
> ings will be suppressed for approximately 60 minutes or until either a
> 'quota resize' is performed.
> Fri Feb 4 12:13:59 GMT [scsi.cmd.pastTimeToLive:error]: Device 8b.80:
> request failed after try #1: cdb 0x1c.
> Fri Feb 4 12:14:23 GMT [scsi.cmd.checkCondition:error]: Device 8b.82:
> Check Condition: CDB 0x2a:09b24e98:0080: Sense
> SI:aborted command - Fibre Channel frame CRC error (0xb - 0x47 0x0
> 0x3)(57649).
> Fri Feb 4 12:14:23 GMT [scsi.cmd.checkCondition:error]: Device 8b.82:
> Check Condition: CDB 0x2a:09b24e18:0080: Sense
> SI:aborted command - Fibre Channel frame CRC error (0xb - 0x47 0x0
> 0x3)(57657).
> Fri Feb 4 12:14:31 GMT [scsi.cmd.retrySuccess:info]: Device 8b.82:
> request successful after retry #1: cdb 0x2a:09b24
> .
> Fri Feb 4 12:14:31 GMT [scsi.cmd.retrySuccess:info]: Device 8b.82:
> request successful after retry #1: cdb 0x2a:09b24
> .
> Fri Feb 4 12:14:31 GMT [wafl_lopri:warning]: NFS response to client
> 10.3.38.28 was slow, op was v3 read, 63 > 60 (in
> )
> Fri Feb 4 12:14:48 GMT [scsi.cmd.checkCondition:error]: Device 8b.84:
> Check Condition: CDB 0x2a:09b25418:0080: Sense
> SI:aborted command - Fibre Channel frame CRC error (0xb - 0x47 0x0
> 0x3)(16996).
> Fri Feb 4 12:14:51 GMT [scsi.cmd.checkCondition:error]: Device 8b.84:
> Check Condition: CDB 0x2a:09b25418:0080: Sense
> SI:aborted command - Fibre Channel frame CRC error (0xb - 0x47 0x0
> 0x3)(20148).
> Fri Feb 4 12:14:53 GMT [ispfc_timeout_1:warning]: 8b.81 (0x01000051)
> (0x034f47b0,0x2a:09b25298:0080,0/0,20150/0/0,80
> ommand timeout, quiescing drive to allow outstanding I/O to complete.
> Fri Feb 4 12:15:33 GMT [scsi.cmd.pastTimeToLive:error]: Device 8b.80:
> request failed after try #1: cdb 0x1c.
> Fri Feb 4 12:15:40 GMT [telnet_0:info]: root logged in from host:
> 10.3.67.21
> vol status
> Volume State Status Options
> Fri Feb 4 12:16:19 GMT [scsi.cmd.pastTimeToLive:error]: Device 8b.80:
> request failed after try #1: cdb 0x1c.
> Fri Feb 4 12:16:37 GMT [scsi.cmd.pastTimeToLive:error]: Device 8b.84:
> request failed after try #3: cdb 0x2a:09b25418
> Fri Feb 4 12:16:37 GMT [scsi.cmd.pastTimeToLive:error]: Device 8b.81:
> request failed after try #1: cdb 0x2a:09b25298
> Fri Feb 4 12:16:49 GMT [ispfc_timeout_1:warning]: 8b.92 (0x0100005c)
> (0x034f4d00,0x2a:09b25698:0080,0/0,19912/0/0,87
> ommand timeout, quiescing drive to allow outstanding I/O to complete.
> Fri Feb 4 12:16:49 GMT [ispfc_timeout_1:warning]: 8b.85 (0x01000055)
> (0x034f5690,0x2a:09b25218:0080,0/0,11587/0/0,87
> ommand timeout, quiescing drive to allow outstanding I/O to complete.
> Fri Feb 4 12:16:49 GMT [ispfc_timeout_1:warning]: 8b.84 (0x01000054)
> (0x034f6020,0x2f:0025a800:0400,0/0,20143/0/0,87
> ommand timeout, quiescing drive to allow outstanding I/O to complete.
> Fri Feb 4 12:16:49 GMT [ispfc_timeout_1:warning]: 8b.83 (0x01000053)
> (0x034f5be0,0x2a:09b25298:0080,0/0,20153/0/0,87
> ommand timeout, quiescing drive to allow outstanding I/O to complete.
> Fri Feb 4 12:16:49 GMT [ispfc_timeout_1:warning]: 8b.82 (0x01000052)
> (0x034f6350,0x2a:09b25418:0080,0/0,20104/0/0,87
> ommand timeout, quiescing drive to allow outstanding I/O to complete.
> Fri Feb 4 12:16:49 GMT [ispfc_timeout_1:warning]: 8b.81 (0x01000051)
> (0x034f6ac0,0x2f:0025ac00:0400,0/0,20153/0/0,87
> ommand timeout, quiescing drive to allow outstanding I/O to complete.
> Fri Feb 4 12:16:52 GMT [ispfc_timeout_1:error]: 8b.85 (0x01000055):
> global device timer timeout, initiating device r
> Fri Feb 4 12:16:52 GMT [scsi.cmd.abortedByHost:error]: Device 8b.92:
> Command aborted by host adapter: HA status 0x4:
> a:09b25698:0080.
> Fri Feb 4 12:16:52 GMT [scsi.cmd.abortedByHost:error]: Device 8b.92:
> Command aborted by host adapter: HA status 0x4:
> f:0025ac00:0400.
> Fri Feb 4 12:16:52 GMT [scsi.cmd.pastTimeToLive:error]: Device 8b.92:
> request failed after try #1: cdb 0x2a:09b25618
> Fri Feb 4 12:16:52 GMT [scsi.cmd.pastTimeToLive:error]: Device 8b.92:
> request failed after try #1: cdb 0x2a:09b25518
> Fri Feb 4 12:16:52 GMT [ispfc_timeout_1:error]: 8b.84 (0x01000054):
> global device timer timeout, initiating device r
> Fri Feb 4 12:16:52 GMT [scsi.cmd.abortedByHost:error]: Device 8b.85:
> Command aborted by host adapter: HA status 0x4:
> f:0025a800:0400.
> Fri Feb 4 12:16:52 GMT [scsi.cmd.pastTimeToLive:error]: Device 8b.85:
> request failed after try #1: cdb 0x2a:09b25098
> Fri Feb 4 12:16:52 GMT [scsi.cmd.abortedByHost:error]: Device 8b.85:
> Command aborted by host adapter: HA status 0x4:
> 8:109ed500:0040.
> Fri Feb 4 12:16:52 GMT [scsi.cmd.abortedByHost:error]: Device 8b.85:
> Command aborted by host adapter: HA status 0x4:
> a:09b25218:0080.
> Fri Feb 4 12:16:52 GMT [ispfc_timeout_1:error]: 8b.83 (0x01000053):
> global device timer timeout, initiating device r
> Fri Feb 4 12:16:52 GMT [scsi.cmd.pastTimeToLive:error]: Device 8b.84:
> request failed after try #1: cdb 0x2f:0025a800
> Fri Feb 4 12:16:52 GMT [ispfc_timeout_1:error]: 8b.82 (0x01000052):
> global device timer timeout, initiating device r
> Fri Feb 4 12:16:52 GMT [scsi.cmd.abortedByHost:error]: Device 8b.83:
> Command aborted by host adapter: HA status 0x4:
> a:09b25298:0080.
> Fri Feb 4 12:16:52 GMT [scsi.cmd.pastTimeToLive:error]: Device 8b.83:
> request failed after try #1: cdb 0x2a:09b25198
> Fri Feb 4 12:16:52 GMT [scsi.cmd.abortedByHost:error]: Device 8b.83:
> Command aborted by host adapter: HA status 0x4:
> f:0025ac00:0400.
> Fri Feb 4 12:16:52 GMT [ispfc_timeout_1:error]: 8b.81 (0x01000051):
> global device timer timeout, initiating device r
> Fri Feb 4 12:16:52 GMT [scsi.cmd.abortedByHost:error]: Device 8b.82:
> Command aborted by host adapter: HA status 0x4:
> f:0025ac00:0400.
> Fri Feb 4 12:16:52 GMT [scsi.cmd.abortedByHost:error]: Device 8b.82:
> Command aborted by host adapter: HA status 0x4:
> a:09b25398:0080.
> Fri Feb 4 12:16:52 GMT [scsi.cmd.abortedByHost:error]: Device 8b.82:
> Command aborted by host adapter: HA status 0x4:
> a:09b25418:0080.
> Fri Feb 4 12:16:52 GMT [scsi.cmd.abortedByHost:error]: Device 8b.81:
> Command aborted by host adapter: HA status 0x4:
> f:0025ac00:0400.
> Fri Feb 4 12:16:54 GMT [ispfc_timeout_1:warning]: 8b.80 (0x01000050)
> (0x034f58b0,0x1c,0/0,56193/0/0,8745/0): command
> , quiescing drive to allow outstanding I/O to complete.
> Fri Feb 4 12:17:00 GMT [scsi.cmd.pastTimeToLive:error]: Device 8b.80:
> request failed after try #1: cdb 0x1c.
> Fri Feb 4 12:17:08 GMT [scsi.cmd.retrySuccess:info]: Device 8b.85:
> request successful after retry #1: cdb 0x28:109ed
> .
> Fri Feb 4 12:17:25 GMT [scsi.cmd.retrySuccess:info]: Device 8b.82:
> request successful after retry #1: cdb 0x2a:09b25
> .
> Fri Feb 4 12:17:25 GMT [scsi.cmd.pastTimeToLive:error]: Device 8b.89:
> request failed after try #1: cdb 0x28:10bc5c38
> Fri Feb 4 12:17:25 GMT [scsi.cmd.checkCondition:error]: Device 8b.83:
> Check Condition: CDB 0x2a:09b25298:0080: Sense
> SI:aborted command - Fibre Channel frame CRC error (0xb - 0x47 0x0
> 0x3)(42100).
> F
> PANIC: raid volfsm: vol boot: fatal multi-disk error. in process
> config_thread on release NetApp Release 6.4.5 on Fri
> 12:17:25 GMT 2005
>
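
Since the advice above is to look for the first device to show frame
errors, a small Python sketch like the following could tally the console
messages per device and record the order in which devices first appear.
The regular expression is an assumption based only on the message shapes
visible in this log ("[event:severity]: Device 8b.NN" or
"[event:severity]: 8b.NN"), and tally_events is a hypothetical helper
name, not an existing tool.

```python
import re
from collections import Counter

# Assumed shape: "[event:severity]: " then an optional "Device " and a loop ID.
EVENT_RE = re.compile(
    r"\[(?P<event>[\w.]+):(?P<severity>\w+)\]:\s+"
    r"(?:Device\s+)?(?P<device>\d+[a-z]\.\d+)"
)

# First lines of a few messages copied from the log above.
SAMPLE_LOG = """\
> Fri Feb 4 12:13:59 GMT [scsi.cmd.pastTimeToLive:error]: Device 8b.80:
> Fri Feb 4 12:14:23 GMT [scsi.cmd.checkCondition:error]: Device 8b.82:
> Fri Feb 4 12:14:23 GMT [scsi.cmd.checkCondition:error]: Device 8b.82:
> Fri Feb 4 12:14:48 GMT [scsi.cmd.checkCondition:error]: Device 8b.84:
> Fri Feb 4 12:14:53 GMT [ispfc_timeout_1:warning]: 8b.81 (0x01000051)
> Fri Feb 4 12:15:40 GMT [telnet_0:info]: root logged in from host:
"""


def tally_events(text):
    """Count (device, event) pairs and list devices in order of first appearance."""
    counts = Counter()
    first_seen = []
    for line in text.splitlines():
        match = EVENT_RE.search(line)
        if match is None:
            continue  # continuation lines and messages without a device ID
        device = match["device"]
        if device not in first_seen:
            first_seen.append(device)
        counts[(device, match["event"])] += 1
    return counts, first_seen


if __name__ == "__main__":
    counts, first_seen = tally_events(SAMPLE_LOG)
    print("order of first appearance:", first_seen)
    for (device, event), n in sorted(counts.items()):
        print(device, event, n)
```

Because the console lines here are wrapped and truncated, the sketch
matches only the first line of each message and ignores the rest.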