I ran a syslog search for "retry" on this 3270 head for the last 7 days to get these disk retry messages:
Aug 3 20:55:29 [na04:scsi.cmd.retrySuccess:debug]: Enclosure services device 3a.03.99: request successful after retry #1/#0: cdb 0x3c (1301).
Then I ran a frequency count to find the disks with chronic retries:
    awk '{print $12}' irt-na04.retry | sort | uniq -c | sort -nr

     15 3a.06.99:
     14 3a.05.99:
     13 3a.03.99:
     11 3a.04.99:
     10 3a.02.99:
     10 3a.01.99:
      9 3c.00.99:
      7 0a.08.99:
      6 0a.07.99:
      4 3a.08.99:
      3 3a.07.99:
      3 0b.00.99:
      3 0a.05.99:
      3 0a.03.99:
      2 0a.06.99:
      2 0a.04.99:
      1 0a.02.99:
      1 0a.01.99:
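A note on the field number: in the sample message as quoted above, the device ID is the eighth whitespace-separated field, so the $12 in the pipeline presumably reflects extra prefix fields in the file on the syslog server (an assumption; the file itself is not shown). A minimal sketch that avoids depending on the field position is to print whatever follows the word "device". The function name count_retry_devices is hypothetical, and the sketch assumes every line of interest contains "device <id>:" as in the sample.

```shell
# Hypothetical helper: count retrySuccess messages per device ID.
# Assumes each message contains the word "device" followed by the ID
# (for example "device 3a.03.99:"), as in the sample line quoted above.
count_retry_devices() {
    awk '{
        for (i = 1; i < NF; i++) {
            if ($i == "device") {
                id = $(i + 1)
                sub(/:$/, "", id)   # drop the trailing colon
                print id
                break               # one device ID per log line
            }
        }
    }' "$@" | sort | uniq -c | sort -nr
}

# Usage: count_retry_devices irt-na04.retry
```

Lines that do not contain the word "device" are skipped, so the same function can be pointed at an unfiltered log only if no other message type uses that word.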
Q: At what level of retries should we consider pre-failing a disk and replacing it proactively? Will the retries cause performance issues if ignored?
thanks,
Fletcher
ONTAP manages that for you, and softly pre-fails disks when specific thresholds are met.
Doesn't need user management.
Retries are normal.
Sent from my iPhone
On Aug 4, 2012, at 9:47 AM, Fletcher Cocquyt fcocquyt@stanford.edu wrote:
Toasters mailing list Toasters@teaparty.net http://www.teaparty.net/mailman/listinfo/toasters
Fletcher,
I got a batch of Seagate 1 TB SATA drives (288) in DS4243s that do this on a somewhat regular basis. The log file showed they had spun down and it took them a while to spin back up when requested. These are on a NearStore so I don't worry much about the performance aspect. I also have a lot of spares so I just wait until the system fails them. If these are SAS drives on a primary filer I would expect the retries to cause latency issues. I would install the latest disk firmware bundle and see if it helps.
Jeff
On Sat, Aug 4, 2012 at 11:35 AM, Jeff Mother speedtoys.racing@gmail.com wrote: