How many retries on a disk before you pre-fail? - toasters

4 Aug 2012


I searched the syslog on this 3270 head for "retry" over the last 7 days and got disk retry messages like this one:
Aug  3 20:55:29 [na04:scsi.cmd.retrySuccess:debug]: Enclosure services device 3a.03.99: request successful after retry #1/#0: cdb 0x3c (1301).
I then ran a frequency count to find the disks with chronic retries:
awk '{print $12}' irt-na04.retry | sort | uniq -c | sort -nr
  15 3a.06.99:
  14 3a.05.99:
  13 3a.03.99:
  11 3a.04.99:
  10 3a.02.99:
  10 3a.01.99:
   9 3c.00.99:
   7 0a.08.99:
   6 0a.07.99:
   4 3a.08.99:
   3 3a.07.99:
   3 0b.00.99:
   3 0a.05.99:
   3 0a.03.99:
   2 0a.06.99:
   2 0a.04.99:
   1 0a.02.99:
   1 0a.01.99:
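
For anyone repeating this count: in the sample message as shown above, the device address is the 8th whitespace-separated field, so `$12` presumably reflects extra leading fields in the saved irt-na04.retry file. A rough sketch that matches the address by pattern instead of by field position is below. The function name `retry_counts` is made up for illustration, the address pattern is an assumption based on the addresses listed above, and `grep -o` is a GNU/BSD extension rather than strict POSIX.

```shell
# Hypothetical helper: count retrySuccess messages per device address.
# Reads syslog lines on stdin and prints "<count> <address>", highest count first.
retry_counts() {
    # Keep only the retry-success events (fixed-string match).
    grep -F 'scsi.cmd.retrySuccess' |
        # Assumed address shape: "device 3a.03.99" (adapter.shelf.bay).
        grep -oE 'device [0-9a-z]+(\.[0-9]+)+' |
        awk '{print $2}' |
        sort | uniq -c |
        # Sort by count descending, then by address so ties are stable.
        sort -k1,1nr -k2,2 |
        # Normalize the padding that uniq -c adds.
        awk '{print $1, $2}'
}
```

It would be run as `retry_counts < irt-na04.retry`. Since it keys on the word "device" followed by an address, it should not care whether the syslog server prepended its own timestamp or hostname.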
Q: At what level of retries should we consider pre-failing a disk and replacing it proactively?
Will the retries cause performance issues if ignored?
thanks,
Fletcher