Folks, the original article refers to "bit errors" on SATA disk drives creating "problems". The type of problem this creates is typically a media error. Media errors are like death and taxes: you know they're inevitable; it's only a question of when.
Five years ago, NetApp realized that areal density was on the rise and that simply relying on normal disk accesses and weekly RAID scrubs to find and fix media errors was insufficient. We actually saw a spike in double disk failures caused by media-related failures. Ironically, this was happening on the native FC disk subsystems, due to media substrate defects; the SATA drive environments were simply not on the scene yet.
With the 6.4.2 release of Data ONTAP, we created a new background media scan feature. This process runs in the background and ensures that media errors are detected and fixed by RAID before they can escalate into a situation where double disk failures could occur.
Over the releases that have followed, this core piece of Data ONTAP has been modified and updated to keep pace with the introduction of larger-capacity disks. Background media scan, together with RAID-DP and on-disk checksums, effectively provides industry-leading data integrity in this area.
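To make that concrete, here is a rough sketch of the idea (illustrative Python only, not actual Data ONTAP code; the ToyRaidGroup class, its methods, and the error rate are all invented for this example):

# Sketch of a background media scan: sweep every block at low priority,
# and when a drive reports a media error, rebuild the block from parity
# and rewrite it, so the latent error is gone before a second failure
# can join it.
import random
import time

class MediaError(Exception):
    pass

class ToyRaidGroup:
    """Stand-in for a RAID group; the real logic lives inside the storage OS."""
    def __init__(self, nblocks):
        self.nblocks = nblocks

    def read(self, block):
        if random.random() < 1e-4:               # rare latent media error
            raise MediaError(block)

    def reconstruct_from_parity(self, block):
        return b"recovered"                      # parity rebuild, as discussed below

    def rewrite(self, block, data):
        pass                                     # drive remaps the bad sector on write

def background_media_scan(rg, idle_sleep=0.0):
    for block in range(rg.nblocks):
        try:
            rg.read(block)
        except MediaError:
            rg.rewrite(block, rg.reconstruct_from_parity(block))
        time.sleep(idle_sleep)                   # throttle so foreground I/O wins

background_media_scan(ToyRaidGroup(100_000))

The point is simply that every block gets read on a regular schedule, so a latent media error is found and repaired while the rest of the RAID group is still healthy.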
Hope this helps,
- Doug
Doug Coatney
Senior Software Engineer, Storage Systems Team
NetApp
408.822.3708 Direct
408.822.4579 Fax
dougc@netapp.com
www.netapp.com
-----Original Message-----
From: Jan-Pieter Cornet [mailto:johnpc@xs4all.nl]
Sent: Thursday, June 05, 2008 5:59 AM
To: David Lee Lambert
Cc: toasters@mathworks.com
Subject: Re: Flaw with SATA disks - not suitable for deduplication environment
On Thu, Jun 05, 2008 at 08:16:26AM -0400, David Lee Lambert wrote:
I disagree with the statement that "RAID technology [...] does not detect if a specific bit on a SATA drive becomes unreadable." If a bit is unreadable or reads back as the wrong value, RAID-5 (or NetApp's RAID-DP) can detect and fix the error when the data is read back.
*bzzt* nope. RAID by itself does not "detect" bad data. It can correct bad data when it's detected (by other means, usually the hardware driver), but it doesn't detect it.
At least, that's not how it's implemented in any RAID system that I'm aware of. To _detect_ bad data with RAID, you'd have to read an entire stripe and verify that the parity is correct. If it isn't, then AT LEAST ONE of the blocks is in error, but it's impossible to determine which one without additional information.
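To illustrate with a toy example (plain Python; the four-block stripe and the xor_blocks helper are made up for this sketch, not any vendor's code):

# Why single parity can only *detect* a silently corrupted stripe,
# not pin down which block is wrong.

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)

# Four data blocks plus one parity block, as in a 4+1 RAID-4/5 stripe.
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = xor_blocks(data)

# A drive silently returns a corrupted block 2, but still reports success.
corrupted = list(data)
corrupted[2] = b"CCCE"

# Recomputing parity over what was read tells us the stripe is bad...
print(xor_blocks(corrupted) == parity)   # False: corruption detected
# ...but nothing says *which* of the five blocks is wrong, so single
# parity alone cannot repair a silent error.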
RAID-DP might distill that info from the diagonal parity (if it's a single-block error), but you can hardly expect your super-fast networked storage hardware to go and solve crossword puzzles for you every time you request a block of data.
RAID works because drives don't just silently flip bits: they fail, or the CRC on the drive block fails. Either way, there's some indication that something is amiss on a particular drive. RAID then offers the ability to rebuild the failed data from the other drives, using the parity.
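Here's the case RAID is actually designed for, again as a toy sketch with the same made-up helper: the drive identifies the unreadable block, and XOR of the surviving blocks plus parity rebuilds it.

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)

data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = xor_blocks(data)

failed_index = 2                       # drive 2 reports a read error on this block
survivors = [blk for i, blk in enumerate(data) if i != failed_index]
rebuilt = xor_blocks(survivors + [parity])

print(rebuilt == data[failed_index])   # True: the flagged block is recovered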
(Note that if the data is never read back, then it's immaterial whether it's correct.)
Deduplication does not increase this risk. In fact, deduplication means that the duplicated data is read back more often, which should mean that any errors that occur would be detected sooner.
On Thursday 05 June 2008 06:54:15 am Chong, Jenson wrote:
Hi,
Anybody have insights on this article? Does our ONTAP "Lost Write Protection" address this issue?
Regards, Jenson
A bit of a flaw with SATA disk drives
June 03, 2008
By Jerome Wendt, Network World Asia

High-capacity serial ATA (SATA) disk drives are now a mainstay in many storage systems and make it feasible for almost any company to obtain a storage system with terabytes of capacity at a reasonable cost. Yet these systems reveal a specific, known deficiency of SATA disk drives that demands companies exercise caution as to what environments they deploy these systems into.
A minor flaw with SATA disk drives that high-capacity storage systems expose is their bit error rate. Bit errors occur infrequently - about once for every 100 trillion bits. However, RAID technology, which is normally used by storage systems to protect against data loss, does not detect if a specific bit on a SATA drive becomes unreadable.
While this is normally not a problem on smaller systems, as storage systems add more capacity the issue becomes more acute. On systems with more than 10TB of capacity, a specific bit of data becoming unreadable is a distinct possibility. On systems with over 100TB, it becomes almost a certainty.
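A quick back-of-the-envelope calculation shows where those numbers come from, taking the quoted rate of roughly one unreadable bit per 100 trillion bits at face value and assuming errors are independent (illustrative Python, not part of the original article):

import math

ERROR_RATE = 1e-14            # assumed probability that any single bit read fails

def p_at_least_one_error(terabytes):
    bits = terabytes * 1e12 * 8
    return 1 - math.exp(-bits * ERROR_RATE)

for tb in (1, 10, 100):
    print(f"{tb:>4} TB read end to end: "
          f"{p_at_least_one_error(tb):.0%} chance of hitting an unreadable bit")

# Roughly 8% at 1TB, 55% at 10TB, and essentially 100% at 100TB, which is
# why it is "a distinct possibility" at 10TB and "almost a certainty" at 100TB.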
So the question becomes: Does losing access to one bit of data really matter? Often it doesn't, unless one stores deduplicated data on these systems, which is now the fastest-growing trend in data storage. When data is deduplicated, the storage system's need to read every bit of data becomes paramount. The inability to access even a bit of data can result in multiple files becoming unreadable, since they all may depend on a specific bit of data to complete their reconstruction.
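As a toy illustration of that failure mode (the data structures here are invented for the example, not any particular vendor's on-disk layout), three files share one deduplicated block, and losing that single block damages all of them:

# Each file is stored as a list of block fingerprints; identical blocks
# are kept only once in a shared block store.
block_store = {"h1": b"common header", "h2": b"unique payload A",
               "h3": b"unique payload B"}
files = {
    "invoice_jan.doc": ["h1", "h2"],
    "invoice_feb.doc": ["h1", "h3"],
    "invoice_mar.doc": ["h1", "h2"],
}

# The block behind fingerprint "h1" becomes unreadable on disk.
del block_store["h1"]

unreadable = [name for name, blocks in files.items()
              if any(fp not in block_store for fp in blocks)]
print(unreadable)   # all three files are now damaged by a single lost block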
High-capacity SATA-based storage systems are the answer to many companies' archiving and backup problems. But SATA bits can bite, and using SATA drives to store large amounts of deduplicated data is not always the match made in heaven that vendors make it out to be.
Jerome Wendt is the president and lead analyst at DCIG Inc. You may read his blogs at www.dciginc.com.
--
David L. Lambert
Software Developer, Precision Motor Transport Group, LLC
Work phone 517-349-3011 x215
Cell phone 586-873-8813
--
Jan-Pieter Cornet <johnpc@xs4all.nl>
!! Disclamer: The addressee of this email is not the intended recipient. !!
!! This is only a test of the echelon and data retention systems. Please !!
!! archive this message indefinitely to allow verification of the logs. !!