Hi Toasters,
I found this post interesting so I polled some of our Engineering folks and this was the consensus opinion:
This is not really a problem for any modern storage system. It is true that errors can develop over time, or even during the manufacture of disk drives. Every drive has built-in error correction codes (ECC) that detect, and usually correct, such bit errors. If a string of errors is too long for the ECC to handle, the drive reports the sector as unreadable, at which point the RAID algorithms "fix" the error from the information stored on other sectors. Our RAID-DP means two drives can even have the same data in error, and we can still recover.

Our SATA drives also use a checksum scheme for further protection: we use an additional portion of the drive as overhead to store checksums that move with the data through the system, to ensure that what was written is what is returned to the application. In essence, this is a third level of protection. The same type of protection against bit errors covers our FC drive systems.

Mr. Wendt is correct in indicating that dedupe (a.k.a. block sharing) could exacerbate any data corruption event. We take this seriously and believe that the extra steps NetApp has taken create one of the most reliable deduplication architectures in the industry.
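To make the checksum layer concrete, here is a minimal Python sketch (not NetApp's actual implementation; the 4 KB block size and SHA-256 choice are purely illustrative) of how a checksum stored alongside each block catches a corrupted block on read. In a real array a mismatch would be repaired from RAID parity rather than surfaced as an error to the caller.

    import hashlib
    import os

    BLOCK_SIZE = 4096  # illustrative block size, not NetApp's actual layout

    def write_block(store, block_id, data):
        # Store the block together with a checksum computed over its contents.
        store[block_id] = (data, hashlib.sha256(data).hexdigest())

    def read_block(store, block_id):
        # Verify the stored checksum before returning data to the application.
        data, checksum = store[block_id]
        if hashlib.sha256(data).hexdigest() != checksum:
            # In a real array a mismatch would trigger reconstruction of the
            # block from RAID parity instead of an error to the caller.
            raise IOError("checksum mismatch on block %d" % block_id)
        return data

    store = {}
    write_block(store, 0, os.urandom(BLOCK_SIZE))
    print(len(read_block(store, 0)))   # prints 4096: data verified and returned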
Regards,
Larry Freeman NetApp
From: "Karlsson Ulf Ibrahim :ULK" ulk@forsmark.vattenfall.se Date: June 5, 2008 4:42:22 AM PDT To: toasters@mathworks.com Subject: SV: Flaw with SATA disks - not suitable for deduplication environment NetApp's Raid-DP is the answer you're looking for. /Ulwur
-----Original Message-----
From: owner-toasters@mathworks.com [mailto:owner-toasters@mathworks.com] On Behalf Of Chong, Jenson
Sent: June 5, 2008 12:54
To: toasters@mathworks.com
Subject: Flaw with SATA disks - not suitable for deduplication environment

Hi, does anybody have insights on this article? Does our ONTAP "Lost Write Protection" address this issue?

Regards, Jenson
A bit of a flaw with SATA disk drives
June 03, 2008 By Jerome Wendt Network World Asia
High-capacity serial ATA (SATA) disk drives are now a mainstay in many storage systems and make it feasible for almost any company to obtain a storage system with terabytes of capacity at a reasonable cost. Yet these systems expose a specific, known deficiency of SATA disk drives that demands companies exercise caution about the environments into which they deploy them.
A minor flaw with SATA disk drives that high-capacity storage systems expose is their bit error rate. Bit errors occur infrequently, roughly once for every 100 trillion bits read. However, RAID technology, which storage systems normally use to protect against data loss, does not detect if a specific bit on a SATA drive becomes unreadable.
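To illustrate the mechanism in question, here is a minimal single-parity sketch in Python (simplified to hypothetical four-byte "sectors"; real arrays work on full sectors and stripes) of how RAID rebuilds data when a drive reports a sector unreadable. The repair depends on the drive flagging the error in the first place.

    def xor_blocks(blocks):
        # Bytewise XOR of equal-length blocks.
        out = bytearray(len(blocks[0]))
        for block in blocks:
            for i, b in enumerate(block):
                out[i] ^= b
        return bytes(out)

    # Hypothetical 4-byte "sectors" on three data drives plus one parity drive.
    data = [b"\x11\x22\x33\x44", b"\xaa\xbb\xcc\xdd", b"\x01\x02\x03\x04"]
    parity = xor_blocks(data)

    # Drive 1 reports its sector unreadable; rebuild it from the surviving
    # drives and the parity drive.
    rebuilt = xor_blocks([data[0], data[2], parity])
    assert rebuilt == data[1]
    print(rebuilt.hex())   # aabbccdd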
While this is normally not a problem on smaller systems, the issue becomes more acute as storage systems add capacity. On systems with more than 10TB of capacity, a specific bit of data becoming unreadable is a distinct possibility. On systems with over 100TB, it becomes almost a certainty.
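The arithmetic behind that claim is a simple expected-value estimate. A rough back-of-the-envelope Python sketch (not from the article; it assumes the quoted rate of one unreadable bit per 100 trillion bits and a single full read of the system):

    URE_RATE = 1e-14   # about one unreadable bit per 100 trillion bits read

    def expected_unreadable_bits(capacity_tb):
        # Expected number of unreadable bits if the whole system is read once.
        bits = capacity_tb * 1e12 * 8   # decimal terabytes to bits
        return bits * URE_RATE

    for tb in (1, 10, 100):
        print("%4d TB -> %.2f expected unreadable bits per full read"
              % (tb, expected_unreadable_bits(tb)))
    # 1 TB -> 0.08, 10 TB -> 0.80, 100 TB -> 8.00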
So the question becomes: does losing access to one bit of data really matter? Often it doesn't, unless one stores deduplicated data on these systems, which is now the fastest-growing trend in data storage. When data is deduplicated, the storage system's need to read every bit of data becomes paramount. The inability to access even a single bit can leave multiple files unreadable, since they may all depend on that bit of data to complete their reconstruction.
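To see why block sharing amplifies the damage, here is a toy Python sketch (hypothetical structures, not any vendor's on-disk format): many files reference the one stored copy of a block, so a single unreadable block can make all of them impossible to reconstruct.

    import hashlib

    block_store = {}    # content hash -> the single stored copy of the block
    file_recipes = {}   # file name -> ordered list of block hashes

    def dedup_write(name, blocks):
        # Store each unique block once; the file keeps only references.
        recipe = []
        for block in blocks:
            digest = hashlib.sha256(block).hexdigest()
            block_store.setdefault(digest, block)
            recipe.append(digest)
        file_recipes[name] = recipe

    shared = b"boilerplate block common to many documents"
    for name in ("report.doc", "backup.doc", "archive.doc"):
        dedup_write(name, [shared, name.encode()])

    # If the single stored copy of the shared block becomes unreadable,
    # every file whose recipe references it can no longer be reconstructed.
    bad_digest = hashlib.sha256(shared).hexdigest()
    affected = [n for n, r in file_recipes.items() if bad_digest in r]
    print(affected)   # ['report.doc', 'backup.doc', 'archive.doc']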
High-capacity SATA-based storage systems are the answer to many companies' archiving and backup problems. But SATA bits can bite, and using SATA drives to store large amounts of deduplicated data is not always the match made in heaven that vendors make it out to be.
Jerome Wendt is the president and lead analyst at DCIG Inc. You may read his blog at http://www.dciginc.com/.