Mr. Benway goes into some detail regarding disk I/O based on rotation speeds and access times:
"...The component numbers work out to be 3ms average rotational delay (10,000 rpm), random mode access times of 5.2ms (read mode) or 6ms (write mode, more precise alignment required), and a minimum track-to-track seek time of .6ms (reads) or .9ms (writes). On a filer, where reads usually far outnumber the writes, use 5.2ms for the random mode calculation, and .6ms for the sequential. This will get you close enough. Hence, for random reads, the aggregate read access time is 8.2ms, yielding 122 IOPS (1,000ms / 8.2ms). For sequential mode, the aggregate read access time will be 3.6ms, yielding 278 IOPS (1,000ms / 3.6ms)."
Earlier in the article he used these disk figures (122 and 278 IOPS) to ascertain potential total system disk IOPS for the configurations posted at the SPEC website by NetApp. System 1 (his nomenclature) had 16 data disks, and therefore 16x122 = 1952 IOPS for random reads or 16x278 = 4448 IOPS for sequential. System 4 had 34 data disks for 4148 and 9452 IOPS. Somehow, Mr. Benway makes the leap to use these data points as a basis for comparison between EMC and NetApp. Since the IP4700 configurations posted on the SPEC website by EMC have many more disks (90) than either NetApp config, he suggests the paucity of RAM available to the EMC IP4700 (1.5GB) accounts for the difference:
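For anyone who wants to reproduce the numbers, here is a small sketch of that per-disk arithmetic. The figures are the article's; the variable and system names are mine.

```python
# Per-disk IOPS from the quoted component times (figures from the article).
ROTATIONAL_DELAY_MS = 3.0   # average rotational delay at 10,000 rpm
RANDOM_READ_SEEK_MS = 5.2   # random-mode access time, reads
TRACK_TO_TRACK_MS = 0.6     # minimum track-to-track seek, reads

def per_disk_iops(access_time_ms):
    # I/O operations per second = 1,000 ms divided by the aggregate access time
    return round(1000.0 / access_time_ms)

random_iops = per_disk_iops(ROTATIONAL_DELAY_MS + RANDOM_READ_SEEK_MS)    # 122
sequential_iops = per_disk_iops(ROTATIONAL_DELAY_MS + TRACK_TO_TRACK_MS)  # 278

# Scale by the data-disk counts in the posted SPEC configurations:
for name, data_disks in [("System 1", 16), ("System 4", 34)]:
    print(f"{name}: {data_disks * random_iops} random IOPS, "
          f"{data_disks * sequential_iops} sequential IOPS")
```

Run it and you get exactly the 1952/4448 and 4148/9452 figures above, so at least the multiplication is sound; it's the use of those products that I take issue with below.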
"...So why does System 5 (the EMC IP4700) falter in comparison to {NetApp}? ...EMC has only half of the useable RAM as available to the F840. Hence, there isn't enough RAM (work space) on the EMC to take advantage of its expected higher disk throughput."
The EMC cluster has 500% of the disk IOPS potential (90/18 = 5) but only 30% higher throughput (11,451/8,820 ≈ 1.3*; OPS/sec statistics from his Figure 1 excerpt of the SPECsfs97 results) than a single filer, and he explains this away with memory? Should one assume that if the EMC IP4700 had the same memory as the F840 (3GB), it would serve data 500% faster than the NetApp?! *Note: I used Systems 1 and 5 for the comparison so that both results share a common benchmark protocol (UDP-3).
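The ratio check behind that paragraph, spelled out (numbers are from the figures quoted in this post; the labels are mine):

```python
# Disk-count "potential" vs. measured SPECsfs97 throughput.
emc_disks, netapp_disks = 90, 18
emc_ops_per_sec, netapp_ops_per_sec = 11451, 8820

disk_potential = emc_disks / netapp_disks        # 5.0x -> "500%"
measured = emc_ops_per_sec / netapp_ops_per_sec  # ~1.3x -> only ~30% higher

print(f"disk potential: {disk_potential:.1f}x, "
      f"measured throughput: {measured:.2f}x")
```

If hardware were the whole story, those two ratios should be in the same neighborhood; a 5x disk advantage delivering a 1.3x result is what begs for a software explanation.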
Only after making this dubious claim that hardware primarily accounts for the performance differences does he suggest, in very vague terms, that software may have something to do with the disparity:
"...You would also have to know more about the EMC operating system and file system efficiency versus that of the NetApp in order to come to any further conclusions."
BINGO! There's your answer, Mr. Benway. NetApp is a software company, not a hardware company. You referred to NetApp's whitepaper TR3002 in your article, but I think you would have done well to check TR3001 first (www.netapp.com/tech_library/3001.html). Particularly the sections on RAID and system performance/disk bottlenecks:
"...the FFS (the Berkeley Fast File System) was designed to optimize writes for one file at a time. As a result, it typically writes blocks for different files to widely separated locations on disk... Since the Berkeley FFS doesn't understand the underlying RAID 4 layout, it tends to generate requests that are scattered over the data disks, causing the parity disk to seek excessively. WAFL writes blocks in a pattern designed to minimize seeks on the parity disk."
This excerpt from TR3001 only addresses how file data is laid out on disks. It does not even touch on the additional disk seek activity that other file systems require. Most of us have probably seen the PowerPoint slide that depicts the drastic reduction in disk seeks required to retrieve or write data, owing to the way WAFL manages disk space, when compared to other file systems, including the CrosStor file system implemented by EMC in their IP4700 (please correct me if I'm wrong there). NetApp filer performance is driven by the software, and the hardware is designed to deliver that performance through best-value off-the-shelf configurations (as Mr. Benway noted, it is not critical to have the fastest CPU on the market today to deliver this performance).
I readily admit that I am strongly biased toward NetApp, and I hope this doesn't come across as EMC-bashing. If I have something wrong here, I would like to know. My customers count on me to stay on top of the storage market and make appropriate recommendations for their requirements. Right now, with regard to performance, NetApp has a unique advantage with Data ONTAP and WAFL, as I see it. Of course, performance is just one issue, and often a minor one, in determining the best storage solution for a given requirement.
Just my $.02. Thanks for providing a forum for me to ramble. :-)
Joe
Joe Luchtenberg Dataline, Inc. 757.858.0600 757.285.1223 (cell) 757.858.0606 (fax) joe.luchtenberg@data-line.com www.data-line.com
-----Original Message----- From: Todd C. Merrill [mailto:tmerrill@mathworks.com] Sent: Friday, March 23, 2001 6:01 PM To: toasters@mathworks.com; emcnas@mathworks.com Subject: NAS Wars
A recent article in Server/Workstation Expert, entitled "NAS Wars," might be of interest here:
http://swexpert.com/CB/SE.C11.MAR.01.pdf
I applaud Alan's efforts at applying a bit of math and a lot of common sense to such a sticky issue.
This isn't meant to start a flame war, though something I read as a kid on a package of firecrackers comes to mind: "Light fuse, and run away fast!" ;)
Personally, I would suggest reading the first three sentences of the conclusion first, and with that perspective, read the rest of the article. Read that way, the methods and assumptions and calculations and configurations make for good detective work, IMHO.
Until next time...
The Mathworks, Inc. 508-647-7000 x7792 3 Apple Hill Drive, Natick, MA 01760-2098 508-647-7001 FAX tmerrill@mathworks.com http://www.mathworks.com ---