http://www.spec.org/osg/sfs97/results/res2000q3/sfs97-20000905-00200.html
What's special about a "scalable storage cluster", or is that just a new marketing name given to a bunch of independent F840's? A quarter million ops per second... sweet. :)
On Mon, 6 Nov 2000, Brian Tao wrote:
http://www.spec.org/osg/sfs97/results/res2000q3/sfs97-20000905-00200.html
What's special about a "scalable storage cluster", or is that just
a new marketing name given to a bunch of independent F840's? A quarter million ops per second... sweet. :)
Clueless customer: "Wow, that sucks compared with EMC!"
EMC could then create their own "scalable storage cluster" by latching 16 Celerra's together to get ~1.6 M ops/s throughput.
http://www.spec.org/osg/sfs97/results/res2000q3/sfs97-20000711-00180.html
Clueless customer: "Wow, that sucks compared with NetApp!"
NetApp and EMC have to stop staging their own arms race with these SPEC numbers. The Celerra was designed to hold 14 datamovers (N+1 failover) and it is fair to test it as such. Filers were designed to be clustered in pairs, and it is fair to test them as such. It is not fair, in my opinion, to test "clusters" by stringing N of each of them together. The limit on scaling in such scenarios then becomes, effectively, infinite, which is meaningless.
EMC gained over 100,000 ops/s with their Celerra the hard way: NFS v3 over TCP. NetApp gained their numbers the easy way: NFS v2 over UDP. My plea to NetApp is: please publish NFS v3 over TCP numbers, so we customers can make a fair comparison. You used to do this with the F760s. Or, conversely, to EMC: please publish NFS v2 over UDP. Then we can see who has the bigger di...<nevermind>. [2]
Anyone who takes these benchmark numbers at face value is a fool and deserves to have their money taken away from them by either vendor, for not doing their homework.
My challenge to BOTH vendors is: in addition to "maximum throughput" configurations and numbers, please publish SPEC numbers in REAL-LIFE configurations, configurations that customers actually use. [2,3] (Note the plurality.)
In the meantime, stop the foolishness, boys.
[unreferenced footnote] Sorry, folks, for the rant, but this really pushes my buttons. As some of you may know, The Mathworks now has NetApp filers *and* an EMC Celerra/Symmetrix. We went through all this number bullshit for months with both vendors [1], so I hate to see this foolishness again. When you dive into the numbers, both vendors' CPU units (filer heads or datamovers) are "comparable." Sometimes one is a bit ahead of the other, sometimes vice versa. But, they are approximately the same when you are able to sift through the numbers and compare apples to apples. If there were a clear winner in the strict ops/s game, then everyone would buy from that vendor if all they needed were ops/s. Luckily for us customers, there is healthy competition, which gives us what we need: better performance year after year. And, one more thing: good buying decisions are rarely as one-sided as choosing one vendor for one specification. Look at the whole picture: performance, scalability, reliability, ease of use, service, in-house knowledge/experience, etc.
[1] With the numbers published so far, for F760s, for instance, we can see there is, approximately and conservatively, a 45% scaling factor between NetApp's NFS v2 over UDP and their NFS v3 over TCP numbers. Assuming that ratio approximately translates to the F840s (ONTAP and WAFL are the same for 5.3.x, for instance), this magic 16-node filer cluster has 250,000 ops/s NFS v2 over UDP, which I figure is about 112,500 ops/s NFS v3 over TCP. That's about the same as EMC's Celerra with 14 datamovers. 14 datamovers or 16 filer heads...about the same performance within 10-15%.
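If you want to check my math, here is the back-of-the-envelope calculation spelled out; the 45% factor is, again, my own rough estimate from the published F760 numbers, not anything either vendor has stated:

```python
# Back-of-the-envelope estimate of NetApp's NFS v3/TCP throughput from the
# published NFS v2/UDP figure. The 0.45 scaling factor is a rough estimate
# derived from the F760 results, not a published number.

v2_udp_ops = 250000      # 16-node F840 "cluster", NFS v2 over UDP (published)
v2_to_v3_factor = 0.45   # assumed v2/UDP -> v3/TCP scaling, from F760 results

estimated_v3_tcp = v2_udp_ops * v2_to_v3_factor
print(f"Estimated NFS v3/TCP: {estimated_v3_tcp:,.0f} ops/s")
# -> ~112,500 ops/s, in the same ballpark as EMC's 14-datamover Celerra
#    (which posted a bit over 100,000 ops/s, NFS v3 over TCP).
```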
[2] And, to preempt the inevitable questions, yes, the Celerra is not running in mirrored mode like most people would, and yes, it has more than one Symmetrix behind it. And, yes, the NetApp disables snapshots, and, yes, the filer has checksum blocks off, and yes, they minimize read-ahead (default values are all the opposite). The devil is in the footnotes...
[3] EMC: How about a mirrored configuration on one Symmetrix with, say, 8 datamovers, one being an active failover? NetApp: How about an out-of-the-box default clustered pair configuration? To both: How about NFS v3 over TCP (hard) and NFS v2 over UDP (easy), to see the *range* of your respective boxes?
Until next time...
The Mathworks, Inc.                          508-647-7000 x7792
3 Apple Hill Drive, Natick, MA 01760-2098    508-647-7001 FAX
tmerrill@mathworks.com                       http://www.mathworks.com
----- Original Message -----
From: "Todd C. Merrill" tmerrill@mathworks.com
To: toasters@mathworks.com
Sent: Tuesday, November 07, 2000 8:27 AM
Subject: Re: 250K NFS ops/sec, eh? (warning...long and ranting)
NetApp and EMC have to stop staging their own arms race with these SPEC numbers. The Celerra was designed to hold 14 datamovers (N+1 failover) and it is fair to test it as such. Filers were designed to be clustered in pairs, and it is fair to test them as such. It is not fair, in my opinion, to test "clusters" by stringing N of each of them together. The limit on scaling in such scenarios then becomes, effectively, infinite, which is meaningless.
I disagree. The performance numbers have nothing to do with failover functionality, so there's no reason to limit "one configuration" to one that has failover. That logic would rule out a lot of single-server configurations entirely.
By the same token, the fact that you can stick 14 datamovers in one big box is also irrelevant. What matters is what performance you can actually get out of a given configuration, not what you can fit in one box.
I agree, though, that it becomes silly since one can effectively keep adding more and more servers and scale upward. So how does one *really* compare? What you need to do is look "under the numbers" and compare how many ops each one does with a given amount of hardware. And then figure out $. The SFS97 test doesn't give this, but I hope in the future it *will* figure ops/$ (at list) and ops/disk and so on.
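Something along these lines is all it would take; the dollar and disk figures below are placeholders for illustration, not anything either vendor has published:

```python
# Normalize each published SFS97 result by list price and by disk count to
# get ops/$ and ops/disk. Every figure below is a placeholder; plug in real
# quotes and real configurations.

configs = [
    # (name, SFS97 ops/s, list price in $, number of disks)
    ("vendor A cluster", 250000,  5_000_000,  500),
    ("vendor B cluster", 100000, 10_000_000, 1000),
]

for name, ops, price, disks in configs:
    print(f"{name}: {ops / price:.4f} ops/$   {ops / disks:.0f} ops/disk")
```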
EMC gained over 100,000 ops/s with their Celerra the hard way: NFS v3 over TCP. NetApp gained their numbers the easy way: NFS v2 over UDP. My plea to NetApp is: please publish NFS v3 over TCP numbers, so we customers can make a fair comparison. You used to do this with the F760s. Or, conversely, to EMC: please publish NFS v2 over UDP. Then we can see who has the bigger di...<nevermind>. [2]
I agree that full data needs to be made available on both, and I'm sure we'll get there. However, knowing NTAP's "scaling factor" between v2 and v3 and UDP and TCP in the past, you can make a reasonable guess at NTAP's performance.
Anyone who takes these benchmark numbers at face value is a fool and deserves to have their money taken away from them by either vendor, for not doing their homework.
My challenge to BOTH vendors is: in addition to "maximum throughput" configurations and numbers, please publish SPEC numbers in REAL-LIFE configurations, configurations that customers actually use. [2,3] (Note the plurality.)
The F760 configuration is certainly a real life configuration. The Celerra configuration, however, is not. The monster would cost $20M list; even with heavy discounts you're talking $5-$10M, for a configuration without RAID protection, with tons of "wasted" disk space.
In the meantime, stop the foolishness, boys.
[unreferenced footnote] Sorry, folks, for the rant, but this really pushes my buttons. As some of you may know, The Mathworks now has NetApp filers *and* an EMC Celerra/Symmetrix. We went through all this number bullshit for months with both vendors [1], so I hate to see this foolishness again. When you dive into the numbers, both vendors' CPU units (filer heads or datamovers) are "comparable." Sometimes one is a bit ahead of the other, sometimes vice versa. But, they are approximately the same when you are able to sift through the numbers and compare apples to apples.
Umm, I think you've been misled by EMC then. They do *not* have the same number of CPUs. You see, the Celerra DMs all have one CPU each, but the configuration quoted by EMC uses 6 Symmetrix Model 8430 frames, and each Symmetrix has 6 Channel Directors, and each Channel Director has 2 CPUs. These Channel Directors move and cache disk blocks the same way NTAP's CPU does. They are *not* SCSI cards... EMC has Disk/Storage Directors for that.
Anyway, if number of heads still bothers you, just remember this was the F760. With the F840, the number of heads needed would be reduced significantly.
If there were a clear winner in the strict ops/s game, then everyone would buy from that vendor if all they needed were ops/s. Luckily for us customers, there is healthy competition, which gives us what we need: better performance year after year. And, one more thing: good buying decisions are rarely as one-sided as choosing one vendor for one specification. Look at the whole picture: performance, scalability, reliability, ease of use, service, in-house knowledge/experience, etc.
I agree, but Netapp *is* the clear winner in the strict ops/s game. The reason everyone doesn't buy Netapp is the other reasons you mention.
[1] With the numbers published so far, for F760s, for instance, we can see there is, approximately and conservatively, a 45% scaling factor between NetApp's NFS v2 over UDP and their NFS v3 over TCP numbers. Assuming that ratio approximately translates to the F840s (ONTAP and WAFL are the same for 5.3.x, for instance), this magic 16-node filer cluster has 250,000 ops/s NFS v2 over UDP, which I figure is about 112,500 ops/s NFS v3 over TCP. That's about the same as EMC's Celerra with 14 datamovers. 14 datamovers or 16 filer heads...about the same performance within 10-15%.
Again, EMC uses a lot more processors than just 14; it uses 84. Plus it's not using RAID, uses a ton more disk, etc. And the F840 would take even fewer CPUs.
[2] And, to preempt the inevitable questions, yes, the Celerra is not running in mirrored mode like most people would, and yes, it has more than one Symmetrix behind it. And, yes, the NetApp disables snapshots, and, yes, the filer has checksum blocks off, and yes, they minimize read-ahead (default values are all the opposite). The devil is in the footnotes...
Then why did you claim they were about the same, when the details show they aren't? The NTAP stuff is minor and only accounts for a few % overall and puts NTAP on equal footing with the competition; the EMC stuff is a bunch of extra hardware that *you* have to pay for and which shows their configuration is far less efficient.
[3] EMC: How about a mirrored configuration on one Symmetrix with, say, 8 datamovers, one being an active failover? NetApp: How about an out-of-the-box default clustered pair configuration? To both: How about NFS v3 over TCP (hard) and NFS v2 over UDP (easy), to see the *range* of your respective boxes?
Netapp will eventually produce all the different protocol numbers, I'm sure. "Out of the box" configurations I would think are unlikely, since it would mean spending a lot of testing resources on them for very little reward, and could even be confusing. EMC would certainly always quote the numbers that made NTAP look the worst, and not every customer is as savvy as you are to look "under the hood". (And even your looking under the hood seems to have left you with a faulty notion of EMC relative to NTAP.)
Bruce
On Tue, 7 Nov 2000, Bruce Sterling Woodcock wrote:
Umm, I think you've been misled by EMC then. They do *not* have the same number of CPUs. You see, the Celerra DMs all have one CPU
[...]
Anyway, if number of heads still bothers you, just remember this was the F760. With the F840, the number of heads needed would be reduced significantly.
[...]
I agree, but Netapp *is* the clear winner in the strict ops/s game. The reason everyone doesn't buy Netapp is the other reasons you mention.
[...]
EMC has not misled us, despite their efforts to the contrary (nor has NetApp, to be fair to both vendors). ;) I have no doubt, and many others would agree as well, that NetApp's architecture is much more elegant than EMC's. What I was trying to emphasize were the "CPU" units that actually push data onto the network. For NetApp, these are filer heads; for EMC, these are datamovers. You are 100% correct that the disk storage is much more complicated, powerful, and expensive on the EMC side, but it is also more robust (built-in monitoring, battery backup, failover, ability to local attach, etc.).
Then why did you claim they were about the same, when the details show they aren't? The NTAP stuff is minor and only accounts for a few %
[...]
Netapp will eventually produce all the different protocol numbers, I'm sure.
Actually, I didn't view "all results" at the SPEC page; I see NetApp has published some TCP numbers. Hoo-ray!
Let's compare F840 TCP NFS v3 (7,783 ops/s) with Celerra 507 DataMover 2 CPU (they don't list a single node) at 15,723 ops/s. Giving NetApp the benefit of the doubt of perfect linearity, a clustered pair of F840s would then be 15,566 ops/s. Damn, that's close. And, that's one main point I was trying to make.
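Spelled out, with perfect linearity as the stated (and generous) assumption:

```python
# Per-head comparison, assuming perfect linearity for the F840 pair
# (a real cluster will fall somewhat short of 2x).

f840_v3_tcp = 7783    # single F840, NFS v3 over TCP (published)
celerra_507 = 15723   # Celerra 507 DataMover, 2 CPU (published)

pair_estimate = 2 * f840_v3_tcp
gap = (celerra_507 - pair_estimate) / pair_estimate
print(f"F840 pair: {pair_estimate:,} ops/s vs Celerra 507: {celerra_507:,} ops/s "
      f"({gap:.1%} apart)")
# -> 15,566 vs 15,723 -- about 1% apart.
```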
One issue we had to deal with, for both NetApp and EMC, is scaling. For small systems, the Celerra is prohibitively expensive. You have to buy a gigantic Symmetrix and a Celerra frame, which is tons of money, and you haven't even put a single DataMover in it yet! Their y-intercept on an x-y graph of $ vs. performance is high. Yet, incremental additions (the slope) of datamovers can be much less expensive than a similarly licensed NetApp head.
With NetApp, to add a redundant (i.e., clustered) head (trying to keep the comparison half-way fair, since EMC has N+1 failover), you have to add them in pairs. Their intercept is zero (no money for no heads and no disk), yet their slope can be very high. Depending on discounts from each vendor (does anyone pay list price?!), you can imagine a situation in which the two vendors' lines *cross* at some number of datamovers/filers and the same amount of disk; that is, adding another pair of clustered F840s and some disk eventually becomes more expensive overall than just adding another datamover and the same amount of disk.
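To make that picture concrete, here is a toy version of the $ vs. performance graph; every price in it is invented purely for illustration, so substitute your own quoted numbers:

```python
# Toy $-vs-capacity model for the crossover argument above. Every price
# here is invented for illustration; substitute your own quotes.
# Disk cost is assumed equal on both sides and omitted.

CELERRA_FRAME = 1_000_000   # Symmetrix + Celerra frame, before any datamovers (assumed)
PER_DATAMOVER = 50_000      # each additional datamover (assumed)
PER_F840_PAIR = 250_000     # each additional clustered F840 pair, licensed (assumed)

def celerra_cost(n):
    """Frame plus n incremental datamovers."""
    return CELERRA_FRAME + n * PER_DATAMOVER

def netapp_cost(n):
    """n incremental clustered F840 pairs; no big up-front frame."""
    return n * PER_F840_PAIR

for n in range(1, 9):
    c, f = celerra_cost(n), netapp_cost(n)
    note = "  <-- the lines have crossed" if f > c else ""
    print(f"{n} increment(s): Celerra ${c:,}   NetApp ${f:,}{note}")
```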
These "other" issues touch on the complexity of the architecture and the specific application, for us and for each customer. Performance is "about the same" so concentrate on the other issues. Another main point I was trying to make.
"Out of the box" configurations I would think are unlikely, since it would mean spending a lot of testing resources on them for very little reward, and could even be confusing.
I think, to most customers less savvy than those on this list, the "big" numbers are confusing and misleading. Can "Volvo" really out-slalom "BMW"? Volvo says they can in their commercials. That's what your average customer is going to hear from marketing, and what both EMC and NetApp are doing with the big numbers. However, it *is* misleading (and illegal in some European countries). Which Volvo? Which BMW? By how much? Internal combustion technology in most commercial cars is roughly the same. What differs is the number of cylinders, turbocharging, intercooling, the ECM chip, etc.
Providing more configurations would allow the customer to find the configuration closest to what they need, and hence get an accurate indication of how well either box would do for them.
And, would it not require less work to grab a filer, install ONTAP, keep the defaults, and run the SPEC program, rather than tweaking it to juice every last op out of the box? (More work, sure, in running the SPEC program a *second* time...)
EMC would certainly always quote the numbers that made NTAP look the worst,
And, NTAP has fired the last volley with the 16-node non-failover "cluster." All I'm saying is let's stop the dick waving to see whose is the biggest.
Let's hear some newsworthy news on dual-CPU filer heads (I see that extra plugged socket in there...), what's next in clustering, etc.
Until next time...
The Mathworks, Inc.                          508-647-7000 x7792
3 Apple Hill Drive, Natick, MA 01760-2098    508-647-7001 FAX
tmerrill@mathworks.com                       http://www.mathworks.com
----- Original Message -----
From: "Todd C. Merrill" tmerrill@mathworks.com
To: toasters@mathworks.com
Sent: Tuesday, November 07, 2000 11:01 AM
Subject: Re: 250K NFS ops/sec, eh? (warning...long and ranting)
What I was trying to emphasize were the "CPU" units that actually push data onto the network. For NetApp, these are filer heads; for EMC, these are datamovers.
The EMC Channel Directors also actually push data.
Actually, I didn't view "all results" at the SPEC page; I see NetApp has published some TCP numbers. Hoo-ray!
With that F760 config? I hadn't looked recently.
Let's compare F840 TCP NFS v3 (7,783 ops/s) with Celerra 507 DataMover 2 CPU (they don't list a single node) at 15,723 ops/s. Giving NetApp the benefit of the doubt of perfect linearity, a clustered pair of F840s would then be 15,566 ops/s. Damn, that's close. And, that's one main point I was trying to make.
Except it isn't close... the Celerra 507 probably has a Symmetrix on the back end, with more CPUs, more disks, and probably lacking RAID. I'm just guessing here; I haven't looked at the configuration in question, but that is typical for EMC.
Depending on discounts from each vendor (does anyone pay list price?!), you can imagine a situation in which the two vendors' lines *cross* at some number of datamovers/filers and the same amount of disk; that is, adding another pair of clustered F840s and some disk eventually becomes more expensive overall than just adding another datamover and the same amount of disk.
Yes, but when you factor in the up-front cost of the Celerra, plus the total cost of ownership, the Netapp solution will always be cheaper. Especially if you are comparing against a Celerra with actual RAID protection.
These "other" issues touch on the complexity of the architecture and the specific application, for us and for each customer. Performance is "about the same" so concentrate on the other issues. Another main point I was trying to make.
And the point is wrong. Performance is far superior in the Netapp case for the same $. If "other" issues override that, so be it.
Providing more configurations would allow the customer to find the configuration closest to what they need, and hence get an accurate indication of how well either box would do for them.
I agree, but providing every possible configuration is probably cost-prohibitive. While altruism is a nice virtue, I think that, from a corporate perspective, if configurations A and B are sufficient to make the sale to most customers, testing and publishing configuration C might not be cost-effective.
And, would it not require less work to grab a filer, install ONTAP, keep the defaults, and run the SPEC program, rather than tweaking it to juice every last op out of the box? (More work, sure, in running the SPEC program a *second* time...)
Right, you'd have to run it a second time. So, if, from this point forward, you *always* ran it with the default config, sure, you'd save a little work, but presumably your reduced numbers could cost you sales when all the other vendors *are* tweaking. Having the best number *must* mean something, or else corporations wouldn't spend hard-earned dollars on it.
There is also a benefit for internal evaluation. You want to measure how much better a given software release is than a previous one, or one hardware platform than another, for your own evaluation purposes. If the new code has new features that are not turned off, then you will not have a fair 1:1 comparison to see if you *actually* managed to squeeze more performance out of your new software version. By using a stable baseline that doesn't change as new features are added, you can more accurately measure incremental improvements in performance.
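In other words, the frozen baseline acts like a regression test; a trivial sketch of the bookkeeping (release names and ops/s figures invented for illustration):

```python
# Track SFS97 results on one frozen baseline configuration across software
# releases, so the delta measures the software alone. Names and numbers
# below are invented for illustration.

baseline_results = {
    "release N":     17000,   # ops/s on the frozen baseline config
    "release N + 1": 17900,
}

old = baseline_results["release N"]
new = baseline_results["release N + 1"]
print(f"Change on identical config: {(new - old) / old:+.1%}")
```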
EMC would certainly always quote the numbers that made NTAP look the worst,
And, NTAP has fired the last volley with the 16-node non-failover "cluster." All I'm saying is let's stop the dick waving to see whose is the biggest.
If NTAP and EMC stop, don't you think someone like SUN would step in? Sure, maybe you're savvy enough not to buy SUN if they do that, but many customers are not.
I think the best solution is to have future benchmarks provide ratios of ops/$ and ops/disk and so on. Of course, these numbers will not scale linearly, so I suppose there is a risk that each company will "push" the numbers of their low-end configuration more in order to make them look better, and this can also lead to unrealistic numbers. But at least the vendors can't just keep adding more and more servers to make themselves look better.
So, what did you finally decide with Mathworks?
Bruce
On Tue, 7 Nov 2000, Bruce Sterling Woodcock wrote:
So, what did you finally decide with Mathworks?
This year we invested heavily in infrastructure and have bought a single F840 and a clustered pair of F840s, plus a 3-datamover Celerra with a 3930 Symmetrix.
Some of our pre-sales meetings went like this (not an actual transcript, but humorously embellished and pretty accurate at the core):
EMC: NetApp sucks. Buy our stuff.
NetApp: EMC sucks. Buy our stuff.
Us: Okay, gentlemen, please stop with the name calling. Concentrate on your own products' strengths.
EMC: We're the fastest. We're the best. Buy our stuff.
NetApp: We're the fastest. We're the best. Buy our stuff.
Us: <sigh>
After months of this, I hope this helps people to understand my itchy trigger finger. ;)
Until next time...
The Mathworks, Inc.                          508-647-7000 x7792
3 Apple Hill Drive, Natick, MA 01760-2098    508-647-7001 FAX
tmerrill@mathworks.com                       http://www.mathworks.com