----- Original Message -----
From: mark <mds@gbnet.net>
To: toasters@mathworks.com
Sent: Wednesday, February 23, 2000 5:04 PM
Subject: Re: Disk drive technology

> On Wed 23 Feb, 2000, "Bruce Sterling Woodcock" <sirbruce@ix.netcom.com> wrote:
> > The article doesn't say, but bigger, faster disk drives often draw more
> > power. Thus, NetApp needs to have shelves that can support them first,
> > before they can test them for reliability. There are a lot of issues
> > involving mechanical vibration and so on that have to be assessed before
> > such a drive could be qualified. It may be that this particular drive is
> > not appropriate for certain shelves.
> Well, yeah, so much is obvious. So how soon was that? 8)
Probably not until late this year or next.
> > More spindles means better performance, if you are read-heavy. If your
> > filer seems slow and it's reading constantly but the CPU is still under
> > 100%, more spindles could help.
> So much for conventional wisdom. However, my question was trying to elicit something a little more, well, complete as a model of performance.
The complete model depends entirely on your filer model, the OS version, exactly what disks you have, how much memory, what your op mix is, etc.
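
Just to show how crude even a "simple" model has to be, here is a back-of-envelope sketch of the more-spindles-for-reads rule of thumb above, in Python. The per-spindle figure is a number I'm assuming purely for illustration, not a Netapp spec, and it ignores caching, RAID parity work, and everything else:

# Back-of-envelope check of the "more spindles helps reads" rule of thumb.
# ASSUMPTION: a drive of this era handles very roughly 100 random reads/sec;
# the real figure depends on the drive, the seek pattern, and the cache.
ASSUMED_READ_OPS_PER_SPINDLE = 100.0

def spindle_utilization(cache_miss_reads_per_sec, data_spindles,
                        ops_per_spindle=ASSUMED_READ_OPS_PER_SPINDLE):
    """Rough fraction of the disks' random-read capacity in use."""
    return cache_miss_reads_per_sec / (data_spindles * ops_per_spindle)

# Hypothetical example: 1500 cache-miss reads/sec spread over 14 data disks.
print(round(spindle_utilization(1500, 14), 2))   # -> 1.07

If that number is near or above 1.0 while the CPU still has headroom, more spindles are at least a plausible next step; if it's well under 1.0, they probably aren't your problem.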
> I mean FCAL is a loop technology, and filers can come with 1, 2 or 3 loops.
> How much does performance differ between 1 chain and 3 balanced chains, with
> which drives, what RAID set sizes, and what volume mappings? How does
> performance get affected by the size of the disks you use?
>
> See, if none of these things makes any difference to performance, I'd love
> to know why. If they *do*, then I'd love to know how to build faster filers.
They make very little difference, on the order of 10% for most people. As to WHY, it's because Netapp finely balances their filers for maximum performance at typical op mixes. They are not like a competitor that throws a bunch of hardware at the problem and expects you to juggle it to figure out what works. Netapp wants to deliver a simple solution.
There are some performance papers on Netapp's web site that address some of your questions, and people are free to post their personal experiences here, but what works for one environment may not work for another. I would be reluctant to tell you to go out and buy more drives you don't need if it won't help you.
> I'd be happy with an analytical model, or a set of tables taken from either real lab testing with real filers or simulator results. As it stands, though, all we have are rules of thumb that sound plausible.
Given that those tables would constantly change with the variables (drives, memory, OS, etc.), rules of thumb are probably the best you can do, unless you have a few million to spend on testing every possible configuration and reporting the results. :)
> The same goes for backup performance. There's been so much discussion on the list about both topics, but I can't honestly, hand on heart, tell someone how best to set up their filers. I'd write something if I knew the answers myself.
There is no best. There is only what is best for your environment.
> I've been meaning to brush up on queueing theory, so maybe I *will* try to
> work something up myself on this. I'd just hate to reinvent the wheel if
> someone's done it already. I also don't know anything about the buffers in
> the filer, on the FCAL interfaces, or on the network interfaces, or about
> the precise workings of WAFL, so my model might be under-informed.
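
If you do work something up, the obvious starting point is a plain M/M/c queue with the data spindles as the servers. Here is a minimal sketch of that; the 2000 ops/sec load and 120 ops/sec per spindle below are made-up numbers, and real filer traffic is neither Poisson nor exponential, so treat it strictly as a toy:

import math

def mmc_response_time(lam, mu, c):
    """Mean response time (queueing + service) for an M/M/c queue.
    lam = arrival rate (ops/sec), mu = per-spindle service rate, c = spindles."""
    a = lam / mu                      # offered load in Erlangs
    rho = a / c                       # per-spindle utilization
    if rho >= 1.0:
        return None                   # saturated: the queue grows without bound
    # Erlang C: probability an arriving op finds all spindles busy
    busy = a**c / (math.factorial(c) * (1.0 - rho))
    p_wait = busy / (sum(a**k / math.factorial(k) for k in range(c)) + busy)
    wq = p_wait / (c * mu - lam)      # mean time spent waiting for a spindle
    return wq + 1.0 / mu              # plus one service time

# Made-up example: 2000 random reads/sec over 20 vs. 40 spindles,
# each spindle assumed to handle ~120 ops/sec.
for spindles in (20, 40):
    t = mmc_response_time(2000.0, 120.0, spindles)
    print(spindles, "spindles:", "saturated" if t is None else "%.1f ms" % (t * 1000))

Even in the toy version, once per-spindle utilization is comfortably below 1, doubling the spindle count barely moves the response time, which is why this kind of change tends to show up as a small percentage rather than something dramatic.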
> Here's a thought to mull over: if you have 20 18GB drives on 2 loops and
> 16MB of NVRAM, will you see better or worse performance than with 40 9GB
> drives on 1 loop and 16MB of NVRAM, presuming your clients are all writing
> as fast as they are allowed?
I would guess slightly better, since the 18GB drives are probably faster and you have 2 loops to share the load. However, since the bottleneck will be the NVRAM, I doubt there will be much of a noticeable difference.
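
To put some numbers on that guess, just for illustration: take the smallest of the candidate bottlenecks and both configurations land on the NVRAM limit. Every constant below is an assumption of mine (I don't know the real per-drive rates or how fast 16MB of NVRAM can be absorbed and flushed), so the point is the shape of the comparison, not the figures:

# All constants are assumed for illustration only.
FCAL_LOOP_MB_S   = 100.0   # ~1 Gbit/s FC-AL loop, call it 100 MB/s usable
DRIVE_18GB_MB_S  = 18.0    # assumed sustained write rate of one 18GB drive
DRIVE_9GB_MB_S   = 12.0    # assumed (older, slower) 9GB drive
NVRAM_LIMIT_MB_S = 30.0    # assumed rate at which 16MB of NVRAM can absorb
                           # and flush client writes; pure guesswork

def max_write_mb_s(drives, per_drive_mb_s, loops):
    """Sustained write throughput is capped by the slowest of the three."""
    return min(drives * per_drive_mb_s,      # aggregate disk bandwidth
               loops * FCAL_LOOP_MB_S,       # aggregate loop bandwidth
               NVRAM_LIMIT_MB_S)             # NVRAM absorb/flush rate

print("20 x 18GB on 2 loops:", max_write_mb_s(20, DRIVE_18GB_MB_S, 2), "MB/s")
print("40 x 9GB  on 1 loop :", max_write_mb_s(40, DRIVE_9GB_MB_S, 1), "MB/s")
# Both print 30.0: with these assumptions the NVRAM is the limit either way.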
> Please disregard any "such configs aren't shipped" objections, as this is an in-principle hypothetical. Please do consider the effect of RAID sets, and of incremental disk additions over the lifetime of the filer in question.
I can't consider them if you don't say what they are.
> I picked writing as the task because I also wanted to point out that there's an impact on the DOT algorithms, based on the spindles, the RAID set sizes, and maybe the chains too.
Agreed.
> I don't know the answer, but I'd like to. Without such models we all have to purchase filers somewhat in the dark, and either pray that we got it right or spend more time than we can afford evaluating our purchases for suitability.
You can't know if you got it right unless you have an exact simulation of your particular ops mix and traffic patterns. Since you don't have that, you basically have to use rules of thumb, guess, and adjust as needed. Your environment changes over time, too, so what works one day may not work another.
> One question - is anyone else interested in this or am I just shooting my mouth off needlessly?
Maybe it's just me, but while I love having numbers and data, I've always found such tuning tasks to be far more intuitive and situational. I mean, if I have a chart that says such-and-such filer with such-and-such configuration can handle 200 users, and the filer in my environment overloads at 100 users, I'm not gonna keep adding users. Conversely, if it runs at 50% and high cache age with 200 users, I'm not going to worry about adding more.
If a filer is overloaded with reads, and I haven't maxed the RAM yet, I'm probably going to max the RAM first. Then I'll worry about adding more disks, and if that doesn't help, I'll know I need to reduce traffic and get another filer. There are some minor variations to this, of course, but I'm not going to waste a lot of time beforehand trying to predict exactly how many disks and how much memory I need when the reality can quickly change. Estimate, yes, but I won't follow a strict chart.
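
Written down just to make the order explicit (the conditions are deliberately simplistic and the thresholds invented), that amounts to something like:

def next_step(read_heavy, cpu_pct, ram_maxed, added_spindles):
    """Rough tuning order for an overloaded filer, per the paragraph above."""
    if not read_heavy or cpu_pct >= 100:
        return "not a read/spindle problem; look elsewhere"
    if not ram_maxed:
        return "max out the RAM first"
    if not added_spindles:
        return "then try adding spindles"
    return "reduce traffic or get another filer"

print(next_step(read_heavy=True, cpu_pct=70, ram_maxed=False, added_spindles=False))
# -> "max out the RAM first"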
Bruce