Well, from what I gather in searching for TR-3838, it’s not publicly available, so no, I hadn’t even heard of it until now 8-)

 

I’d love to see it, though.

 

AFAIK, the closest the main documentation gets to this subject is in the Storage Management Guide (p. 108 in the 7.3 docs) where it says:

 

Large RAID group configurations offer the following advantages:

• More data drives available. An aggregate configured into a few large RAID groups requires fewer drives reserved for parity than that same aggregate configured into many small RAID groups.

• Small improvement in storage system performance. Write operations are generally faster with larger RAID groups than with smaller RAID groups.

 

Small RAID group configurations offer the following advantages:

• Shorter disk reconstruction times. In case of disk failure within a small RAID group, data reconstruction time is usually shorter than it would be within a large RAID group.

• Decreased risk of data loss due to multiple disk failures. The probability of data loss through double-disk failure within a RAID4 group or through triple-disk failure within a RAID-DP group is lower within a small RAID group than within a large RAID group.

 

But there’s not even a hint of performance data.
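
For what it's worth, the parity half of that tradeoff is plain arithmetic, so it can at least be tabulated without any benchmark data. Below is a rough Python sketch, not anything taken from the NetApp docs. It assumes RAID4 reserves one parity disk per RAID group and RAID-DP reserves two, that groups are filled to the configured size with any remainder forming a last, smaller group, and that hot spares are counted separately. The function name data_drives and its parameters are made up for illustration. It says nothing about performance.

```python
import math

# Parity disks reserved per RAID group (RAID4: 1, RAID-DP: 2).
PARITY_PER_GROUP = {"raid4": 1, "raid_dp": 2}


def data_drives(total_disks, raid_group_size, raid_type="raid_dp"):
    """Return (groups, parity_disks, data_disks) for one aggregate.

    Assumption: disks fill groups of raid_group_size in order, and any
    remainder forms one final, smaller group. Hot spares are not counted.
    """
    parity = PARITY_PER_GROUP[raid_type]
    if raid_group_size <= parity:
        raise ValueError("RAID group size leaves no room for data disks")

    # A leftover group must still hold at least one data disk.
    remainder = total_disks % raid_group_size
    if 0 < remainder <= parity:
        raise ValueError("last RAID group would contain no data disks")

    groups = math.ceil(total_disks / raid_group_size)
    parity_disks = groups * parity
    return groups, parity_disks, total_disks - parity_disks


# Example: the same 16 disks carved into RAID-DP groups of different sizes.
for size in (4, 8, 16):
    groups, parity_disks, data = data_drives(16, size)
    print(f"rg size {size:2d}: {groups} group(s), "
          f"{parity_disks} parity, {data} data")
```

With 16 disks in RAID-DP, that works out to 8, 12 and 14 data drives for group sizes of 4, 8 and 16, which is the "more data drives available" point in numbers.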

 

Poking around a little more now, I see that a portion of TR-3838 that includes the text you quoted (“The previous approach to RAID group and aggregate sizing…”) does appear near the end of a long NetApp Community post on this topic (at http://communities.netapp.com/thread/1587 ). But I’m not sure that what was posted includes all the pertinent info from TR-3838. I did recently read that whole thread, by the way, and, like the TR-3838 section you describe, it doesn’t have a definitive statement or definitive performance data.

 

Thanks for the tip, though. I’m another step closer now 8-)

R

 

 

From: Eugene Vilensky [mailto:evilensky@gmail.com]
Sent: Sunday, October 30, 2011 6:27 PM
To: Cotton, Randall
Cc: toasters@teaparty.net
Subject: Re: Is there a graph somewhere of performance vs raid group size?

 

Randall,

 

have you seen TR-3838, the Storage Subsystem Configuration Guide?  It doesn't have a definitive statement, but it does have a section that starts with the following caveats and describes some reasons for creating larger RAID groups than the prior default:

 

"4.6 RAID GROUP SIZING

The previous approach to RAID group and aggregate sizing was to use the default RAID group size. This no longer applies, because the breadth of storage configurations being addressed by NetApp products is more comprehensive than it was when the original sizing approach was determined."  

Cheers,

Eugene

 

On Sun, Oct 30, 2011 at 4:23 PM, Cotton, Randall <recotton@uif.uillinois.edu> wrote:

We’re a relatively small shop just getting started with NetApp (1 2020 and 2 2040’s all active/active, plus 3 4342’s for the 2040’s).

 

Since committing a disk to a raid group via an aggregate is essentially a permanent thing (you can’t change your mind and later shrink an aggregate to pull out a disk from a raid group to use with another node), we’d prefer not to put all our (80) disks in aggregates just yet. We might want more in some nodes and less in other nodes as future needs come into clearer focus. In addition, it’s clear that we can later easily add in any disks we’ve held back as needed and use the reallocate command with the -f option to re-optimize the layout to accommodate the added disks efficiently. So it seems a bit short-sighted to configure all 80 of our disks into aggregates among our 6 nodes right from the get-go (minus the requisite hot spares, of course), as our vendor would have us do.

 

But in deciding how small to start out with, we don’t want to cripple our performance too much. I’ve looked long and hard on the net to find some data, any data, on DOT 7.x performance vs raid group size, but have come up empty. I understand that performance should be awful and unacceptable if you have a raid group of size 3. I also understand from anecdotal evidence that performance improvements from higher raid group sizes apparently are not significant once you get to a raid group size of 16 or so.

 

But what about in between? How does a graph of performance vs raid size look from 3 to, say, 20? Just ballpark data on any type of remotely typical workload would help a lot to start with. Has anyone ever seen or tried to compile this kind of data? Using iometer, perhaps, or any other benchmarking tool? RAID-DP data is preferable, but I’d take RAID-4 data if that’s all I can get.

 

Don’t really have the time to do testing myself.

 

Thanks

Randall Cotton

University of Illinois Foundation


_______________________________________________
Toasters mailing list
Toasters@teaparty.net
http://www.teaparty.net/mailman/listinfo/toasters