On Tue, 1 Nov 2011, Jeff Mohler wrote:
"Part of the issue here may be theoretical vs. real world. After all, in theory 'theory' and 'practice' are the same, but in practice they aren't."
But..they -are-.
Not in my experience, but certainly your mileage may vary.
I made those statements, and I will stand by them as a relatively knowledgeable person on these matters.
Build yourself a system in your lab (whoever is able) and watch via statit the distribution of IO across the aggregate FAR above the raid-group level.
20+2 will be within all reasonable margins of 2(10+2), or 4(5+2).
Ah, but on this I agree, and have all along. I don't think there's going to be a large difference between an aggr made of 24 disk rgs, 16 disk rgs, or 8 disk rgs.
However, if you *mix* rg sizes within the aggr, you are causing the filer to do more work.
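
For reference, the spindle arithmetic behind the three layouts mentioned above can be sketched as follows. This is only a count of data and parity disks, assuming two parity disks per raid group as the "+2" notation suggests; the helper name layout_summary is hypothetical, and the numbers say nothing about measured performance.

```python
def layout_summary(groups, data_per_rg, parity_per_rg=2):
    """Count data and parity spindles for `groups` identical raid groups.

    parity_per_rg=2 is an assumption taken from the "+2" in the layouts.
    """
    data = groups * data_per_rg
    parity = groups * parity_per_rg
    total = data + parity
    return {
        "data": data,
        "parity": parity,
        "total": total,
        "parity_fraction": parity / total,
    }


# (number of raid groups, data disks per raid group)
layouts = {"20+2": (1, 20), "2(10+2)": (2, 10), "4(5+2)": (4, 5)}
summaries = {name: layout_summary(g, d) for name, (g, d) in layouts.items()}

for name, s in summaries.items():
    print(
        f"{name:8s} data={s['data']} parity={s['parity']} "
        f"total={s['total']} parity_fraction={s['parity_fraction']:.2f}"
    )
```

All three layouts have the same number of data spindles; what changes is how many disks go to parity.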
"The other problem is that moving forward if you have optimized your writes for the 16 disk rg stripe size, some rgs are going to be optimized and some aren't, which could conceivably affect performance."
I'm trying to understand this statement as well. Writes are optimized for the full stripe of the aggregate. RG size, again, has nothing to do with it as a "stripe".
Raid groups are zones of data protection.
Yes, rgs are zones of data protection (really they are spindle protection in a NetApp -- basically the same thing, but again, theory vs. practice).
However, as Davin pointed out, there are different stripes in play here. RAID-4 and RAID-DP, by their nature, have stripes. They have to in order to calculate the parity data. So each rg has a stripe width.
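
To illustrate why parity implies a stripe, here is a minimal sketch of row parity computed by XOR across the data blocks of one stripe, which is the basic idea behind RAID-4. It does not model RAID-DP's second (diagonal) parity or anything specific to a filer; the block contents and the helper name xor_blocks are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: one block from each of three hypothetical data disks.
stripe = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xff\x00\xff\x00"]

# The parity block depends on every data block in the stripe.
parity = xor_blocks(stripe)

# Simulate losing the second data disk and rebuilding its block
# from the surviving data blocks plus parity.
survivors = [stripe[0], stripe[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == stripe[1])
```

Because the parity block is a function of every data block in the row, the number of data disks in the rg fixes the width of that row.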
An aggregate is, at the most basic level, just raid-0. We usually use the term "wide stripe" to talk about a stripe of stripes. We've created protection at the raid group level, and now we're gluing those raid groups together to take advantage of better i/o in a large number of spindles. That's why I create as large an aggregate as I can and let the filer figure out where the hot spots are. It also minimizes wasted space.
Anyway, that wide stripe works best when all of the calculations are the same. However, if you use different-sized raid groups within the aggregate, the filer has to do more work. It may be minimal work, and the filer can handle it, but these things start to add up if you're looking for performance. It may be that a 4 disk rg performs only a few percent slower than a 16 disk rg. Now combine that with a 12 disk and 6 disk rg and add them all to an aggregate where the filer has to do a few percent more calculations to do the wide striping properly...
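
Taking the argument above at face value, the difference between uniform and mixed raid groups can be sketched by listing the data-disk width of each rg in an aggregate. This assumes two parity disks per rg (RAID-DP), which the rg sizes above do not specify, and it is a model of the claim only, not of how the filer actually lays out writes. The helper name data_widths and the three-rg uniform aggregate are illustrative.

```python
def data_widths(rg_sizes, parity_per_rg=2):
    """Data disks per raid group, given total disks per raid group.

    parity_per_rg=2 assumes RAID-DP; use 1 to model RAID-4.
    """
    return [size - parity_per_rg for size in rg_sizes]


# An aggregate built from identical 16-disk raid groups.
uniform = data_widths([16, 16, 16])

# An aggregate mixing the sizes mentioned above: 4, 16, 12 and 6 disks.
mixed = data_widths([4, 16, 12, 6])

distinct_uniform = sorted(set(uniform))
distinct_mixed = sorted(set(mixed))

print("uniform:", uniform, "distinct widths:", len(distinct_uniform))
print("mixed:  ", mixed, "distinct widths:", len(distinct_mixed))
```

In the uniform case every rg has the same stripe width; in the mixed case there are four different widths to account for.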
Man, Randall, you sure ask good questions. :)
-Adam