On Wed 23 Dec, 1998, sirbruce@ix.netcom.com wrote:
The only other thing I haven't seen mentioned is the number of disks. This can be a limiting factor in performance, particularly towards the lower end. Having more disks than you need will not make you appreciably faster, but having too few will be very noticeable.
-- End of excerpt from sirbruce@ix.netcom.com
Oh, too true, especially as disk capacities are growing faster than their access rates, so the data on each spindle is becoming relatively less accessible in high-throughput applications. Brian Wong's book has a lucid explanation for why more spindles of lower capacity can make a hell of a lot more sense than fewer, bigger disks.
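To put rough numbers on that (the per-disk op rate below is my own assumption, not a figure from Wong), here's a back-of-envelope sketch in Python:

    # Rough spindle arithmetic (the per-disk op rate is assumed):
    # a late-90s disk sustains on the order of 100 random ops/sec
    # regardless of its capacity.
    OPS_PER_DISK = 100

    for disks, size_gb in [(21, 9), (11, 18)]:
        print(f"{disks} x {size_gb}GB = {disks * size_gb}GB, "
              f"~{disks * OPS_PER_DISK} random ops/sec")

    # Both configurations hold roughly the same data (~190GB), but
    # the 21-spindle array services nearly twice the random ops/sec.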
In current filers, I don't know what the right ratio would be. I used to use about 100-200 ops/disk as a rule of thumb, but of course it depends on the kind of ops. If you have an array of 21-28 disks, then I doubt that's the limiting factor. If you only have 14, you might consider adding some more to the filesystem and seeing if that improves your write throughput. Adding more NVRAM to the filer is of course good too, if that's possible.
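As a sanity check on that rule of thumb (the target load here is purely hypothetical):

    import math

    target_ops = 3000    # hypothetical sustained NFS ops/sec
    ops_per_disk = 150   # middle of the 100-200 ops/disk rule of thumb

    spindles = math.ceil(target_ops / ops_per_disk)
    print(f"~{spindles} data spindles needed")  # -> 20, in the 21-28 band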
I'd speculate that the new multiple RAID group feature in DOT5 also has an important performance dimension, besides the reliability aspects.
For writes and degraded reads, the fewer disks in a RAID group, the higher the performance I'd expect from the RAID calculation.
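The intuition, as I understand it: with single-parity RAID, a block on a failed disk is rebuilt by XORing the corresponding block from every surviving member, so the work (and the number of disks that must be read) grows with the group size. A minimal sketch:

    def reconstruct_block(surviving_blocks):
        """Rebuild a failed disk's block by XORing the corresponding
        block from every surviving member (data and parity alike)."""
        rebuilt = bytes(len(surviving_blocks[0]))
        for block in surviving_blocks:
            rebuilt = bytes(a ^ b for a, b in zip(rebuilt, block))
        return rebuilt

    # A 14-disk group has to read 13 blocks to rebuild one;
    # a 7-disk group reads only 6.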
Now, assuming a volume spans a number of RAID groups, I'd presume the WAFL filesystem will still allocate blocks across all the RAID groups and attempt to write whole stripes into each one, thereby gaining the fullest bandwidth available from the platters, given the parity calculations that have to be done on the data.
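The payoff of whole stripes is that parity can be computed from the new data alone, with nothing read back from disk; a partial-stripe update would first have to read the old data and old parity. A sketch, assuming single-parity groups like the filer's RAID 4:

    def full_stripe_parity(new_blocks):
        """Whole-stripe write: parity comes from the new data alone,
        so nothing has to be read back from the platters."""
        parity = bytes(len(new_blocks[0]))
        for block in new_blocks:
            parity = bytes(a ^ b for a, b in zip(parity, block))
        return parity

    def partial_update_parity(old_block, old_parity, new_block):
        """Single-block update: the old block and old parity must be
        READ from disk first -- exactly the extra work a whole-stripe
        write avoids."""
        return bytes(a ^ b ^ c
                     for a, b, c in zip(old_block, old_parity, new_block))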
One interesting thing to note is how much the filer needs to read from disk while performing write-intensive work. If the filer isn't reading much from disk at all, then you're laying down fresh data straight from memory and the machine will be running at full tilt.
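One easy way to watch for this is the filer's sysstat command, run on the console or over rsh; it prints per-interval disk read and write rates, though the exact columns vary by Data ONTAP release:

    rsh filer sysstat 1

If the disk read column stays near zero under a heavy write load, you're in the happy case above.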
If there's quite a lot of reading going on, then the machine is having to work harder to lay the data down on the platters. This is more likely to happen when the volume is on the full side, has been running with some degree of churn for a while, and contains relatively small files, i.e. old data continually being removed and replaced by new data, as in Usenet news spools.
So, if I were hoping for high performance for a news spool on a filer, I'd be thinking of a 760 with many 9GB disks (the smallest available, I believe) in an unclustered configuration. I'd also play with the RAID group sizes until I'd struck a rough balance between cost, performance and resilience. I'd use Gb-ether into an Extreme switch. I'd toy with the no-atime option, and perhaps play with both NFS2 and NFS3 to see which gave the best performance. And I'd use different hosts for spooling the news and for serving it... but that's application-level stuff and far less easy to work on and comment on.
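For the NFS2 vs NFS3 comparison, the client-side mount options are the place to experiment; the flags below are Solaris-style, and the hostname, path and transfer sizes are illustrative:

    mount -F nfs -o vers=3,rsize=32768,wsize=32768 filer:/vol/news /news
    mount -F nfs -o vers=2,rsize=8192,wsize=8192 filer:/vol/news /news

NFS2 caps transfers at 8K, so NFS3's larger read/write sizes are one obvious thing to measure.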