But... I've read that NFS writes are much faster in NFS v3 than v2. This, of course, leads me to my question (ta-daa) :
That's news to me, unless you're referring to the ability of NFSv3 to use larger block sizes. For netnews, that's not likely to be of much benefit since most articles are very small.
...
Quote from 'Configuration and capacity planning for Solaris servers' by Brian Wong :
...
In particular, the new protocol implements the following changes:
- It permits write operations to be performed much more quickly through the use of a two-phase commit protocol, while still maintaining the server's stateless view of the protocol.
Asynchronous writes, as Guy guessed. He and Beepy noted that NetApp does not currently implement async writes (we accept them and note in the response that they were done synchronously) and that, with our NVRAM, they wouldn't provide a substantial benefit as they can with general purpose servers having no special hardware support.
Beyond that, "NFS Version 3 Design and Implementation" (from Summer '94 Usenix) notes that async writes "are most effective for large files." Netnews mostly generates lots of little files, so even if async writes had much value on a filer, they wouldn't have much effect for netnews.
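For anyone who wants to see the "two-phase commit" idea concretely, here is a rough toy model in Python. It is only a sketch, not real NFS code: ToyServer, client_write_file, and the honors_unstable flag are made-up names for illustration. The stable_how values (UNSTABLE, DATA_SYNC, FILE_SYNC) and the write verifier are from the NFSv3 protocol. With honors_unstable=False the server behaves as described above for the filer: it accepts the unstable write, does it synchronously, and says so in the reply. The toy also assumes the client checks the verifier only at commit time.

```python
from enum import IntEnum


class StableHow(IntEnum):
    """stable_how values from the NFSv3 WRITE procedure."""
    UNSTABLE = 0
    DATA_SYNC = 1
    FILE_SYNC = 2


class ToyServer:
    """Hypothetical in-memory stand-in for an NFSv3 server (illustration only)."""

    def __init__(self, honors_unstable=True):
        self.honors_unstable = honors_unstable
        self.verifier = 0   # changes whenever the server reboots
        self.cache = {}     # volatile: lost on reboot
        self.disk = {}      # stable storage

    def write(self, offset, data, stable):
        # Phase 1: an UNSTABLE write may be acknowledged from volatile cache.
        if stable == StableHow.UNSTABLE and self.honors_unstable:
            self.cache[offset] = data
            return StableHow.UNSTABLE, self.verifier
        # Otherwise the data is stable before the reply, and the reply says so.
        self.disk[offset] = data
        return StableHow.FILE_SYNC, self.verifier

    def commit(self):
        # Phase 2: flush cached data to stable storage, return the verifier.
        self.disk.update(self.cache)
        self.cache.clear()
        return self.verifier

    def reboot(self):
        self.cache.clear()
        self.verifier += 1


def client_write_file(server, chunks, crash_before_commit=False):
    """Write chunks asynchronously, then commit. Returns how many were resent."""
    pending = {}
    write_verf = None
    for offset, data in chunks:
        committed, verf = server.write(offset, data, StableHow.UNSTABLE)
        if committed == StableHow.UNSTABLE:
            # The client must keep the data until a COMMIT succeeds.
            pending[offset] = data
            write_verf = verf
    if crash_before_commit:
        server.reboot()  # simulate the server losing its cache
    resent = 0
    if pending and server.commit() != write_verf:
        # Verifier changed: the server rebooted, so resend everything uncommitted.
        for offset, data in pending.items():
            server.write(offset, data, StableHow.FILE_SYNC)
            resent += 1
    return resent
```

Against ToyServer(honors_unstable=False) the client never has anything pending, so the commit step is skipped entirely. With lots of one-chunk files, as netnews generates, there is also little to batch between the write and the commit in the first place.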
Is this not applicable to NetApp servers?
Bottom line: no.
Unfortunately, it's the "feeder" machine (the one which is writing to the filer) that gets hit the hardest by using NFSv3. One operation that is hit particularly hard is renumbering the active file, though there are others that behave similarly.
But INND doesn't scan directories very much, does it?
No, it doesn't, but other things that run on the same machine do.
I can imagine that the renumber operation would suffer, but it doesn't renumber all that frequently.
In the case of renumbering the active file, we saw one site drop from 12 hours to a few tens of minutes. Still not too horrible if you only run it once per night, except that it chews up a *lot* of resources on the filer and can have a severe impact on useful work.
Isn't it nnrpd that accounts for most of the readdir+ performance waste when using NFS v3?
*Any* operation that triggers READDIR+ may severely impact performance. It only happens from nnrpd when you change to or list a group other than the one you've most recently looked at (or it's been a while). Since that's not something users do that frequently, "most" of the waste depends on how many users you have. Active file renumbering hammers at it continuously until it's done, so unless you have a pretty large number of users, I doubt nnrpd would account for "most" of the hit.
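To make the READDIR+ waste a bit more concrete, here is a toy count in Python. It is only a sketch under loud assumptions: ToyDirectoryServer and article_range are hypothetical names, and a real server's per-entry cost is not one neat "attribute fetch". The one protocol fact it leans on is that READDIRPLUS returns attributes (and file handles) for every entry, while plain READDIR returns names only. It also assumes a renumber-style scan needs nothing but the article numbers, which are the file names.

```python
class ToyDirectoryServer:
    """Hypothetical server that counts per-entry attribute fetches."""

    def __init__(self, names):
        self.names = list(names)
        self.attr_fetches = 0

    def _getattr(self, name):
        # Stand-in for whatever work the server does to look up one entry.
        self.attr_fetches += 1
        return {"name": name, "size": 0}

    def readdir(self):
        # NFSv2-style READDIR: names only.
        return list(self.names)

    def readdirplus(self):
        # READDIRPLUS: every entry comes back with its attributes.
        return [(name, self._getattr(name)) for name in self.names]


def article_range(names):
    """Lowest and highest article number in a group directory, or None."""
    numbers = [int(n) for n in names if n.isdigit()]
    return (min(numbers), max(numbers)) if numbers else None


if __name__ == "__main__":
    server = ToyDirectoryServer(str(i) for i in range(1, 1001))

    low_high = article_range(server.readdir())
    print("READDIR:", low_high, "attribute fetches:", server.attr_fetches)

    low_high = article_range(name for name, _ in server.readdirplus())
    print("READDIR+:", low_high, "attribute fetches:", server.attr_fetches)
```

Both scans come up with the same answer; the second one just makes the server do per-entry work that the caller throws away, and a renumber repeats that for every group in the active file.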
-- Karl