It really seems to be related to the high volume of traffic generated by moving /var/mail over to the filer. Note that we use dot-locking only and therefore are not troubled by the well-known problems with fcntl-based locking over NFS.
What problems, other than buggy lock managers that no longer exist in Solaris 2.6?
Umm... unless I'm waaaaay off base, these numbers seem outrageously low. I'm kinda boggled. I was going to let this pass, but no, I must comment.
In 1995-6, even with buggy lock managers, we ran the mail for a certain large ISP (~25,000 users) here in town. We pumped an average of 450,000 mail messages a day through a pair of dual-85MHz SPARCstation 20s running SunOS 4.1.4, connected via FDDI to a single F330. <Pause. Allow that to sink in. 85MHz processors. *SunOS 4.1.4*. FDDI. A single F330 w/8MB of NVRAM. Those are museum pieces nowadays.>
One box handled incoming SMTP, all local delivery (sendmail+procmail), and POP; the other handled some unholy number of Majordomo lists. I mean, this was with hardware that nowadays people scoff at, running ancient software releases and NFS V2 over a single network interface. I find it impossible to believe that a 4-way UltraII and an F760 can't handle the load. (Yeah, the average mail message with all the crap "modern" mailers add on is probably several KB larger, but that's almost never the real bottleneck.)
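For a sense of scale, here is a back-of-the-envelope sketch in Python that turns a daily message count into a per-second rate. The 450,000 figure is the one quoted above; the function names and the peak factor of 3 are purely illustrative assumptions, not measurements from either site.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rate(messages_per_day):
    """Average messages per second, assuming load spread evenly over the day."""
    return messages_per_day / SECONDS_PER_DAY


def peak_rate(messages_per_day, peak_factor):
    """Rough peak rate; peak_factor is an assumed busy-hour multiplier."""
    return average_rate(messages_per_day) * peak_factor


if __name__ == "__main__":
    old_setup = 450_000  # messages/day on the SunOS 4.1.4 + F330 setup
    print(f"average: {average_rate(old_setup):.1f} msgs/sec")
    # The factor of 3 is a guess for illustration only.
    print(f"assumed 3x peak: {peak_rate(old_setup, 3):.1f} msgs/sec")
```

Plugging in your own daily count gives a quick way to compare the two setups on the same footing before blaming the hardware.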
Some questions that may be a bit obvious: Is your 2.6 box patched to current levels? Why NFS V2? How much NVRAM in the filer? Is the network bogged down, or the switch misconfigured? How many network interfaces, of what type/speed? What does snoop/tcpdump/your network sniffer say? How many disks in the volume that contains /var/mail? Is this filer running CIFS & NFS, or is it purely an NFS box? If /var/mail is on the filer, what about home directories (i.e., is your mailer bogging down doing ".forward" lookups)? Are you using NIS and/or DNS on the filer? What version of sendmail/qmail/whatever? Are you using procmail for delivery? Why not? :-) Is disk scrubbing kicking in? Are you taking snapshots too frequently? Do you have cron jobs walking the filesystem or other network clients hitting the filer or the network when these slowdowns are observed on the mail server? And really, why NFS V2?
(I'm not really a smartass, I just play one on the Internet. :-) These are just some of the questions I'd look to answer before poking at /etc/system. Given that I've seen Sun+NetApp equipment two generations older doing three times the volume you're seeing, I can't believe there isn't some other underlying problem that's plaguing your setup. It's precisely because we pounded the snot out of that old gear and it handled the load that I still recommend and use it to this day. (Well, not SunOS 4.1.4, R.I.P. And not that there's enough money in the world to convince me to ever, ever work for an ISP again! :-) I'm very curious to know what the solution turns out to be.
-- Chris
-- Chris Lamb, Unix Guy MeasureCast, Inc. 503-241-1469 x247 skeezics@measurecast.com