On Mon, 18 Oct 1999 tkaczma@gryf.net wrote:
You have valid points. Do you think that the decreased read performance is offset by the reduced need to read all mail each time you adjust the mailbox?
On our business virtual hosting servers, customers are allowed to keep as much mail as they want (well, up to a per-customer hard limit of 1 gigabyte). There are users with two or three hundred-megabyte mailbox files who have Eudora or Outlook set to check for new mail every 5 minutes. Depending on what else is going on (the servers also handle web hosting), each POP3 process may only be able to pull in 1 or 2 megabytes per second. When you have a thousand domains per server, and tens to hundreds of users per domain, the filers literally spend 95% of their CPU scanning mailboxes that, for the most part, don't change very much.
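To put rough numbers on that, here is a back-of-the-envelope sketch in Python. The 200 MB mailbox size is an assumed round figure for illustration; the 1 to 2 MB/s throughput and the 5-minute polling interval are the ones quoted above. The function names are made up for this example.

```python
def scan_seconds(mailbox_mb, throughput_mb_per_s):
    """Time for one POP3 process to read the whole mailbox file once."""
    return mailbox_mb / throughput_mb_per_s


def duty_cycle(mailbox_mb, throughput_mb_per_s, poll_interval_s):
    """Fraction of each polling interval spent scanning this one mailbox."""
    return scan_seconds(mailbox_mb, throughput_mb_per_s) / poll_interval_s


if __name__ == "__main__":
    # Assumed: 200 MB mailbox, polled every 5 minutes (300 s).
    for rate in (1.0, 2.0):
        secs = scan_seconds(200, rate)
        busy = duty_cycle(200, rate, 300)
        print(f"{rate} MB/s: {secs:.0f} s per check, {busy:.0%} of the interval")
```

Under those assumptions a single heavy user keeps a POP3 process scanning for somewhere between a third and two thirds of every polling interval, which is how a handful of such users per domain can eat a filer.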
Switching to a split mailbox format will absolutely help my situation. The time needed to generate a mailbox index is now proportional to the number of messages in the mailbox, not the size of the mailbox. With average message size skyrocketing (doesn't anyone send e-mail without 100K attachments anymore?!), this makes sense. I'm happy because I get back 90% of my filers' CPU and NFS ops. The customer is happy because now it only takes them a few seconds to check for new messages, not a couple of minutes.
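A minimal sketch of why the cost changes, assuming a classic "From "-delimited mailbox file on one side and a one-file-per-message directory on the other. This is not what any particular POP3 server does; `index_mbox` and `index_split` are hypothetical names for illustration only.

```python
import os


def index_mbox(path):
    """Index a monolithic mailbox.

    Every byte has to be read to find the "From " separator lines,
    so the cost grows with the size of the file.
    Returns (message start offsets, total bytes read).
    """
    offsets = []
    pos = 0
    prev_blank = True  # the first line of the file may start a message
    with open(path, "rb") as f:
        for line in f:
            if prev_blank and line.startswith(b"From "):
                offsets.append(pos)
            prev_blank = line in (b"\n", b"\r\n")
            pos += len(line)
    return offsets, pos


def index_split(dirpath):
    """Index a split mailbox.

    One directory entry and one stat per message; message bodies are
    never opened, so the cost grows with the number of messages.
    Returns a sorted list of (file name, size in bytes).
    """
    entries = []
    with os.scandir(dirpath) as it:
        for entry in it:
            if entry.is_file():
                entries.append((entry.name, entry.stat().st_size))
    return sorted(entries)
```

The first function touches every byte of every attachment just to count messages; the second only asks the filesystem for names and sizes.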
Keeping old mail in one file would be great because it would save space and inodes and speed up backups, but rewriting all of those messages in order to remove one from the middle would be costly in performance.
I haven't done any before-and-after comparisons yet, but I imagine incremental or differential backups would go much faster with split mailboxes. A user with a 100MB monolithic mailbox who receives a couple of new messages during the day will require the entire mailbox to be backed up to tape. With a split mailbox, only those two messages need to be sent to tape. OTOH, deleted messages will be more difficult to track, unless your backup software can keep tabs on which files have disappeared or been renamed since the last full backup.
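As a rough illustration of both halves of that, here is a sketch of how an incremental pass over a split mailbox might pick its files and notice deletions. It assumes the backup tool remembers the time of the last backup and the set of file names it saw then; `plan_incremental` is a hypothetical helper, not the behavior of any real backup product.

```python
import os


def plan_incremental(dirpath, last_backup_time, last_backup_names):
    """Decide what an incremental backup of a split mailbox has to do.

    last_backup_time  -- mtime cutoff (seconds since the epoch), assumed
                         to be recorded by the backup tool
    last_backup_names -- file names seen at the previous backup (assumed)

    Returns (files to send to tape, files that have vanished since then).
    """
    current = {}
    with os.scandir(dirpath) as it:
        for entry in it:
            if entry.is_file():
                current[entry.name] = entry.stat().st_mtime

    # Only messages written after the last backup need to go to tape.
    to_backup = sorted(n for n, mtime in current.items() if mtime > last_backup_time)

    # Deleted (or renamed) messages only show up by comparing name lists.
    vanished = sorted(set(last_backup_names) - set(current))
    return to_backup, vanished
```

With a monolithic mailbox the same check degenerates to a single file whose mtime is always newer than the last backup, so the whole thing goes to tape every time.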