On Mon, 18 Oct 1999, Jim McCoy wrote:
- Mail messages are small. There is a simple bimodal distribution to
internet mail message size, they are either very small or very large. The small files will cost you storage bits because the filer uses 4K to store each 500byte message. The small file sizes will also keep your read chain length very low so we will get significantly decreased performance on your backups of this volume and you will notice some decreased read performance on the filer in general.
You have valid points. Do you think that the decreased read performance is offset by no longer having to read all the mail each time you adjust the mailbox? I'm mostly interested in IMAP/POP type environments where most of the mail that is read and deleted is new mail. Old mail simply piles up. Keeping old mail in one file would be great because it would save space and inodes and speed up backups, but rewriting all of those messages in order to remove one from the middle would be costly in performance.
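To make the deletion cost concrete, here is a rough Python sketch of the two layouts. The function names, the "From " separator handling, and the choice to rewrite the whole spool are my own simplifying assumptions, not how any particular mail server does it; a real implementation might only rewrite from the removed message onward, and would need locking.

```python
import os

def split_spool(data):
    """Split mbox-style bytes into messages at lines starting with b"From "."""
    messages = []
    for line in data.splitlines(keepends=True):
        if line.startswith(b"From ") or not messages:
            messages.append(b"")
        messages[-1] += line
    return messages

def remove_from_spool(spool_path, index):
    """Remove one message from a single-file spool; return bytes rewritten."""
    with open(spool_path, "rb") as fp:
        messages = split_spool(fp.read())
    del messages[index]
    remaining = b"".join(messages)
    # Every surviving message is written out again, however old it is.
    tmp_path = spool_path + ".tmp"
    with open(tmp_path, "wb") as fp:
        fp.write(remaining)
    os.replace(tmp_path, spool_path)
    return len(remaining)

def remove_from_dir(message_path):
    """Remove one message stored as its own file; no other message is touched."""
    os.unlink(message_path)
    return 0
```

With one file per message the cost of a delete stays constant, while the single-file cost grows with the amount of old mail that has piled up.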
One of our problems is mailbox "corruption." Not really corruption in the usual sense, but header features which made mail readers croak, like a couple hundred (if not thousand) address "To:" lines. Sifting through individual files and unlinking a single file is easier than parsing one huge mail spool file and programmatically removing such messages. This is especially true since one need only check messages whose files were created since the last time mail was accessed successfully.
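The check I have in mind could look something like the Python sketch below. The function name, the 200-recipient threshold, and the use of file mtime as a stand-in for "created since" are assumptions for illustration only. It reads just the headers of each new file and flags messages with too many "To:" lines or addresses.

```python
import os
from email.parser import BytesHeaderParser
from email.utils import getaddresses

def find_suspect_messages(mail_dir, last_ok_time, max_recipients=200):
    """Return paths of new messages whose To: headers look oversized.

    last_ok_time is a Unix timestamp of the last successful mailbox access
    (hypothetical bookkeeping; something else would have to record it).
    """
    parser = BytesHeaderParser()
    suspects = []
    for entry in os.scandir(mail_dir):
        if not entry.is_file():
            continue
        # Assumption: mtime approximates when the message file was created.
        if entry.stat().st_mtime <= last_ok_time:
            continue
        with open(entry.path, "rb") as fp:
            headers = parser.parse(fp)  # stops after the header block
        to_values = headers.get_all("To", [])
        recipients = getaddresses(to_values)
        if len(to_values) > max_recipients or len(recipients) > max_recipients:
            suspects.append(entry.path)
    return sorted(suspects)
```

Anything returned can then be unlinked or moved aside without touching the rest of the mailbox.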
What do you think?
Tom