Dear list,
I would like to survey past and current experiences with deploying and maintaining an IMAP+POP maildir cluster (e.g. Courier-IMAP), backed by NetApp filers, with > 50k mailboxes. Of particular interest are any problems you faced, how you solved them, recommended design decisions, common pitfalls, etc. You can e-mail the list or me privately, whichever you think is more appropriate.
The reason is this: I'm having problems with _my_ maildir setup on NetApp. I have a feeling something is wrong in my setup that I can't pin down yet, so I'm looking for any clues you might have from past experience or knowledge (esp. you NetApp engineers out there).
Sorry if all this sounds too general. I have a lot on my mind and hope this is a good starting point for in-list or private discussion.
Thanks! :)
--mendonan
Senandung Mendonan mendonan@absolute-p.ath.cx writes:
I would like to survey past and current experiences with deploying and maintaining an IMAP+POP maildir cluster (e.g. Courier-IMAP), backed by NetApp filers, with > 50k mailboxes. Of particular interest are any problems you faced, how you solved them, recommended design decisions, common pitfalls, etc. You can e-mail the list or me privately, whichever you think is more appropriate.
I'd be interested in this as well, though we have an order of magnitude fewer users here. We use the qmail-ldap MTA running on a handful of 1U boxes, Maildirs on the NetApp, and qmail-pop3d and courier-imap for retrieval.
The only problem we currently see is that, under artificially heavy load to a single recipient, the quota file can get hosed: qmail-local deliveries hang and NFS usage goes way up. If we kill those hung local deliveries, the mail then gets delivered, so nothing is lost, and the quota file gets recomputed and recreated. I expect it's because more than one box is contending for the one quota file, unlike maildir messages, which are one file per message and uniquely named to prevent contention.
So if others are using NetApp for Maildirs, I'd like to hear about your experiences, too.
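To illustrate why per-message files avoid the contention described above, here is a minimal Python sketch of maildir-style delivery. It assumes the conventional `<seconds>.<pid>_<counter>.<host>` naming; the helper names `unique_name` and `deliver` are made up for illustration, and this is not qmail-local's actual code. Each delivery creates its own file under tmp/ and renames it into new/, so two delivering hosts never write to the same file, whereas a single shared quota file is rewritten by every host that delivers.

```python
import itertools
import os
import socket
import time

# Per-process counter so two deliveries in the same second get distinct names.
_counter = itertools.count()

def unique_name(host=None):
    """Build a maildir-style unique name: <seconds>.<pid>_<counter>.<host>."""
    host = host or socket.gethostname()
    # '/' and ':' must not appear in the host part of a maildir name.
    host = host.replace("/", "\\057").replace(":", "\\072")
    return "%d.%d_%d.%s" % (time.time(), os.getpid(), next(_counter), host)

def deliver(maildir, message_bytes, host=None):
    """Write the message under tmp/, then rename it into new/."""
    name = unique_name(host)
    tmp_path = os.path.join(maildir, "tmp", name)
    new_path = os.path.join(maildir, "new", name)
    # O_EXCL makes the create fail rather than clobber an existing file.
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(message_bytes)
        f.flush()
        os.fsync(f.fileno())
    # The message only becomes visible in new/ once it is completely written.
    os.rename(tmp_path, new_path)
    return new_path
```

Because no delivery ever touches another delivery's file, no locking is needed for the messages themselves; only shared bookkeeping, such as a quota file, has to cope with concurrent writers.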
On Mon, Apr 26, 2004 at 04:28:21PM -0400, Chris Shenton wrote:
So if others are using NetApp for Maildirs, I'd like to hear about your experiences, too.
We are.
One of the hackeries I did was from a patch someone else had done to Courier-IMAP for the POP server.
It likes to 'read the message 1 byte at a time' to verify the size.
I changed all occurrences of that to instead just use stat() on the file and return the size.
This dropped the usage over NFS from 160Mbps to just over 20Mbps over the gig interface.
30K mailboxes, Postfix using Maildir storage, Courier-IMAP with the above 'fix', servicing POP and IMAP to our customers.
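The difference between the two approaches can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Courier-IMAP's code: the function names are hypothetical, and `size_by_reading` merely mimics a byte-at-a-time scan. Reading pulls the whole message across NFS, while stat() needs only the file's attributes.

```python
import os

def size_by_reading(path, chunk=1):
    """Count bytes by reading the file; chunk=1 mimics a byte-at-a-time scan."""
    total = 0
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk)
            if not data:
                break
            total += len(data)
    return total

def size_by_stat(path):
    """Ask the filesystem for the size without reading any message data."""
    return os.stat(path).st_size
```

Both return the on-disk size. If messages are stored with bare LF line endings, that size is smaller than the size a POP3 client receives once line endings are sent as CRLF.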
On Mon, Apr 26, 2004 at 04:27:55PM -0500, Mike Horwath wrote:
On Mon, Apr 26, 2004 at 04:28:21PM -0400, Chris Shenton wrote:
So if others are using NetApp for Maildirs, I'd like to hear about your experiences, too.
We are.
One of the hackeries I did was from a patch someone else had done to Courier-IMAP for the POP server.
It likes to 'read the message 1 byte at a time' to verify the size.
I changed all occurrences of that to instead just use stat() on the file and return the size.
This dropped the usage over NFS from 160Mbps to just over 20Mbps over the gig interface.
30K mailboxes, Postfix using Maildir storage, Courier-IMAP with the above 'fix', servicing POP and IMAP to our customers.
Even better - hack the LDA to encode the size into the name of the message, like './new/1082945416.7901_0.a.lds,S=2161', which allows the POP server to build the UIDL and LIST responses directly from READDIR. No stats, no opens, no reads. It's a beautiful thing. Users are able to load boxes of > 100k messages in a fraction of a second with virtually no load on the NetApps.
This hack breaks RFC compliance (by not counting newlines as two bytes), but we've been running like this for >1yr on about 40k boxes without any complaints.
You can, of course, hack the LDA to scan the messages and insert the RFC-compliant size if you feel so inclined.
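A rough Python sketch of the idea, assuming the ',S=<bytes>' filename suffix shown above. The function names and the choice of the base filename as the UIDL are illustrative assumptions, not the actual patch. The listing needs only a directory read when every name carries a size, and `rfc_size` shows how an LDA could compute the CRLF-counted size before encoding it into the name.

```python
import os
import re

# Matches the ',S=<bytes>' suffix in a name such as
# '1082945416.7901_0.a.lds,S=2161'.
_SIZE_RE = re.compile(r",S=(\d+)")

def size_from_name(filename):
    """Return the size encoded in the filename, or None if it is absent."""
    m = _SIZE_RE.search(filename)
    return int(m.group(1)) if m else None

def pop3_list(maildir):
    """Build (uidl, size) pairs from directory entries alone where possible."""
    entries = []
    for sub in ("new", "cur"):
        for name in sorted(os.listdir(os.path.join(maildir, sub))):
            size = size_from_name(name)
            if size is None:
                # Fall back to stat() for messages delivered without ',S='.
                size = os.stat(os.path.join(maildir, sub, name)).st_size
            # Assumption: use the base name (no flags, no ',S=') as the UIDL.
            uidl = name.split(":", 1)[0].split(",", 1)[0]
            entries.append((uidl, size))
    return entries

def rfc_size(message_bytes):
    """Size as sent over POP3, where every line ends in CRLF."""
    bare_lf = message_bytes.count(b"\n") - message_bytes.count(b"\r\n")
    return len(message_bytes) + bare_lf
```

An LDA that wants the RFC-compliant figure would put `rfc_size(message)` rather than the on-disk length into the ',S=' suffix at delivery time.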
Other important hacks to Courier are to make it batch moves and unlinks until the end of the session, allowing for faster logins. Do any box maintenance, like moving messages from /new to /cur or removing messages, AFTER the client has disconnected. I also recommend truncating the hostname in the maildir message name to save on directory space. Our only concern with Maildir on the NetApp is that a user could reach the max_directory size and strange things would start to happen.
The result is faster response to the customer, less load on the filers and better all-around happiness for everyone involved. Why Courier doesn't run like this out of the box, I don't know. It's so obviously the best way to do it.
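The batching described above might look something like the following Python sketch. It is a hypothetical illustration, not the Courier patch: the class and method names are invented, and the ':2,' suffix added on the move to cur/ follows the usual maildir convention. Renames and unlinks are only queued during the session and are applied once the client has disconnected.

```python
import os

class DeferredMaildirOps:
    """Queue renames and unlinks during a session; apply them at disconnect."""

    def __init__(self):
        self._renames = []
        self._unlinks = []

    def mark_seen(self, maildir, name):
        """Remember that new/<name> should move to cur/<name>:2, later."""
        src = os.path.join(maildir, "new", name)
        dst = os.path.join(maildir, "cur", name + ":2,")
        self._renames.append((src, dst))

    def mark_deleted(self, path):
        """Remember that this message file should be removed later."""
        self._unlinks.append(path)

    def flush(self):
        """Run after the client disconnects: do all filesystem work at once."""
        deleted = set(self._unlinks)
        for src, dst in self._renames:
            # Skip the rename for messages that will be deleted anyway.
            if src not in deleted:
                os.rename(src, dst)
        for path in self._unlinks:
            try:
                os.unlink(path)
            except FileNotFoundError:
                pass  # already gone, e.g. removed by another session
        self._renames.clear()
        self._unlinks.clear()
```

During the session the server works from its in-memory listing, so login and LIST do not wait on any NFS renames or removes.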
On Mon, 26 Apr 2004, Kelsey Cummings wrote:
Even better - hack the LDA to encode the size into the name of the message, like './new/1082945416.7901_0.a.lds,S=2161', which allows the POP server to build the UIDL and LIST responses directly from READDIR. No stats, no opens, no reads. It's a beautiful thing. Users are able to load boxes of > 100k messages in a fraction of a second with virtually no load on the NetApps.
Care to share your patch? Or at least some clue of what to hack?
Other important hacks to Courier are to make it batch moves and unlinks until the end of the session, allowing for faster logins. Do any box maintenance, like moving messages from /new to /cur or removing messages, AFTER the client has disconnected. I also recommend truncating the hostname in the maildir message name to save on directory space. Our only concern with Maildir on the NetApp is that a user could reach the max_directory size and strange things would start to happen.
And this patch too?
--mendonan "Yang mimpikan secangkir kopi panas dengan selimut.." ("Dreaming of a cup of hot coffee, and a blanket..")
On Tue, Apr 27, 2004 at 08:00:09AM +0800, Senandung Mendonan wrote:
On Mon, 26 Apr 2004, Kelsey Cummings wrote:
Even better - hack the LDA to encode the size into the name of the message, like './new/1082945416.7901_0.a.lds,S=2161', which allows the POP server to build the UIDL and LIST responses directly from READDIR. No stats, no opens, no reads. It's a beautiful thing. Users are able to load boxes of > 100k messages in a fraction of a second with virtually no load on the NetApps.
Care to share your patch? Or at least some clue of what to hack?
Look for the maildir++ patch for Postfix local/virtual delivery agents.
On Mon, Apr 26, 2004 at 04:27:55PM -0500, Mike Horwath wrote:
On Mon, Apr 26, 2004 at 04:28:21PM -0400, Chris Shenton wrote:
So if others are using NetApp for Maildirs, I'd like to hear about your experiences, too.
Did I say it was the only way to go?
On Mon, 26 Apr 2004, Mike Horwath wrote:
One of the hackeries I did was from a patch someone else had done to Courier-IMAP for the POP server.
It likes to 'read the message 1 byte at a time' to verify the size.
I changed all occurrences of that to instead just use stat() on the file and return the size. This dropped the usage over NFS from 160Mbps to just over 20Mbps over the gig interface.
Can you share your patch? Was it based on this patch:-
http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&...
30K mailboxes, Postfix using Maildir storage, Courier-IMAP with the above 'fix', servicing POP and IMAP to our customers.
--mendonan
On Tue, Apr 27, 2004 at 07:57:47AM +0800, Senandung Mendonan wrote:
Can you share your patch? Was it based on this patch:-
http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&...
I patched it by hand once I learned of this misfeature.
Sorry :(
Chris Shenton wrote:
| So if others are using NetApp for Maildirs, I'd like to hear about
| your experiences, too.
My experience has been that a netapp (F840, ~1/2T in my case) has no problem at all with 200,000+ mailboxes. Postfix for the MTA, Maildir delivery. We utilize a cluster of FreeBSD (4.9) systems, about 10 systems for smtp delivery, and 3 systems for imap.
The slowest thing that I deal with in my mail system is the time it takes to filter the email for spam and viruses.
-Gabriel
On Tue, 27 Apr 2004, Gabriel wrote:
My experience has been that a netapp (F840, ~1/2T in my case) has no problem at all with 200,000+ mailboxes. Postfix for the MTA, Maildir delivery. We utilize a cluster of FreeBSD (4.9) systems, about 10 systems for smtp delivery, and 3 systems for imap.
The slowest thing that I deal with in my mail system is the time it takes to filter the email for spam and viruses.
Question: can you share (in general) what you use to filter spam and viruses? Were the filters in place at the same time as, or before, the NetApp deployment?
I'm trying to figure out whether stopping mail from hitting the NetApp in the first place (discarding spam and virus mails at the MTA level, e.g. via sendmail-milter+dnsbl+amavisd+clamav) will greatly help alleviate filer performance woes, and I'm hoping to get some actual "before" and "after" figures.
Thanks to all who have responded so far; it has helped me (and perhaps others too) greatly. :)
--mendonan
We solved our spam/virus issues and the mail load on our filers by installing a pair of spam/virus firewalls in front of our mail cluster. The product is by a company called Barracuda Networks (www.barracudanetworks.com). It's really simple to set up and works well with Exchange. With this setup, the spam/virus firewall scans for spam and viruses and then forwards the message to our mail cluster.
Best regards,
Blake Folgner
-----Original Message-----
From: owner-toasters@mathworks.com [mailto:owner-toasters@mathworks.com] On Behalf Of Senandung Mendonan
Sent: Tuesday, April 27, 2004 7:54 PM
To: Gabriel
Cc: Chris Shenton; toasters@mathworks.com
Subject: Re: REQ: Survey: Maildir on Netapp Filers
On Tue, 27 Apr 2004, Gabriel wrote:
My experience has been that a netapp (F840, ~1/2T in my case) has no problem at all with 200,000+ mailboxes. Postfix for the MTA, Maildir delivery. We utilize a cluster of FreeBSD (4.9) systems, about 10 systems for smtp delivery, and 3 systems for imap.
The slowest thing that I deal with in my mail system is the time it takes to filter the email for spam and viruses.
Question: can you share (in general) what you use to filter spam and viruses? Were the filters in place at the same time as, or before, the NetApp deployment?
I'm trying to figure out whether stopping mail from hitting the NetApp in the first place (discarding spam and virus mails at the MTA level, e.g. via sendmail-milter+dnsbl+amavisd+clamav) will greatly help alleviate filer performance woes, and I'm hoping to get some actual "before" and "after" figures.
Thanks to all who have responded so far; it has helped me (and perhaps others too) greatly. :)
--mendonan
On Tue, 27 Apr 2004, Blake Folgner wrote:
We solved our spam/virus issues and the mail load on our filers by installing a pair of spam/virus firewalls in front of our mail cluster. The product is by a company called Barracuda Networks (www.barracudanetworks.com). It's really simple to set up and works well with Exchange. With this setup, the spam/virus firewall scans for spam and viruses and then forwards the message to our mail cluster.
Back to my question: after the spam/virus firewalls were deployed, in addition to the obvious protection, did you find that the load on your mail servers and/or filers decreased dramatically?
Thanks for sharing your experience...
--mendonan
Senandung Mendonan wrote:
| Question: can you share (in general) what you use to filter spam and viruses?
| Were the filters in place at the same time as, or before, the NetApp deployment?
Sure. We use spamassassin and amavisd-new for scanning. Scanning takes place on each mail server, with local (memory-filesystem-based) storage for the temporary files. The only time data hits the netapp is for final delivery of the message.
| I'm trying to figure out whether stopping mail from hitting the NetApp in the
| first place (discarding spam and virus mails at the MTA level, e.g. via
| sendmail-milter+dnsbl+amavisd+clamav) will greatly help alleviate filer
| performance woes, and I'm hoping to get some actual "before" and "after"
| figures.
You are indeed correct. You quite obviously don't want to touch the netapp unless you have to; NFS is expensive.
Gabriel
--
Gabriel Cain                     www.dialupusa.net
Senior Systems Administrator     gabriel@dialupusa.net
PGP fingerprint: C0B4 C6BF 13F5 69D1 3E6B CD7C D4C8 2EA4 2B08 1C6D
Technology for the sake of business.