Contributed by Dengue from the jimmy-o'gorman-strikes-again dept.
In this paper I am going to write about my experiences with building a mail system on FreeBSD. I hope to cover what our original goal was, mistakes we made on the way, problems we ran into, and what the end result was. I will cover the tools we used and why we chose them. This project was not my work alone: I worked with a friend and co-worker who helped me along the way and did much of the system himself.
My employer is a regional community site providing local news to the Omaha area. Ever since the beginning, the plan had been to offer free web-based e-mail sometime in the future. No one at the company had ever done anything similar before, and neither had I. After I heard about these plans and volunteered for the challenge, the task landed in my lap. My first thought was to look into purchasing a commercial software package to accomplish the goal. Our organization is incessantly obsessed with spending money, and never before had the gray hairs considered using an open source solution to fix their problem. After some investigation of commercial services, we learned that, for the number of accounts we potentially required, we would be charged upwards of $100,000, so we decided to push towards rolling our own system.
We needed to support 50,000 users out of the gate and have virtually unlimited scalability.
The Back Side
Once the decision was made to grow our own solution, we just had to figure out how to do it.
One incredibly important detail was for us to stay away from requiring UNIX accounts for all of our users, reasons for which should be all too obvious. So with that in mind I looked for mail servers that supported alternative authentication methods.
After some digging around in the ports tree, Freshmeat, and Usenet we came up with a decent plan. We would use Postfix as our MTA, Cyrus as the mail server, and OpenLDAP for LDAP authentication.
We chose the Postfix and Cyrus combination for a few reasons. First, there were modifications out for them that would allow you to authenticate off of LDAP. Second, from the information that we read, they appeared to be pretty nice programs with quite a following. (MTAs are pretty much a matter of religion, and I am a Postfix zealot. I am not really interested in arguing MTAs with anyone, but if you are thinking about choosing an MTA, give Postfix a good hard look.) Cyrus fit like a glove. It has some very nice features on the administrative side and its mailbox directory structure is interesting. Because mail is stored in a quasi-maildir format, mail retrieval and indexing are very fast and not prone to corruption. Cyrus also comes with a number of great "repair" utilities. The only potential problem is inode allocation, since every message is stored as its own file. Treat your Cyrus partitions as you would a Usenet mail storage partition!
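To make the inode concern concrete, here is a rough back-of-the-envelope sketch in Python. Only the 50,000-user figure comes from our requirements; the messages per user, folders per user, metadata files per folder, and partition size are made-up assumptions for illustration, so plug in your own numbers.

```python
def estimate_inodes(users: int, messages_per_user: int,
                    folders_per_user: int, metadata_files: int) -> int:
    """Estimate inodes for a one-file-per-message mail store.

    Each message takes one inode. Each folder takes one inode for its
    directory plus an assumed number of per-folder metadata files.
    """
    per_user = messages_per_user + folders_per_user * (1 + metadata_files)
    return users * per_user


def bytes_per_inode(partition_bytes: int, inodes: int) -> int:
    """Largest bytes-per-inode density that still yields enough inodes."""
    return partition_bytes // inodes


# 50,000 users is from the article; every other number is an assumption.
needed = estimate_inodes(users=50_000, messages_per_user=200,
                         folders_per_user=4, metadata_files=3)
density = bytes_per_inode(100 * 2**30, needed)  # hypothetical 100 GiB partition
print(f"inodes needed: {needed}, max bytes per inode: {density}")
```

The second number is the one to compare against the bytes-per-inode density chosen when the filesystem is created: if the filesystem default is larger than this value, the partition can run out of inodes before it runs out of space.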
The Front End
At first we fooled around with Perl/POP3 based clients (such as Endymion MailMan). The problem we ran into immediately was scalability. The best performance we could muster from MailMan was around 8-10 simultaneous users.
Secondly we poked around with TWIG, a fairly full-featured IMAP client. Problems we ran into with TWIG (at that time, it was still in the 1.x tree) included some basic flaws in its PHP architecture. In order to use the TWIG core to brand our own webmail system, heavy modifications would have been needed to the code. Not to mention the fact that TWIG (at that time) made approximately three separate IMAP connections for nearly every operation.
We decided to roll our own webmail client as well, for more reasons than one. Primarily, it gave us total control over the interface, let us build our own value-added features, and meant we didn't have to worry about client licensing issues. Our client would be written in PHP, the same language as TWIG and IMP. As mentioned above, Perl obviously was out of the question.
I had read about some interesting mail routing ideas in the past and thought we could build upon them.
When e-mail comes in, this is the route it takes. DNS lists the mail router as the MX server for the domain. The message hits the mail router, which looks at who the message is for and does an LDAP lookup on the address to see whether there is an accepting user for it and where to forward the mail. Postfix forwards the message to the proper mail storage server, which hands it off to Cyrus, which does another lookup to determine which user should receive the message.
When a user wants to log in from the webmail interface, the interface takes the username and does an LDAP lookup, then sends the login and password to the correct IMAP server. That server does another LDAP lookup to make sure the name and password are correct, and then hands back a connection pointer to the PHP client.
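The routing decision made by the mail router can be sketched in a few lines of Python. This is only an illustration of the idea, not our actual configuration: a plain dictionary stands in for the LDAP directory, and the addresses, host names, and function names are all hypothetical. Lowercasing the whole address before the lookup is also an assumption of this sketch.

```python
from typing import Dict, Optional

# Hypothetical directory: recipient address -> mail storage server.
# In the design described above, this would be an LDAP lookup.
DIRECTORY: Dict[str, str] = {
    "alice@example.org": "store1.example.org",
    "bob@example.org": "store2.example.org",
}


def route(recipient: str, directory: Dict[str, str]) -> Optional[str]:
    """Return the storage server for a recipient, or None if unknown."""
    return directory.get(recipient.strip().lower())


def handle_incoming(recipient: str, directory: Dict[str, str]) -> str:
    """Decide what the mail router does with one incoming message."""
    server = route(recipient, directory)
    if server is None:
        # No accepting user for this address.
        return "reject: no such user"
    # The storage server does its own lookup before final delivery.
    return f"forward to {server}"


print(handle_incoming("Alice@Example.org", DIRECTORY))
print(handle_incoming("nobody@example.org", DIRECTORY))
```

The point of the split is that the router only needs to know where a mailbox lives, while the storage server remains responsible for the final delivery decision.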
After some internal testing, we realized that some things needed changing.
First was the LDAP authentication method. We found, through testing with TWIG, that OpenLDAP does not like lots of traffic. Even on the dual Xeon 450 server it was hosted on, it could handle 5-10 requests per second tops. Many times while using TWIG, we noticed that page generation would momentarily freeze while the LDAP server completed the query. This was absolutely unacceptable as the delays sometimes lasted up to 15 seconds (or longer).
We were aware of MySQL patches for both Cyrus and Postfix before we even chose an LDAP solution. Unfortunately, at the time, we were virtually ordered by management specifically not to use MySQL. A good lesson that we learned at this time was that ill-informed management types generally forget everything after a period of seven days. We made the switch and asked questions later after the urgency of not using MySQL was forgotten.
LDAP was now totally out of the picture, completely replaced by MySQL. Performance thus far has been absolutely stellar.
When we finally went into production the prognosis was pretty good. I, along with a few others, had switched to using the new mail system for all our corporate e-mail. We would use IMAP on the internal network and the web interface from remote locations (we blocked access to the IMAP server from outside the LAN for obvious reasons).
The web interface that my friend was developing came along nicely. It is modular and very fast. Server stress testing has shown that we have CPU capacity for several hundred simultaneous users.
One of the modifications that we made to Postfix was using MySQL alias tables for mail delivery for multiple domains. We made this change because of the way Postfix handles remote delivery tables. For each incoming message, Postfix would run up to seven separate database queries. This is unacceptable in a high-traffic mail environment. We want one query and one query only. By searching against the whole destination e-mail address (rather than just the username part of the address), we can accommodate any number of domains and duplicates in username space and only use one database query.
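A minimal sketch of the one-query idea follows. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere, and the table name, column names, and addresses are hypothetical; this is not the Postfix patch itself, only an illustration of keying the table on the full address.

```python
import sqlite3
from typing import Optional


def build_alias_table() -> sqlite3.Connection:
    """Create an in-memory alias table keyed on the full address."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE aliases ("
        " address TEXT PRIMARY KEY,"   # whole address, not just the username
        " destination TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO aliases (address, destination) VALUES (?, ?)",
        [
            # Same username in two domains, two different mailboxes.
            ("jim@example.org", "box1001@store1.example.org"),
            ("jim@example.net", "box2002@store2.example.org"),
        ],
    )
    return conn


def lookup(conn: sqlite3.Connection, address: str) -> Optional[str]:
    """Resolve a destination with exactly one query."""
    row = conn.execute(
        "SELECT destination FROM aliases WHERE address = ?",
        (address.lower(),),
    ).fetchone()
    return row[0] if row else None


conn = build_alias_table()
print(lookup(conn, "jim@example.org"))
print(lookup(conn, "jim@example.net"))
```

Because the primary key is the whole address, the same username can exist in any number of domains without collisions, and the index on that key keeps the single lookup cheap.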
Mail accounts and their respective incoming addresses are abstracted to a box number and mail server combination. For example, let us say that we have two mail addresses: firstname.lastname@example.org and email@example.com.
firstname.lastname@example.org -> email@example.com
firstname.lastname@example.org -> email@example.com
When a user logs in with their username, password, and domain information, our webmail interface's authentication system pulls their abstracted mailbox id and server location from the database and logs them in. Totally, absolutely transparent: the user is oblivious to the underlying methodology. Their mail could be stored in a server in Jakarta for all they are concerned.
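The login step can be sketched roughly as follows. Our real client is written in PHP and talks to MySQL; this Python version uses sqlite3 and imaplib purely for illustration, and the table layout, box ids, and host names are hypothetical. The open_mailbox function shows where the IMAP login would happen, but it is not called here because it needs a live server.

```python
import imaplib
import sqlite3
from typing import Optional, Tuple


def build_accounts() -> sqlite3.Connection:
    """Create an in-memory table mapping logins to box id and server."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " username TEXT, domain TEXT, box_id TEXT, server TEXT,"
        " PRIMARY KEY (username, domain))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?, ?, ?)",
        [
            ("jim", "example.org", "box1001", "store1.example.org"),
            ("jim", "example.net", "box2002", "store2.example.org"),
        ],
    )
    return conn


def resolve(conn: sqlite3.Connection, username: str,
            domain: str) -> Optional[Tuple[str, str]]:
    """Return (box_id, server) for a login, or None if unknown."""
    row = conn.execute(
        "SELECT box_id, server FROM accounts"
        " WHERE username = ? AND domain = ?",
        (username.lower(), domain.lower()),
    ).fetchone()
    return (row[0], row[1]) if row else None


def open_mailbox(conn: sqlite3.Connection, username: str, domain: str,
                 password: str) -> imaplib.IMAP4:
    """Log in to the storage server that holds this user's mailbox."""
    target = resolve(conn, username, domain)
    if target is None:
        raise LookupError("unknown account")
    box_id, server = target
    imap = imaplib.IMAP4(server)    # requires a reachable IMAP server
    imap.login(box_id, password)    # the server verifies the password
    return imap


accounts = build_accounts()
print(resolve(accounts, "jim", "example.org"))
```

The user only ever types a username, domain, and password; the box id and the server name never leave the back end, which is what makes it possible to move a mailbox to another server by updating one row.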
Preconceived Notions and Conclusion
The only part of our system that I can find a flaw in is the MySQL database itself. While I have complete faith in MySQL to handle our level of traffic, it is the only monolithic part of this architecture. Eventually, as millions of new e-mail boxes are added, the need for more and more "big iron" servers to feed the MySQL server will become apparent. However, these changes are somewhat trivial in the grand schema of things.
Please note that I wrote this article not to give people a step-by-step guide on how to create a webmail system. Instead, I wanted to give people an idea of the concepts that we used to create ours so that they can build upon them for their own needs. No system is ever done; there is always a way to improve it. Hopefully someone can take what we did, improve it, and then tell the world, so that we may learn from them as well and get hints on how to improve our own system.
If anyone has any comments about this system, please contact us at firstname.lastname@example.org
Jim O'Gorman (email@example.com)
Robert Bradman (firstname.lastname@example.org)