Adventures in Building a Scalable Mail System on FreeBSD

Contributed by Dengue on 2000-02-02 from the jimmy-o'gorman-strikes-again dept.

James O'Gorman dropped me a line earlier today. He must have sensed I was in need of some material for the new site, and kindly sent an article detailing the creation of a production web-based messaging system using IMAP, PHP, and FreeBSD.

In this paper I am going to write about my experiences with building a mail system on FreeBSD. I hope to cover what our original goal was, mistakes we made on the way, problems we ran into, and what the end result was. I will cover the tools we used and why we chose them. This project was not my work alone: I worked with a friend and co-worker who helped me along the way and built much of the system himself.

The Objective

My employer is a regional community site providing local news to the Omaha area. Ever since the beginning, the plan had been to offer free web-based e-mail sometime in the future. No one at the company had ever done anything similar before, and neither had I. After I heard about these plans and volunteered for the challenge, the task landed in my lap. My first thought was to look into purchasing a commercial software package to accomplish the goal. Our organization is incessantly obsessed with spending money, and never before had the gray hairs considered using an open source solution to fix their problem. After some investigation of commercial services, we learned that, for the number of accounts we potentially required, we would be charged upwards of $100,000, so we decided to push towards rolling our own system.

We needed to support 50,000 users out of the gate and have virtually unlimited scalability.

The Back Side

Once the decision was made to grow our own solution, we just had to figure out how to do it.

One incredibly important detail was for us to stay away from requiring UNIX accounts for all of our users, reasons for which should be all too obvious. So with that in mind I looked for mail servers that supported alternative authentication methods.

After some digging around in the ports tree, Freshmeat, and Usenet we came up with a decent plan. We would use Postfix as our MTA, Cyrus as the mail server, and OpenLDAP for LDAP authentication.

We chose the Postfix and Cyrus combination for a few reasons. First, there were modifications out for them that would allow you to authenticate off of LDAP. Secondly, from the information that we read, they appeared to be pretty nice programs with quite a following. (MTAs are pretty much a matter of religion, and I am a Postfix zealot. I am not really interested in arguing MTAs with anyone, but if you are thinking about choosing an MTA, give Postfix a good hard look.) Cyrus fit like a glove. It has some very nice features on the administrative side and its mailbox directory structure is interesting. Because mail is stored in a quasi-maildir format, mail retrieval and indexing are very fast and not prone to corruption. Cyrus also comes with a number of great "repair" utilities. The only potential problem is inode allocation: each message is stored as its own file, so treat your Cyrus partitions as you would a Usenet news storage partition!

The Front End

At first we fooled around with Perl/POP3 based clients (such as Endymion MailMan). The problem we ran into immediately was scalability. The best performance we could muster from MailMan was around 8-10 simultaneous users.

Secondly we poked around with TWIG, a fairly full-featured IMAP client. Problems we ran into with TWIG (at that time, it was still in the 1.x tree) included some basic flaws in its PHP architecture. In order to use the TWIG core to brand our own webmail system, heavy modifications would have been needed to the code. Not to mention the fact that TWIG (at that time) made approximately three separate IMAP connections for nearly every operation.

Lastly, we hit upon the IMP project from Horde. IMP had very similar features to TWIG but required very crazy JavaScript, a 4.x browser, and once again required a PhD in PHP to make any worthwhile interface changes.

We decided to roll our own webmail client as well, for more reasons than one. Primarily, it gave us total control over the interface, let us build our own value-added features, and spared us from worrying about client licensing issues. Our client would be written in PHP, the same language as TWIG and IMP. As mentioned above, Perl obviously was out of the question.

Mail Routing

I had read about some interesting mail routing ideas in the past and thought we could build upon them.

When e-mail comes in, this is the route it takes. DNS lists the mail router as the MX server for the domain. The message hits the mail router, which looks at who the message is for and does an LDAP lookup on the address to see whether there is an accepting user for that address and, if so, where to forward the mail. Postfix then forwards it to the proper mail storage server, which hands it off to Cyrus, which does another lookup to determine which user to hand the message off to.
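
As a rough illustration of the routing decision described above, here is a minimal Python sketch. It uses a plain in-memory dictionary in place of the LDAP lookup; the function names, the directory contents, and the host names are all hypothetical and are not taken from the actual Postfix configuration.

```python
from typing import Optional

# Hypothetical directory: recipient address -> mail storage server.
# In the system described above, this lookup was an LDAP query.
DIRECTORY = {
    "alice@example.org": "store1.example.net",
    "bob@example.org": "store2.example.net",
}


def route_message(recipient: str, directory: dict) -> Optional[str]:
    """Return the storage server for recipient, or None if nobody accepts it."""
    # Addresses are compared case-insensitively in this sketch.
    return directory.get(recipient.strip().lower())


def handle_incoming(recipient: str, directory: dict) -> str:
    """Describe what the mail router would do with a message."""
    server = route_message(recipient, directory)
    if server is None:
        return "reject: no such user " + recipient
    return "forward to " + server
```

The point of the sketch is only that the router makes one decision per message: either it knows which storage server owns the address, or it refuses the mail at the edge.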

When a user wants to log in through the webmail interface, the interface takes the username and does an LDAP lookup, then sends the login and password to the correct IMAP server, which does another LDAP lookup to make sure the name and password are correct and then hands back a connection pointer to the PHP client.

Things Considered

After some internal testing, we realized that some things needed changing.

First was the LDAP authentication method. We found, through testing with TWIG, that OpenLDAP does not like lots of traffic. Even on the dual Xeon 450 server it was hosted on, it could handle 5-10 requests per second tops. Many times while using TWIG, we noticed that page generation would momentarily freeze while the LDAP server completed the query. This was absolutely unacceptable as the delays sometimes lasted up to 15 seconds (or longer).

We were aware of MySQL patches for both Cyrus and Postfix before we even chose an LDAP solution. Unfortunately, at the time, we were virtually ordered by management specifically not to use MySQL. A good lesson that we learned at this time was that ill-informed management types generally forget everything after a period of seven days. We made the switch and asked questions later after the urgency of not using MySQL was forgotten.

LDAP was now totally out of the picture, completely replaced by MySQL. Performance thus far has been absolutely stellar.

Production

When we finally went into production, the prognosis was pretty good. I, along with a few others, had switched to using the new mail system for all our corporate e-mail. We would use IMAP on the internal network and the web interface when remote (we blocked access to the IMAP server from outside the LAN for obvious reasons).

The web interface that my friend was developing came along nicely. It is modular and very fast. Server stress testing has shown that we have CPU capacity for several hundred simultaneous users.

One of the modifications that we made to Postfix was using MySQL alias tables for mail delivery for multiple domains. We made this change because of the way Postfix handles remote delivery tables: for each incoming message, Postfix would run up to seven separate database queries. This is unacceptable in a high-traffic mail environment. We want one query and one query only. By searching against the whole destination e-mail address (rather than just the username part of the address), we can accommodate any number of domains and duplicate usernames across domains while using only one database query.
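
To make the one-query idea concrete, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL. The table name `aliases`, its column names, and the `lookup_destination` helper are assumptions made for this example; they are not the actual schema or the Postfix patch. The sample rows reuse the example addresses given later in this article.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL alias table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE aliases (address TEXT PRIMARY KEY, destination TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO aliases (address, destination) VALUES (?, ?)",
    [
        ("bob@domainone.org", "box47291@mailservice202.domain.com"),
        ("bob@domaintwo.org", "box24992@mailservice762.domain.com"),
    ],
)


def lookup_destination(db, address):
    """Resolve a full e-mail address with exactly one query."""
    row = db.execute(
        "SELECT destination FROM aliases WHERE address = ?", (address,)
    ).fetchone()
    return row[0] if row else None
```

Because the key is the full address, the two "bob" accounts never collide, and an unknown address simply returns nothing.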

Mail accounts and their respective incoming addresses are abstracted to a box number and mail server combination. For example, let us say that we have two mail addresses: bob@domainone.org and bob@domaintwo.org.

bob@domainone.org -> box47291@mailservice202.domain.com

bob@domaintwo.org -> box24992@mailservice762.domain.com

When a user logs in with their username, password, and domain information, our webmail interface's authentication system pulls their abstracted mailbox id and server location from the database and logs them in. Totally, absolutely transparent: the user is oblivious to the underlying methodology. Their mail could be stored in a server in Jakarta for all they are concerned.
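
The lookup step of that login can be sketched as follows. This is only an illustration under stated assumptions: the `ACCOUNTS` table, the `Mailbox` type, and both function names are hypothetical, and using the box number as the IMAP login name is a guess rather than something the system is known to do. The `open_mailbox` helper relies on Python's standard imaplib and needs a reachable IMAP server to actually run.

```python
import imaplib
from typing import NamedTuple, Optional


class Mailbox(NamedTuple):
    box_id: str
    server: str


# Hypothetical account table keyed by (username, domain).
# In the system described above, this data lived in MySQL.
ACCOUNTS = {
    ("bob", "domainone.org"): Mailbox("box47291", "mailservice202.domain.com"),
    ("bob", "domaintwo.org"): Mailbox("box24992", "mailservice762.domain.com"),
}


def resolve_mailbox(username: str, domain: str) -> Optional[Mailbox]:
    """Map what the user typed to the abstracted box id and server."""
    return ACCOUNTS.get((username, domain))


def open_mailbox(username: str, domain: str, password: str) -> imaplib.IMAP4:
    """Connect to the user's storage server (assumes box id is the IMAP login)."""
    mailbox = resolve_mailbox(username, domain)
    if mailbox is None:
        raise LookupError("unknown account: %s@%s" % (username, domain))
    connection = imaplib.IMAP4(mailbox.server)
    # The IMAP server, not the webmail client, verifies the password.
    connection.login(mailbox.box_id, password)
    return connection
```

The user only ever types a username, domain, and password; the box number and server name stay internal, which is what makes it possible to move a mailbox between servers without the user noticing.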

Preconceived Notions and Conclusion

The only part of our system that I can find a flaw in is the MySQL database itself. While I have complete faith in MySQL to handle our level of traffic, it is the only monolithic part of this architecture. Eventually, as millions of new e-mail boxes are added, the MySQL server will need more and more "big iron" hardware to keep up. However, these changes are somewhat trivial in the grand scheme of things.

Please note that I did not write this article to give people a step-by-step guide on how to create a webmail system. Instead, I wanted to give people an idea of the concepts that we used to create ours so that they can build upon them for their own needs. No system is ever done; there is always a way to improve upon it. Hopefully someone can take what we did, improve it, and then tell the world, so that we may learn from them as well and get hints on how to improve our own system.

If anyone has any comments about this system, please contact us at mailsystem@omaha.com

Written by:

Jim O'Gorman (jameso@elwood.net)
Robert Bradman (rbradman@omaha.com)


Comments

By James A. Mutter (jmutter at ds dot net) on 2000-02-02 06:36 http://daily.daemonnews.org/

I liked the article but I'm curious, what type of hardware did you need to throw at this system to make it work? Specifically, what are the physical requirements to make this happen?
1. By jacob (212.49.82.59) on 2006-04-26 15:52
  
  Why talk about an LDAP Postfix installation when you end up not using LDAP? Drop it.

