OpenBSD Journal

Setting up an anti-spam email gateway with OpenBSD

Contributed by jose from the protecting-the-masses dept.

Scott Vintinner writes: "I've written a fairly detailed guide that describes how to set up an anti-spam email gateway on OpenBSD 3.2 using the well-known SpamAssassin open-source product. The email gateway I describe is designed to be a company's SMTP front door to the internet, protecting a back-end Exchange or Lotus Notes server from direct internet access, while at the same time providing anti-spam features. The anti-spam features include the SpamAssassin text and statistical analysis as well as the online SPAM database checks provided by Razor and DCC.

The guide is available here: http://lawmonkey.org/anti-spam.html " Thanks for the writeup, Scott, this looks pretty useful!



Comments
  1. By Xenotrope () on

    Something I noticed while reading this article was the following:

    ...when we were sure that our false positive rate had been reduced as low as it could go, we switched amavisd over to Bounce mode. In this mode, the system rejects any message it diagnoses as spam. The sender of the message receives a nice email from the system telling them why their message was rejected (including the SpamAssassin score distribution) and how they can get on our whitelist.

    This is a phenomenally bad idea. While the idea of bouncing spam back to the spammer is full of a karmic satisfaction, it does not work in practice. Spammers can merely forge their return address. This alone would be bad enough to warrant scrapping the "bounce your spam" model.

    But it gets worse. The spammers who don't forge their mail envelopes get your bounce message, which you have deliberately tailored to tell them exactly who returned it, how they can get on your whitelist, and worst of all: instructions on how to fool your gateway next time. When you send them the SpamAssassin report, you are flat-out telling them "This message got flagged because you sent an HTML-only document containing the words 'money' and 'no-interest mortgage'."

    What is any decent spammer going to do? He's going to fix his message, making it incrementally harder for your e-mail system, and in turn everyone else's, to detect him the next time he sends out his junk.

    Comments
    1. By Nate () nate@my-balls.com on mailto:nate@my-balls.com

      No, because spamd accesses a list of known spam e-mails. This list is ever growing, and therefore as soon as it is submitted, others will not get the harder spam.

      The spam gets significantly reduced particularly because of the varying checks performed.

      Comments
      1. By krh () on

        Nobody is able to successfully spam more than a couple times from a single email address or IP: It'll get blocked. Eventually the spammer will switch to another email address or IP. And using the feedback he's received, he'll be able to tailor his message to make it through more spam filters.

        There's another problem with bouncing these messages. I have a friend who was the unfortunate victim of a spammer: The spammer used my friend's personal domain name in the envelope sender for a spam. My friend has the unusual configuration that he receives everything sent to his domain in his inbox (because he's the only person there, and it lets him create new email addresses without actually doing anything). So my friend got hundreds and hundreds of bounces from non-working email addresses. It would've been worse if he'd gotten bounces from working ones, too.

        Comments
        1. By Xenotrope () on

          The same thing happened to a friend of mine. He had a short e-mail address of seemingly random characters, like "djci@domain". It only lasted a couple days, but since he didn't admin the mail server, he couldn't filter out the bounce messages.

          Being able to create dozens of disposable email addresses seems like a really easy stopgap solution: forward "you-gibberish" to "you". When you start getting spam addressed to "you-gibberish", you know that the address you've used there has been farmed, and can take it out of /etc/aliases. The problem is that you have to have the technical proficiency to run your own mail server and maintain it, which most people aren't able to do.
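          The disposable-address idea can be sketched in a few lines of shell, assuming a sendmail-style /etc/aliases where each entry reads "alias: target". The helper names make_alias and drop_alias are hypothetical, and the tag "gibberish" stands in for whatever label you pick per site. This is only an illustration, not a tested setup.

```shell
#!/bin/sh
# Hypothetical helper: print an /etc/aliases line that forwards
# "<user>-<tag>" to "<user>".
make_alias() {
    user=$1
    tag=$2
    printf '%s-%s: %s\n' "$user" "$tag" "$user"
}

# Hypothetical helper: read an aliases file on stdin and print it
# without the entry for the given (farmed) alias.
drop_alias() {
    alias_name=$1
    grep -v "^${alias_name}:"
}

# Example: the line you would add for a new throwaway address.
make_alias you gibberish
```

          After changing /etc/aliases by hand, the alias database normally has to be rebuilt with newaliases before the change takes effect.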

    2. By Scott Vintinner () forge@flakshack.com on http://www.flakshack.com

      This is a phenomenally bad idea. While the idea of bouncing spam back to the spammer is full of a karmic satisfaction, it does not work in practice. Spammers can merely forge their return address. This alone would be bad enough to warrant scrapping the "bounce your spam" model.

      Yes, I recognize that spammers forge their return address. I run a process that automatically deletes undeliverable bounce messages from the queue after their first failed delivery attempt. (I just added a description of how to do this to the directions today.)

      As for your concerns that I'm teaching spammers anything, I think you're being naive. As you pointed out yourself, most spammers forge their addresses so they won't get my bounce message in the first place. Aside from that, you should keep in mind that SpamAssassin has been open source for years and their list of tests is freely available to any spammer that wants to set up their own system to test against.

    3. By Scott Vintinner () forge@flakshack.com on http://www.flakshack.com

      By the way, the SpamAssassin report included with my bounce message doesn't give the spammer any indication of how to get past the Bayes rules (which are customized differently for every site) or the DCC or Razor spam database checks. Additionally, if they do manage to get by the system, my directions include a method for my users to forward this rogue message back to SA to be added to the Bayes database.

  2. By Mark Beihoffer () on http://www.dragonfly-7.com

    My company really could use something like this; your article is one of the most thorough, well-written how-to guides I've seen for a long time.

    Keep up the great work!

  3. By Ron Rosson () insane@oneinsane.net on http://www.oneinsane.net/~insane

    Razor seems to have load/connectivity issues. When it fails to connect, it passes the mail along without any further processing, and it also slows down delivery.

    Comments
    1. By george () nobody@example.net on mailto:nobody@example.net

      I use razor from my ~/.procmailrc. To overcome the problems that you are pointing out, the next thing that I do is to have SpamBouncer run after razor. This has minimised the incoming spam to my INBOX.

  4. By jose () on http://monkey.org/~jose/

    i noticed two things wrong with this article. the first is that the author doesn't install postfix (or several other tools) from ports, which would make postfix installation a snap (via postfix-enable) and upgrading a breeze.

    the second thing is running rdate from cron. just use the system's support for running ntpd, which always keeps the clocks within a few seconds of each other. good tips in this story we ran previously http://www.deadly.org/article.php3?sid=20030105011352

    overall not bad, but ... use the ports tree. we built it for a reason.

    Comments
    1. By Scott Vintinner () forge@flakshack.com on http://www.flakshack.com

      Thanks jose. I actually run ntpd on my mail servers, and advised the reader that it was more accurate, but didn't want to go off on a tangent to explain how to set it up. I thought I made that clear, but maybe not.

      Also, what about concerns that ntpd is another open port on a system we need to be secure (even with the ntp security settings)? My impression was that only the base install had undergone Theo's security audit, so adding other software presented a security risk. Am I just too paranoid?

      As for the ports problem, please correct me where I am wrong. In my directions, I include a line in the ports section that says

      The limitation of the ports collection is that it does not include the latest and greatest versions of all the software, so while we will use it to install a bunch of utilities, we'll install the main programs directly from the source.

      I downloaded the ports tree for the OpenBSD_3_2 patch branch using CVS. When I run a make, it appears to be installing postfix 1.1.11, not 2.0.9 (the latest version). For example:

      # cd /usr/ports/mail/postfix/
      # make
      ===> mail/postfix/stable
      ===> Checking files for postfix-1.1.11
      >> postfix-1.1.11.tar.gz doesn't seem to exist on this system.

      This aspect has always confused me about the ports system. Should I have installed the -current version of the ports tree, or is that meant for OpenBSD 3.3? I've had problems in the past with mismatched ports, and was afraid of seeing them again. In any case, if you would enlighten me, I'd be happy to update my directions. As you can see from other parts of my directions, I used the ports tree to install several other parts of the system.

      Finally, in my defense, the postfix install procedure I describe is actually quite easy.

      Comments
      1. By Anonymous Coward () on

        One thing is that I believe postfix runs chroot if made out of the ports, which is a nice thing.

        Comments
        1. By Scott Vintinner () forge@flakshack.com on http://www.flakshack.com

          My directions include chrooting Postfix (and chrooting Amavisd-new, SpamAssassin, DCC and Razor).

      2. By UZbad () on

        Ditto this question. I've never understood the OpenBSD ports tree. AFAIK the FreeBSD (which I have more experience with) ports tree is the same for stable and current.

        I've been updating my ports tree to CURRENT on my OpenBSD box running 3.2, but I only use a couple of the ports so I'm not sure I'd run into problems.

        I'd love to get an answer to this.

        thanks for the article also, very interesting.

  5. By Anonymous Coward () on

    Bouncing is problematic since spammers rarely use their own return address. You're essentially sending bounce messages to people who can't do anything about it anyway, so why bother them?

    I believe it's far more productive - and ethical - to be very strict about requiring sites to follow the appropriate RFCs, and bouncing messages with a 4xx code when in doubt. If they don't properly identify themselves in the HELO message, or don't provide a valid address in the envelope, brush them off with a 4xx. If it's due to a transient DNS error, the problem should resolve itself when they retry.

    Toss in some common sense rules - slam the door on anyone claiming to be my own domain or IP address, or who claims to be coming from 192.168.1.1 or a similar reserved IP address, etc., and it's enough to block about 60-70% of the mail traffic before the DATA command.
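    The HELO rules described above can be sketched as a small shell function. This only illustrates the decision logic and is not how any particular MTA implements it. The function name helo_verdict, the domain example.com and the address 192.0.2.25 are placeholders. "defer" stands for a 4xx reply and "reject" for slamming the door.

```shell
#!/bin/sh
# Assumed local identity (placeholders); replace with your own values.
MY_DOMAIN=example.com
MY_IP=192.0.2.25

# Print "reject", "defer", or "ok" for a HELO argument.
helo_verdict() {
    # Strip the brackets of an address literal such as [192.168.1.1].
    helo=${1#\[}
    helo=${helo%\]}

    # Nobody on the outside should claim to be us.
    case $helo in
        "$MY_DOMAIN"|*."$MY_DOMAIN"|"$MY_IP")
            echo reject
            return ;;
    esac

    case $helo in
        *[!0-9.]*)
            # Contains something other than digits and dots: a host name.
            case $helo in
                *.*) echo ok ;;      # looks fully qualified
                *)   echo defer ;;   # unqualified name, brush off with 4xx
            esac ;;
        10.*|192.168.*|172.1[6-9].*|172.2[0-9].*|172.3[01].*)
            # RFC 1918 private ranges do not belong on the public internet.
            echo reject ;;
        "")
            echo defer ;;            # no identification at all
        *)
            echo ok ;;
    esac
}

helo_verdict '[192.168.1.1]'
```

    The soft "defer" answer is kept for cases that could be honest mistakes or transient problems, while claims that can only be forgeries get the hard answer.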

    One final rule is to reject mail from AOL, MSN, Hotmail, Yahoo, etc. if the sending IP can't be resolved back to those domains. The small fry may usually be hosted by somebody else, but these major companies are not going to be sending out mail from some dialup or cable modem address!

    Some spam still gets through, but at that point I think it's better to just toss it into some short-lived "spam" filter than bouncing it. I would prefer to reject it with a 5xx code, but by the time the spam filter runs the SMTP connection has been dropped.

    The downside to this is that you have to periodically scan the rejection list and manually maintain an exception list. But this usually doesn't take long - it's usually safe to ignore the 1- or 2-off attempts, and the remaining senders can often clearly be identified as spam. The rest can be provisionally accepted.

  6. By Scott Vintinner () forge@flakshack.com on http://www.flakshack.com

    Just to defend the bouncing position a little... First of all, as you said, most spammers forge their address, so the bounce never reaches them. The only person that generally receives the bounce message is the rare client that accidentally triggered a false positive. Despite what you may have assumed, the undeliverable "bounce" messages are handled by a script that cleans up the deferred mail queue. The end result is a system that works very well and requires NO effort by my users.

    At a company where senior partners bill $400/hour, SPAM costs us money. Some of our partners receive an average of 400 spam messages a day. If they waste 10 minutes per day, even going through a separate SPAM folder, that's about $66 per user per day. Add that up and you're talking really big money. By placing the burden of determining what is or is not SPAM back onto the sender, we save lots of money.
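    The arithmetic behind that figure, as a small sketch. The function name spam_cost_per_day is hypothetical, and shell integer arithmetic truncates, so $66.67 comes out as 66.

```shell
#!/bin/sh
# Cost of time lost to spam per person per day, in whole dollars.
spam_cost_per_day() {
    rate_per_hour=$1    # billing rate in dollars per hour
    minutes_lost=$2     # minutes spent dealing with spam per day
    echo $(( $rate_per_hour * $minutes_lost / 60 ))
}

# $400/hour and 10 minutes a day, the figures quoted above.
spam_cost_per_day 400 10
```

    Multiplying the result by the number of affected users and working days gives the firm-wide figure.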

    As for the recommendation about RFCs: at my company, annoying or troubling clients is not something we want to do. In my experience, this is what bouncing messages due to RFC restrictions (and even due to RBLs) does. Not only do clients get the impression that we're blocking their mail just to be annoying (since they can send everyone else mail just fine), but our mail administrators have to spend half a week teaching other mail administrators how to correctly set up their system. This may be fine when you host your own personal mail system, but is a much different matter when your mail system deals with 75,000 messages a day or more. The truth of the matter is that most people out there can barely even set up an MX record, let alone set up everything else correctly (PTR, A, anti-relay, etc.). When we initially used an RBL, one of our biggest clients (a Fortune 500) actually refused to fix their open-relay server, even after I spent half a week explaining the problem to them. Do I wish I could use the RFC restrictions? Hell yeah! Can I use them in practice? Unfortunately no.

    From the corporate perspective, bouncing SPAM is a win-win situation. My users have their spam blocked without having to do anything...not even monitor a spam folder! The very rare client that accidentally triggers the spam rules can reply or call us to get on our white list. The server isn't overloaded by bounce messages because we clean up the ones that can't be delivered. I just don't see why you wouldn't want to do it this way.

    Comments
    1. By krh () on

      Given the situation you describe, bouncing messages makes sense. But I think a lot of us are afraid of losing mail, and so we don't want to bounce things unless we have an absolute guarantee that they're spam. Unfortunately a statistical analysis can only tell us that something is likely spam or likely not, and while it can be very good most of the time, it will always be possible to construct messages that confuse it. Speaking for myself as an individual user, I don't want to lose that message; but from your perspective as an administrator for a firm whose partners' time is extraordinarily valuable, it doesn't matter if you lose that message.

      I have the feeling, however, that people will become accustomed to lossy filtering of spam. I personally had a procmail script for many years which tried very hard to put spam into a "suspect" folder, and eventually I didn't mind the mistakes it made because, on average, it saved me so much time. Now you're making me consider bouncing mail. Hmm. I may come around to that, too.

    2. By Anonymous Coward () on

      With Postfix, and I assume other MTAs, you can and should put the whitelist in front of the RFC checking. My volume is low enough that I can check the rejection list manually, but even I have been debating automating it by allowing any site that attempts a couple of reconnections to get onto the gray list. Spamming software usually doesn't try more than once or twice, while real MTAs should retry several times a day for several days.
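      A rough sketch of that automation, assuming you can already extract one client IP per line for every soft-rejected connection attempt. How you extract them depends on your log format and is not shown. The function name retry_candidates and the threshold of 3 are placeholders.

```shell
#!/bin/sh
# Read one client IP per line (one line per deferred attempt) on stdin
# and print the IPs that were seen at least MIN_RETRIES times.
retry_candidates() {
    min_retries=${1:-3}
    sort | uniq -c | awk -v min="$min_retries" '$1 >= min { print $2 }'
}

# Made-up attempts: only the host that kept retrying qualifies.
printf '%s\n' 192.0.2.10 198.51.100.7 192.0.2.10 192.0.2.10 |
    retry_candidates 3
```

      The hosts it prints are only candidates for the gray list; a real MTA retrying over several days would show up here, while fire-and-forget spam software would not.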

      Another point you seem to have missed is that the RFC checking uses soft errors, not hard errors. They won't have any indication of a problem unless the mail isn't handled for something like 5 days, when the sender will time it out. That's plenty of time for your scripts to have noticed the connection attempts and take appropriate action.

      I understand that there are some really clueless admins out there, but there is absolutely no way any legitimate site will forge mail from my own domain or my own IP address (in the HELO line). That serves absolutely no purpose other than hiding the true identity of the sender, and deserves to be swatted down. Hard. Some of the other stuff, e.g., using RFC1918 addresses, is more problematic, but my experience is that the sites making this error have been easily identified as spammers.

      Finally, your "spam is expensive" argument is exactly why I say mail shouldn't be bounced. Let's say it costs $10 in lost billing for every bogus message your partners get. What happens when somebody forges their email address as the sender of the spam? With my approach, they probably won't get any mail unless somebody takes the time to write them directly. With your approach, they'll get hundreds if not thousands of bounce messages that they can't do anything about, but which they have to check because of the possibility that a legitimate bounce message is in there somewhere.

      This is almost a prisoner's dilemma situation, but not quite. There's a modest cost to you if you're alone in changing your practices, but there's a huge savings to you if this new approach becomes widespread.

  7. By Kevin () krich@vela.net on http://www.vela.net

    First let me say what a great article. I started building this from scratch and am using OpenBSD 3.3-release on i386. Everything was going smoothly until I tried running amavisd and then I get the following error:

    pyxis# /usr/local/sbin/amavisd
    ERROR: MISSING REQUIRED BASIC MODULES:
    auto::POSIX::setgid
    auto::POSIX::setuid
    BEGIN failed--compilation aborted at /usr/local/sbin/amavisd line 127.

    any ideas on where I can get the modules?

    please reply to krich@vela.net

    -kevin

    Comments
    1. By kp () on

      ditto on the quality of this article. i think it's fantastic, and the author really did lots of us a service.

      my amavisd error is the same as kevin's. i'm stuck in exactly the same way, and i'm also on the fresh 3.3 blowfish. anyone cracked this yet?

      Comments
      1. By Patryck () patryck.nospam@xs4all.nl on mailto:patryck.nospam@xs4all.nl

        Hi Scott,

        Let me first thank you for your great job, a real nice article! The lack of something like this kept me from implementing a Spam-filtering solution in our business. Anyway, using OpenBSD-3.3. there are some minor changes:

        1. According to the amavisd-new website, razor-2.22 seems to be b0rked, so i left it out. Here's what they say:

        If Mail::SpamAssassin is set to call Vipul's Razor 2.22, it fails because reading its config file (routine read_file in Razor2/Client/Config.pm) produces tainted values. You should apply the patch to Razor2/Client/Config.pm ... plus the patch2 by Vivek Khera along the same lines for Razor2/Client/Core.pm . To apply: cd to the directory /usr/lib/perl5/.../Razor2/Client/ and apply them from there: patch
        2. At the part "Installing DCC" some libs have been updated;

        /usr/lib/libc.so.28.5 becomes /usr/lib/libc.so.29
        /usr/lib/libm.so.0.1 becomes /usr/lib/libm.so.1.0

        3. OpenBSD 3.3 uses perl-5.8.0. This means you can no longer use the lines:

        auto::POSIX::setgid
        auto::POSIX::setuid

        leaving these out will solve the problem some people reported, getting an error like:

        pyxis# /usr/local/sbin/amavisd
        ERROR: MISSING REQUIRED BASIC MODULES:
        auto::POSIX::setgid
        auto::POSIX::setuid
        BEGIN failed--compilation aborted at /usr/local/sbin/amavisd line 127.

        Well, that's it! Hopefully you can update your howto, because it's a nice piece of work! If you want to make it based on 3.3, don't forget to adjust the cvsup-part. Just in case.

        Keep up the good work,

        Patryck.

  8. By Rossen Raykov () Rossen.Raykov@CognicaseUSA.com on mailto:Rossen.Raykov@CognicaseUSA.com

    #!/bin/sh
    # Delete deferred bounce messages (sender MAILER-DAEMON) whose content
    # SpamAssassin tagged as spam, so undeliverable bounces do not pile up.
    DEFERDIR=/var/spool/postfix/deferred

    # collect the queue IDs of messages sent by MAILER-DAEMON
    for DEFERFILE in `mailq | grep MAILER-DAEMON | cut -f1 -d ' '`
    do
        FILEPATH=`find "$DEFERDIR" -name "$DEFERFILE"`
        if [ -n "$FILEPATH" ] ; then
            # match a tagged header such as "X-Spam-Status: Yes, hits=12.3"
            if egrep -i 'X-Spam-Status: *Yes, *hits=[0-9]{1,2}\.[0-9]' "$FILEPATH" > /dev/null
            then
                # deferred message is most likely spam
                postsuper -d "$DEFERFILE" deferred
            fi
        fi
    done

  9. By Pete () prussell@mteliza.com.au on mailto:prussell@mteliza.com.au

    what a GREAT effort Scott to firstly write and then maintain this magnificent article. THANKS!

    I installed yesterday and today, starting with a fresh OpenBSD install.

    I can turn on and off the AV Code but the anti spam code i cannot switch on - i have checked and rechecked each and every line of your article and cannot get the spam code to be enabled - so the test message is sent and relayed to my SMTP server without being tested as spam.

    I have also removed the # from the content_filter line in main.cf for postfix.

    has anyone else experienced this or can suggest anything to test or check to get it working?

    Comments
    1. By Pete () on

      I have more detail at work - so i have added it here, in the hope it will help you help me. :)

      Hi - I have installed a fresh Open BSD 3.3 , amavis-new, postfix and spamassassin, following this guide http://lawmonkey.org/anti-spam.html

      Everything went well except for one thing, when i run /usr/local/sbin/amavisd debug the Anti Spam Code is listed as NOT LOADED.

      I have uncommented this line in /postfix/main.cf content_filter = smtp-amavis:[127.0.0.1]:10024

      I have commented and uncommented this line to turn virus checking on and off - which works as expected
      @bypass_virus_checks_acl = qw( . );

      and i have ensured that this line is commented out
      @bypass_spam_checks_acl = qw( . );

      I have checked every single entry and command in that guide to ensure i followed it accurately - does anyone have any suggestions on what to check to get the Spam detection turned ON ?

      Currently seems like mail is just relayed, not much else happening (i dont have AV installed yet)

      Kind regards and thanks in advance
      Pete

  10. By Scott Fraser () sfraser@mentoring.ws on mailto:sfraser@mentoring.ws

    First off, thanks for the white-paper. My question is, how hard is it to use this set-up for a standard mail server?
    ie: local delivery

    I have an OpenBSD 3.3 box with Postfix and ALL the recommended software downloaded and installed. I tried to configure it to do local delivery and, for the life of me, can't get it working.

    If someone has example files for a working system, I'd love a chance to look them over.

    Cheers,
    Scott

  11. By shad0w () on

    SpamAssassin has configuration options to modify mail body, but they seem to be ignored.

    * amavisd-new never modifies mail body or let SA do it. All mail (header) editing is done by amavisd-new and not by SA. Even though SA does observe options in its configuration file to rewrite mail body and modify mail header, the result is purposely not used by amavisd-new

    Comments
    1. By Neco () neco@c54.se on mailto:neco@c54.se

      I have the same issue... can anyone shed some light?

  12. By tom () on

    Are these scripts double-counting messages, counting each message once when received from the outside world and once again when received back from amavisd? I notice that EVERY recipient I have receives an even # of messages every day. This seems like way too much of a coincidence.

    Thanks, Scott, for the step-by-step; the system has been a big hit at my office.

  13. By Carsten () on

    Please mention in your manual that Berkeley DB and DB_File need to be installed for Bayes to work.

  14. By Lukas Frei () webmaster@nextron.ch on mailto:webmaster@nextron.ch

    Hello. thanks for your great document. i am very interested to set this up on our infrastructure. only one thing i did not really find upon first scan of the document: how is the DNS MX stuff handled? i assume this machine gets the primary mx. according to the /etc/postfix/transport-file i would in there put in each domain that it should forward email for? also in the /etc/postfix/main.cf somewhere?

    this, as we are a small isp and host about 800 domains. i would be most pleased with a general setting like 'forward all incoming mail to xxx.xxx.xxx.xxx', which would be the internal mailserver.

    is there a way to do this somehow? sorry, but i am not that much of a unix-system specialist to know these things... (setup would of course be done by somebody who knows... :)
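    One possible way to avoid typing 800 transport entries by hand is to generate them from a list of hosted domains. This is a sketch under assumptions: the address 192.0.2.10 is a placeholder for the internal mail server, and the function name make_transport is hypothetical. The "domain smtp:[host]" line format is the one used by Postfix transport tables.

```shell
#!/bin/sh
# Placeholder address of the internal mail server; substitute your own.
INTERNAL_MX=192.0.2.10

# Read one hosted domain per line on stdin and print a transport table
# entry for each, skipping blank lines and "#" comments.
make_transport() {
    while read -r domain; do
        case $domain in
            ""|\#*) continue ;;
        esac
        printf '%s\tsmtp:[%s]\n' "$domain" "$INTERNAL_MX"
    done
}

# Example run on two made-up domains.
printf '%s\n' example.com example.net | make_transport
```

    The output would go into /etc/postfix/transport, followed by running postmap on that file. The same domains presumably also have to be accepted for relaying in main.cf.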

    thanks a bunch for help and greets from switzerland!

