OpenBSD Journal
Passive Aggressive Spam Filtering
Contributed by sean on Sat Jul 18 03:41:46 2009 (GMT)
from the squashing-the-low-hanging-fruit dept.

Using OpenBSD and spamd for spam filtering and grey-listing is very old news, but there are a few situations where it becomes politically and technically challenging to run in production. Here is a simple method (and in no way the best one) of using PF and some friends on the Internet to help 'slow the flow' of offal from the Internet into your mail server.

The spamd application is fantastic. It does a great job of giving back to the spammers that annoy us. It does have some flaws, though, that in certain environments leave us looking for other solutions.

For instance, some web-mail services (like Gmail) rarely retry a delayed delivery from the same server, which makes it exceedingly difficult to white-list their networks. To make matters worse, they have so many outgoing MTA IPs that the addresses seem to change faster than you can collect them. The second nail in my 'default spamd' coffin is dealing with clients that use broken or 'I clicked till it worked' installations of Exchange services. I don't know how many times some random client got caught in the tar pit, leading to last-minute, out-of-band support calls (usually while I was indisposed and away from any form of SSH-capable device). One client in particular fiddled with their setup so often that I just white-listed the entire net block whois showed for them, just to make them shut up.

Needless to say, spam piled up at a ridiculous rate. I had to leave my mail client open 24x7 to filter the flood in my own mail account. At one point we were receiving 100,000+ pieces of spam per day across our low triple digit user base. The mail server couldn't hack it. Leaving your machine off for the weekend made your account useless on Monday morning and deus-ex-machina help you if you went on vacation.

Thunderbird's spam filter is crap but helped some users. Another help for the more 'popular' users was Mail Washer, but eventually we just gave up trying, secretly turned spamd back on, and waited for the complaints before we turned it off again.

Spamassassin wasn't an option, as the mail server was already over-subscribed: heavy internal mailing list use (mailman) and aggressive IMAP clients would render the machine almost unusable. The record load was 140, lasting for a good hour or so. Of course, as at most fiscally aggressive companies, budget for a more reasonable mail infrastructure that scales with the number of users we now have will only become available when the machine up and dies good and proper. The urge to do what we can with what we have is frustrating, but I love a good challenge.

We even flirted with the idea of using a different grey-lister solution that was less elegant than spamd but would allow for domain name based white listing and was essentially a front end filter to sendmail. That's not even addressing the rather high frequency of false positives from bulk-email services (Gmail, Yahoo Mail, Hotmail, etc.). I really didn't want to go down this road since I'm a fan of sticking with 'base' as aggressively as possible.

After a particularly frustrating Monday morning, and after fielding a question about available IP-based spam filtering for 'the Linux', a solution became all too clear. Not everything has to go through spamd (or through anything at all), and we don't need grey-listing. All we really need to do is collect various 'black and gray lists' from others and tar pit them (with periodic flushes). The other advantage here is that it can be extended to work on systems that don't have spamd or PF (or in conjunction with spamd... 'for extra win').

So I went searching for black and grey-lists, collected a nice little list of them, and then wrote a stupid script to pull them and pop them into a black-list pf table, which just drops all traffic from those hosts (or alternatively redirects them to the spamd tar pit). The results were mind blowing. No other form of spam filtering is being done, and my personal spam to ham ratio went from 1000:1 to 1:1000. I left this solution in place for a good while (about a year and a bit) without complaints (surprisingly few users noticed, given the earlier volume of complaints), and I no longer field those horrendous support calls or sit through random meetings with client IT personnel explaining why they can't send us mail.

The solution here is pretty simple. Use a normal spamd config but, instead of redirecting everything except spamd-white entries, send only what is in the black-list to spamd (or just drop it entirely). If something gets out of the tar pit... that's tolerable (loggable, and easy to monitor). I call it 'social passive aggressive filtering.'

Even if your filtering systems are not OpenBSD based the method stays the same.
  1. Gather your blacklists from around the net.
  2. Translate them into a format your filter/firewall can understand.
  3. Load them in.
  4. At regular intervals, flush the blacklist and go back to step 1.
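
Step 2 is usually the fiddly one, since every list publishes a slightly different layout. A minimal sketch of a normalizer follows, assuming IPv4-only lists with one entry per line. The function name normalize_list and its column argument are made up for illustration and are not part of spamd or PF.

```shell
#!/bin/sh
# Hypothetical helper (not part of spamd or PF): reduce a downloaded list to
# one IPv4 address or CIDR block per line. Reads stdin, writes stdout.
# Usage: normalize_list [column]    (column defaults to 1)
normalize_list() {
    col="${1:-1}"
    # sed:  strip '#' comments and turn stray semicolons into spaces
    # awk:  keep only the requested whitespace-separated column
    # grep: keep only entries shaped like a.b.c.d or a.b.c.d/len
    # sort: drop duplicates (C locale for a stable order)
    sed -e 's/#.*//' -e 's/;/ /g' \
        | awk -v c="$col" 'NF >= c { print $c }' \
        | grep -E '^[0-9]{1,3}(\.[0-9]{1,3}){3}(/[0-9]{1,2})?$' \
        | LC_ALL=C sort -u
}
```

A list that keeps the address in its second column would be piped through `normalize_list 2`. The shape check is deliberately loose (it does not verify that each octet is at most 255), so treat it as a filter against obvious junk rather than as validation.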
The configurations are pretty trivial for OpenBSD & PF. Note that this is my first attempt and the 'wrong way' to do it: it works, but it isn't the best solution (more on that below).
/etc/pf.conf
table <whitelist> persist file "/etc/spamd/whitelist.txt"
table <blacklist> persist file "/etc/spamd/traplist.txt"
set limit table-entries 200000

rdr pass from <blacklist> to port smtp -> localhost port spamd
#If you can't get away with tar-pitting the jerks just dump'em!
#pass quick from <whitelist> to port smtp
#block drop from <blacklist> to port smtp
/root/bin/update_traplist.sh
#!/bin/sh
# Don't care about dups, as the pf table add will select uniques.
LIST=/tmp/spamd-black.txt.tmp
# Start from an empty file so a previously failed run can't leave stale entries.
: > $LIST
# The .gz lists need decompressing; gzip -dcf passes already-plain text through unchanged.
lynx --source http://www.openbsd.org/spamd/traplist.gz | gzip -dcf >> $LIST
lynx --source http://www.openbsd.org/spamd/spews_list_level1.txt.gz | gzip -dcf >> $LIST
lynx --source http://www.openbsd.org/spamd/spews_list_level2.txt.gz | gzip -dcf >> $LIST
lynx --source http://www.openbsd.org/spamd/SBL.cidr.gz | gzip -dcf >> $LIST
lynx --source http://www.openbsd.org/spamd/chinacidr.txt.gz | gzip -dcf | awk '{print $1}' >> $LIST
lynx --source http://www.openbsd.org/spamd/koreacidr.txt.gz | gzip -dcf | awk '{print $1}' | sed 's/;/ /g' >> $LIST
# Column 2 is all we care about here.
lynx --source http://www.ix.de/nixspam/nixspam.blackmatches | awk '{print $2}' >> $LIST
# Sometimes stray semi-colons show up and bung up the works.
lynx --source http://www.bsdly.net/~peter/bsdly.net.traplist | sed 's/;/ /g' >> $LIST
lynx --source http://www.spamhaus.org/DROP/drop.lasso | sed 's/;/ /g' | awk '{print $1}' >> $LIST
mv $LIST /etc/spamd/traplist.txt
# Here's the magic!
pfctl -t blacklist -T replace -f /etc/spamd/traplist.txt 2>/dev/null
The periodic part is easy... just run this script from root's crontab!
0       */6     *       *       *       /root/bin/update_traplist.sh 2>/dev/null
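
One weakness of a script like the one above is that a failed download can leave a nearly empty file, and replacing the table with it would silently let everything through. A hedged sketch of a guard follows. install_list and its minimum-line threshold are hypothetical names and values, not anything from base.

```shell
#!/bin/sh
# Hypothetical helper: move NEW over DEST only if NEW has at least MIN lines.
# Usage: install_list NEW DEST [MIN]    (MIN defaults to 1)
install_list() {
    new="$1"
    dest="$2"
    min="${3:-1}"
    # wc pads its output with spaces on some systems, so strip them.
    lines=$(wc -l < "$new" | tr -d ' ')
    if [ "$lines" -lt "$min" ]; then
        echo "install_list: $new has only $lines lines, keeping $dest" >&2
        return 1
    fi
    mv "$new" "$dest"
}
```

In update_traplist.sh this would stand in for the plain mv, with the table replaced only when it succeeds, for example `install_list $LIST /etc/spamd/traplist.txt 1000 && pfctl -t blacklist -T replace -f /etc/spamd/traplist.txt`. The threshold of 1000 is an arbitrary guess and should be tuned to the size of the lists you actually pull.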

For the purposes of the article I tracked the number of blocked IP's on my busiest two mail servers and the following pretty pictures illustrate the variability of the spam hosts in the above list.
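
Numbers like these are cheap to collect. As a rough sketch, assuming the table is named blacklist as in the pf.conf above and that `date +%s` is available (it is widely supported but not strictly POSIX), a hypothetical log_table_size helper could record one sample per run:

```shell
#!/bin/sh
# Hypothetical helper: count the non-empty lines on stdin and print
# "<unix-time> <count>" on stdout.
log_table_size() {
    # grep -c exits non-zero when the count is 0, so ignore its status.
    count=$(grep -c . || true)
    printf '%s %s\n' "$(date +%s)" "$count"
}
```

Fed from `pfctl -t blacklist -T show` in a cron job and appended to a log file of your choosing, this yields a two-column series that any plotting tool can graph.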

This is by no means a good, clever or novel solution; it was just one that made a HUGE impact on the users I service and an even bigger impact on the abuse of our overloaded mail server. This is also a pretty old solution, but on the off chance someone hasn't come to this conclusion, here is something else to try.

I would love to say 'well, we don't want to work with people that can't fix their own mail server,' but that's not good for business. Note the emphasis here is on blocking the low-hanging fruit as seen by other, less constrained networks. Of course, it would be prudent to note that the lists you choose should come from systems you trust, and using white-lists to ensure some VIPs and your own networks are not black-holed is probably a good idea too.
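
One cheap way to act on that advice is to strip white-listed entries out of the combined list before it is ever loaded. A minimal sketch follows, assuming both files hold one address or CIDR block per line with no comments. drop_whitelisted is a hypothetical name, and the match is purely textual.

```shell
#!/bin/sh
# Hypothetical helper: print every blacklist entry that does not appear,
# character for character, in the whitelist.
# Usage: drop_whitelisted BLACKLIST_FILE WHITELIST_FILE
drop_whitelisted() {
    blacklist="$1"
    whitelist="$2"
    # An empty or missing whitelist means there is nothing to filter out.
    if [ ! -s "$whitelist" ]; then
        cat "$blacklist"
        return
    fi
    # -F fixed strings, -x whole-line match, -v invert, -f patterns from file.
    # grep exits 1 when every line was filtered out; treat that as success.
    grep -vxFf "$whitelist" "$blacklist" || [ $? -eq 1 ]
}
```

Because the comparison is textual, a white-listed host that sits inside a black-listed CIDR block is not caught. The commented-out `pass quick from <whitelist>` rule in the pf.conf above, placed ahead of the block rule, remains the real safety net for that case.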

Part 2... I'm an idiot.

After all this work and effort, a casual look at the man page shows that spamd in blacklisting mode does exactly all of this, and better. Insert face-palm here. The solution was right in front of my face the whole time, and I ended up over-engineering my own.

The 'right way' to solve this problem is very similar:

/etc/pf.conf
table <spamd-white> persist file "/etc/spamd/whitelist.txt"
table <spamd> persist
set limit table-entries 200000

rdr pass from <spamd> to port smtp -> localhost port spamd
#If you can't get away with tar-pitting the jerks just dump'em!
#pass quick from <spamd-white> to port smtp
#block drop from <spamd> to port smtp
/etc/rc.conf.local
# Run spamd in blacklisting mode only.
spamd_flags="-b"
/etc/mail/spamd.conf
all:\
        :spews1:spews2:bsdly:china:korea:spamhausDROP:nixspam:

nixspam:\
        :black:\
        :msg="Your address %A is in the nixspam list\n\
        See http://www.heise.de/ix/nixspam/dnsbl_en/ for details":\
        :method=http:\
        :file=www.openbsd.org/spamd/nixspam.gz:

bsdly:\
        :black:\
        :msg="SPAM. Your address %A is in the BSDLY traplist\n":\
        :method=http:\
        :file=www.bsdly.net/~peter/bsdly.net.traplist:

spamhausDROP:\
        :black:\
        :msg="SPAM. Your address %A is in the Spamhaus DROP List\n\
        See http://www.spamhaus.org/sbl and\
        http://www.abuse.net/sbl.phtml?IP=%A for more details":\
        :method=http:\
        :file=www.spamhaus.org/DROP/drop.lasso:

# Mirrored from http://www.spews.org/spews_list_level1.txt
spews1:\
        :black:\
        :msg="SPAM. Your address %A is in the spews level 1 database\n\
        See http://www.spews.org/ask.cgi?x=%A for more details":\
        :method=http:\
        :file=www.openbsd.org/spamd/spews_list_level1.txt.gz:

# Mirrored from http://www.spews.org/spews_list_level2.txt
spews2:\
        :black:\
        :msg="SPAM. Your address %A is in the spews level 2 database\n\
        See http://www.spews.org/ask.cgi?x=%A for more details":\
        :method=http:\
        :file=www.openbsd.org/spamd/spews_list_level2.txt.gz:

# Mirrored from http://www.okean.com/chinacidr.txt
china:\
        :black:\
        :msg="SPAM. Your address %A appears to be from China\n\
        See http://www.okean.com/asianspamblocks.html for more details":\
        :method=http:\
        :file=www.openbsd.org/spamd/chinacidr.txt.gz:

# Mirrored from http://www.okean.com/koreacidr.txt
korea:\
        :black:\
        :msg="SPAM. Your address %A appears to be from Korea\n\
        See http://www.okean.com/asianspamblocks.html for more details":\
        :method=http:\
        :file=www.openbsd.org/spamd/koreacidr.txt.gz:
root's crontab:
0 * * * * /usr/libexec/spamd-setup
This is far better for a few reasons.
  • The periodic updates are done by spamd-setup which is written far better than my script could ever be and is easily appended to without worrying about fat-fingering something (spamd-setup will fail if the configuration is messed up).
  • spamd and spamd-setup are in base and are supported.
  • Aside from spamd's rather weird configuration file (well weird to me) it is far less error prone.
How have you solved this problem politically (or technically) in your environment?


  Re: Passive Aggressive Spam Filtering (mod 1/37)
by Andreas Vögele (85.216.54.124) on Fri Jul 17 05:43:58 2009 (GMT)
  You can use the list from dnswl.org to whitelist legitimate mail servers. I also use the data from dnswl.org to build lists for Hotmail, Yahoo and Gmail. I don't reject messages from these networks but since our customers usually don't use these services everything from there gets extra points from SpamAssassin.

  Re: Passive Aggressive Spam Filtering (mod 2/32)
by Kb (82.120.241.138) on Fri Jul 17 05:53:38 2009 (GMT)
  Hello,

I think you are missing the "file=" line from the "spamhaus" section in your spamd.conf example.

As far as I recall, the spews, sbl and spfilter.openbrl.org data have been unusable, some of them since 2006.
The spews_list_level(1|2).txt mirrors from OpenBSD are just empty.

Cya


  Re: Passive Aggressive Spam Filtering (mod 8/34)
by Kurt (68.151.57.38) (kurt@seifried.org) on Fri Jul 17 06:49:36 2009 (GMT)
  Sadly it's this and all the other problems of spam filtering that led me to simply give up and use Google apps (domain email hosting) for my personal stuff. It's worrying that the cost of spam filtering is going up (i.e. more CPU time, bandwidth, etc.) and that it doesn't seem to be working. I'm hoping Bob will come up with something else clever to help save the world from spam =).

  Re: Passive Aggressive Spam Filtering (mod 0/34)
by Daniel Gracia (paladdin) (guardame_el_secreto@yahoo.es) on Fri Jul 17 07:26:13 2009 (GMT)
http://www.alpuntodesal.com
  Nice point! Spamd is our friend.

By the way, just one detail to comment on: these days I've seen several scripts using wget, lynx and others to retrieve files from web servers, when it should be noted that ftp itself can retrieve files over HTTP as easily as 'ftp http://myserver.com/myfile.tgz' :)

  Re: Passive Aggressive Spam Filtering (mod 7/33)
by sthen (85.158.44.149) on Fri Jul 17 08:14:12 2009 (GMT)
  An idea came to me after reading this article; another way to handle this would be to pull in a bunch of fairly aggressive blocklists (not worry about false positives), greylist addresses on those lists, and pass the others... Constructing the tables and PF ruleset is left as an exercise to the reader :-)

  Re: Passive Aggressive Spam Filtering (mod -2/32)
by Nudzo (2002:59ad:5eeb:1:214:4fff:fe0f:e7d4) (ivan.nudzik@gmail.com) on Fri Jul 17 09:34:18 2009 (GMT)
  My experiences in short:
1. It commonly took 1-2 days to get through the grey list when the sender has several mail servers... for example Gmail. But when it is in the white list, everything works fine. Sometimes it is hard to explain this to BFUs, because they think about mail the same way as instant messaging. ;-)
2. I've put spamd in place only to filter mail coming from the Internet. Users have a different IP/DNS name for the real mail server, which only accepts mail after authorization, so users can send mail out instantly. OpenBSD queues mail from the Internet in its sendmail and then forwards it to the real mail server with the accounts.
3. While other servers go down overloaded by SpamAssassin, the OpenBSD load is under 1. ;-) I've put OpenBSD on an LDOM on a T1000 SPARC machine to do firewalling and spam filtering for Solaris LDOMs. It seems to have been a very good decision.

I.

  SPF and Greylisting? (mod 5/31)
by David Chisnall (82.7.192.45) on Fri Jul 17 10:26:12 2009 (GMT)
  I know there is some resistance to SPF from spamd, but doesn't it solve exactly this problem? Google uses SPF to advertise their outgoing mail servers. Can't spamd check if the resend is from a machine authorised by the domain owner and treat it as a valid resend if both the original and the reply are from SPF-approved domains? You'd probably want to ignore ?all records when doing this, but anyone who publishes accurate SPF records could then resend from any of their mail servers.

  Re: Passive Aggressive Spam Filtering (mod 1/35)
by Cybil Courraud (82.66.245.132) (d@cyb.fr) on Fri Jul 17 13:07:06 2009 (GMT)
  My experience...

1. I was missing bsdly in my conf, thanks for it. Nevertheless, (maybe 'cause I use FBSD, sorry) I couldn't fetch the list. So I replaced :file=http://www.bsdly.net/~peter/bsdly.net.traplist: without 'http://' and it's OK.

2. For SPF, I made a script which fetches "my friends'" mail forwarder IPs and feed my whitelist by domainname. In this list, I met f5.com, fr.ibm.com, bizanga.com, bnpparibas.com, sfr.fr, apple.com etcetera. This kind of use of SPF is only for not delaying (at least for work). Take care of reverse lookup (pf doesn't like unresolved hosts): do reverse lookup hosts with a cronded script before.

3. CIDR is very efficient (even if unfair as we do it for China or Korea). BTW, I add to spamd.conf(5) some lists from my favorites top spam countries (which I'm not communicating with). Here is my script to get a good country ranking (install GeoIP package before):

#! /usr/bin/perl
my %db;
for ( `spamdb` ) {
        next if /^SPAMTRAP/;
        if ( /^(\w+)\|([^\|]+)/ ) {
                my $kind = $1;
                my $ip = $2;
                my $country = `geoiplookup $ip`;
                $country =~ s/.*, ([\w\s]+)\n/$1/;
                if ( $country =~ /IP Address not found/ ) {
                        $db{"$kind not found"} .= "$ip ";
                        next;
                }
                $db{"$kind $country"}++;
        }
}

for ( sort keys %db ) {
        print $_.": ".$db{$_}."\n";
}

4. Honey pot: publish your spammed logins on a poor web page (my catchall gave me some ;). And ... tarpit for pleasure !


  Re: Passive Aggressive Spam Filtering (mod 1/29)
by guly (88.149.155.77) on Fri Jul 17 17:48:02 2009 (GMT)
  never heard about dnsbl, or rbl? :)

  pf.os (mod -2/30)
by Anonymous Coward (84.251.129.228) on Fri Jul 17 19:11:04 2009 (GMT)
  Being able to identify and block Windows hosts would increase the ability to block spam. pf.os is a bit out of date and lacks the fingerprints for the latest strains.
      Re: pf.os (5/27) by Anonymous Coward on Sun Jul 19 03:08:42 2009 (GMT)
        Re: pf.os (-2/24) by Morten Larsen on Sun Jul 19 10:42:30 2009 (GMT)
        Re: pf.os (7/27) by Anonymous Coward on Sun Jul 19 13:25:27 2009 (GMT)

  Re: Passive Aggressive Spam Filtering (mod 5/33)
by Anonymous Coward (12.158.188.186) on Sat Jul 18 02:34:18 2009 (GMT)
  your shell script should be fixed.. s/<</>>/ and etc.

  Re: Passive Aggressive Spam Filtering (mod -1/35)
by Chris Bennett (chrisbennett) (webmaster@bennettconstruction.us) on Sat Jul 18 23:51:11 2009 (GMT)
www.bennettconstruction.us
  I didn't have bsdly entry. So I added it. It failed complaining about http:. Got rid of that. got endless ftp connection timed out errors from cron.
I had done some manual changes to nixspam, so I thought I'd try manually downloading traplist.
wget -m -nd - failed
lynx -source - failed
Weird part - both work just fine at home!!
wget, lynx, firefox at home work fine.
Not a clue why fails on server. pf.conf problem? My server is blacklisted somehow?




Copyright © 2004-2008 Daniel Hartmeier. All rights reserved. Articles and comments are copyright their respective authors, submission implies license to publish on this web site. Contents of the archive prior to April 2nd 2004 as well as images and HTML templates were copied from the fabulous original deadly.org with Jose's and Jim's kind permission. Some icons from slashdot.org used with permission from Kathleen. This journal runs as CGI with httpd(8) on OpenBSD, the source code is BSD licensed. Search engine is ht://Dig. undeadly \Un*dead"ly\, a. Not subject to death; immortal. [Obs.]