Contributed by pitrh from the my-admin-told-me-not-to-talk-to-strangers dept.
I have been using gray-listing to thwart spam for what feels like a very long time. I started using it around the release of OpenBSD 3.5. It was an amazing change from a constant storm of spam: just enabling it got rid of 80% of the spam almost immediately. That improvement didn't come without a cost. Some mail services and servers don't work so well with it, especially large mailing systems that pass messages around and don't guarantee that the next delivery attempt will come from the same IP or network. Microsoft Exchange was also known to be 'usually' configured in a way that didn't work well with gray-listing.
Over the years I've either tolerated the problem or white-listed IPs where I got particular complaints. The really hard nuts to crack are the larger organizations with 'big mail' infrastructure like Google Mail, Hotmail, Yahoo etc. White-listing huge chunks of frequently moving address space really eroded the benefits of gray-listing. I think for at least two years we turned gray-listing on and off at a previous employer to 'work around' complaints. On my personal systems I've just left it on and didn't care if it blocked legitimate mail (people really wanting to contact me would know how to get a hold of me), and I would just disable PF on demand for account setups and password reset pages. For the rather annoying mail systems I'd trawl the ARIN database, white-list every network related to a particular company and call it a day. It was a rather manual and time-consuming process, to the point that I considered automating the temporary disabling of PF (i.e. spamd) for users.
Fast forward to recently, where time is at a premium and the address ranges and outgoing MX delegates for big services are changing frequently. I figured I could spend my full time just looking for ranges to white-list, and I just didn't have time for that. Then I stumbled upon this thing called SPF (Sender Policy Framework). SPF is a mechanism for domains to announce, via DNS TXT records, which hosts and networks send mail on their behalf. SPF is typically used as a way to validate incoming mail (connections and sender addresses) against the published SPF records during the delivery conversation (namely SMTP).
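To make the record format concrete, here is a minimal sketch of splitting one SPF TXT record into its terms. It is an illustration only, not the script discussed in this post; the function name `parse_spf` and the example record (built from documentation-only domains and address ranges) are assumptions, and macros and other corner cases of the SPF syntax are ignored.

```python
import re

# A term is a name, optionally followed by ":" or "=" and a value,
# e.g. "ip4:192.0.2.0/24", "include:_spf.example.net", "redirect=example.org", "mx".
TERM = re.compile(r"([A-Za-z][A-Za-z0-9_.-]*)[:=]?(.*)$")


def parse_spf(txt):
    """Split an SPF TXT record into (qualifier, name, value) tuples.

    Returns an empty list if the text is not an SPF version 1 record.
    """
    terms = txt.split()
    if not terms or terms[0].lower() != "v=spf1":
        return []
    parsed = []
    for term in terms[1:]:
        qualifier = "+"  # "+" (pass) is the default when no qualifier is given
        if term[0] in "+-~?":
            qualifier, term = term[0], term[1:]
        match = TERM.match(term)
        if match is None:
            continue  # skip anything that does not look like a term
        parsed.append((qualifier, match.group(1).lower(), match.group(2)))
    return parsed


# Hypothetical record for illustration only.
example = "v=spf1 ip4:192.0.2.0/24 include:_spf.example.net mx ~all"
for qualifier, name, value in parse_spf(example):
    print(qualifier, name, value)
```

For white-listing purposes only the terms that name hosts or networks are interesting; the trailing `all` term just states the policy for everything else.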
The mail daemon I use personally, Qmail (slowly being phased out by OpenSMTPD), doesn't support SPF (or pretty much anything other than base RFC SMTP) particularly well. My way around that was to write a shell script that grabs the SPF records for a well-known set of domains and puts them into the white-list (run once a day).
This worked great for a year or so. About a month ago a few of my friends mentioned they couldn't get an email to me from Gmail. Having thought I solved those particular problems years ago, I manually refreshed the white-list, and it didn't work. I then looked at the SPF record Google was publishing and noticed that instead of listing the networks directly it included a recursive record, that is, an include pointing at further SPF records, which my dumb and simple script didn't handle (note I just expected A, MX and network SPF tokens).
I figured that with most of my friends and family using Gmail (and similar), I had to support SPF a bit better than high-school parsing of basic records.
To that end I wrote an SPF parser in Python which handles the recursive records and does a much better job of understanding the SPF records.
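The idea of following recursive records can be sketched as below. This is a simplified illustration under stated assumptions, not the actual script: `collect_networks` and the `lookup_txt` callback are hypothetical names, only `ip4`, `ip6`, `include` and `redirect` terms are handled, and the records are made up from documentation-only domains and addresses. There is deliberately no protection against circular includes here.

```python
def collect_networks(domain, lookup_txt):
    """Return the ip4/ip6 networks a domain publishes, following includes.

    lookup_txt(domain) must return a list of TXT record strings.
    """
    networks = []
    for record in lookup_txt(domain):
        terms = record.split()
        if not terms or terms[0].lower() != "v=spf1":
            continue  # some other kind of TXT record
        for term in terms[1:]:
            # Drop an explicit "+" qualifier; terms qualified with
            # "-", "~" or "?" match none of the branches and are skipped.
            term = term.lstrip("+")
            lowered = term.lower()
            if lowered.startswith(("ip4:", "ip6:")):
                networks.append(term[4:])
            elif lowered.startswith("include:"):
                networks.extend(collect_networks(term[8:], lookup_txt))
            elif lowered.startswith("redirect="):
                networks.extend(collect_networks(term[9:], lookup_txt))
    return networks


# Made-up records standing in for real DNS answers.
FAKE_DNS = {
    "example.com": ["v=spf1 include:_spf.example.com ip4:192.0.2.0/24 ~all"],
    "_spf.example.com": ["v=spf1 ip4:198.51.100.0/24 ip6:2001:db8::/32 -all"],
}

for network in collect_networks("example.com", lambda d: FAKE_DNS.get(d, [])):
    print(network)
```

Passing the lookup in as a function keeps the parsing logic testable without touching the network; a real run would swap in something that actually queries DNS.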
The script is written in Python because Bourne-style scripting just wasn't expressive enough and I enjoy the language. The one thing I didn't like about Python was the lack of a 'good', simple (and tiny) DNS library. There are definitely a few out there, but none of them is something I would want to ship with a product. My idea of simple is to include the one or three modules in the same place as my main script and call it a day. The available Python DNS libraries were far more complex, which turned me off. To get around that (and because 'computers are fast and memory is cheap') I decided to just shell out to dig (like my original shell script). It is definitely not as efficient as processing DNS calls natively in the tool, but it isn't expensive enough to be a problem.
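Shelling out to dig from Python might look roughly like the sketch below. The function names `dig_txt` and `parse_dig_txt` are assumptions made for illustration, and the real script may call dig differently. The sketch relies on `dig +short TXT <domain>` printing one answer per line with each TXT string in double quotes, and it does not handle escaped quotes inside a string.

```python
import subprocess


def parse_dig_txt(output):
    """Turn `dig +short TXT` output into a list of unquoted TXT strings."""
    records = []
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith('"'):
            continue  # blank line or a CNAME target, not a TXT string
        # A long TXT record is printed as several quoted chunks on one line
        # ("chunk one" "chunk two"); the chunks are joined without a separator.
        records.append("".join(line[1:-1].split('" "')))
    return records


def dig_txt(domain, timeout=10):
    """Run dig for the TXT records of a domain and return them as strings."""
    result = subprocess.run(
        ["dig", "+short", "TXT", domain],
        capture_output=True,  # requires Python 3.7 or newer
        text=True,
        timeout=timeout,
        check=False,  # a failed lookup simply yields no records
    )
    return parse_dig_txt(result.stdout)
```

Keeping the output parsing separate from the subprocess call means the parsing can be checked against canned dig output without any DNS traffic.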
The best part is that (aside from Python) it all ships by default with OpenBSD, so no extra packages or similar nonsense.
You can find a copy of the script here:
It isn't the most elegant or perfect of solutions (I found some domains resolve SPF A records with DNS CNAMEs, which I don't try to further resolve), but it does the job and fixed a few other domains/services as well (Twitter's website doesn't continually bother me about updating my email with them anymore). Another issue I should fix is putting a depth limit on the recursion. Because I'm now recursing on all domains that are included, there is a possibility of a circular reference, but I've not seen that yet so I've not put a limit in.
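One way such a guard could work is sketched below: remember which domains have already been visited and stop descending past a fixed depth. The names `walk_includes` and `get_includes` are hypothetical, and the default limit of 10 is an assumption chosen for illustration (RFC 7208 caps the number of DNS-querying terms per evaluation at 10, which makes it a natural ballpark).

```python
def walk_includes(domain, get_includes, max_depth=10, _seen=None, _depth=0):
    """Return every domain reachable through includes, each at most once.

    get_includes(domain) must return the domains that domain includes.
    Recursion stops at domains already seen (circular references) and
    at anything nested deeper than max_depth.
    """
    seen = set() if _seen is None else _seen
    key = domain.lower().rstrip(".")  # treat "Example.COM." and "example.com" alike
    if key in seen or _depth > max_depth:
        return []
    seen.add(key)
    visited = [key]
    for child in get_includes(key):
        visited.extend(
            walk_includes(child, get_includes, max_depth, seen, _depth + 1)
        )
    return visited


# Made-up include graph with a loop: a -> b -> a, and b -> c.
GRAPH = {
    "a.example": ["b.example"],
    "b.example": ["a.example", "c.example"],
    "c.example": [],
}

print(walk_includes("a.example", lambda d: GRAPH.get(d, [])))
```

The visited set handles true loops, while the depth limit covers the case of a long (or endless) chain of distinct domains.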
In the github repository I've also put a sample of the script I'm using to call this tool and populate a white-list:
Gray-listing is just the top-most layer of my spam filtering (I'm also using rblsmtpd with Qmail, and spamassassin called from procmail), and it does the job admirably, passing the results on to the layers below. My backup exchanger runs OpenSMTPD, so I just sync the white-list to the backup MX (via scp) every time it is generated and populate the PF table there accordingly (the CPU on my backup MX is too slow to run the script from there). That keeps the primary and backup exchangers in sync at that particular layer.
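Preparing a generated list for a PF table could be sketched as follows. This is an assumption-laden illustration rather than the sample script from the repository: the helper names and the table name `spf_white` are hypothetical, and the command is only built, not executed. `pfctl -t <table> -T replace -f <file>` is the standard way to replace a table's contents from a file.

```python
import ipaddress


def normalize(networks):
    """Validate, de-duplicate and sort address/CIDR strings.

    Entries that do not parse as an IPv4 or IPv6 network are dropped.
    """
    valid = set()
    for text in networks:
        try:
            # strict=False accepts host bits, e.g. 192.0.2.1/24 -> 192.0.2.0/24
            valid.add(ipaddress.ip_network(text.strip(), strict=False))
        except ValueError:
            continue
    # Sort by IP version first so IPv4 and IPv6 are never compared directly.
    ordered = sorted(
        valid, key=lambda n: (n.version, n.network_address, n.prefixlen)
    )
    return [str(n) for n in ordered]


def write_table_file(networks, path):
    """Write one network per line, the format pfctl reads with -f."""
    with open(path, "w") as handle:
        for network in normalize(networks):
            handle.write(network + "\n")


def pfctl_replace_command(table, path):
    """Build (but do not run) the pfctl command that reloads the table."""
    return ["pfctl", "-t", table, "-T", "replace", "-f", path]


# "spf_white" is a hypothetical table name used only for this example.
print(" ".join(pfctl_replace_command("spf_white", "/etc/mail/spf_white.txt")))
```

Validating entries before they reach pfctl keeps one malformed SPF record from breaking the table reload, and the same file can be copied to a backup MX and loaded there with the same command.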
Regardless, I put the code up on GitHub (this was an excuse to try that service out) in case anyone else wants to use or improve upon this odd use of SPF. It would be great to get text versions (i.e. PF friendly) of the more reputable RBL lists, though that is either cost prohibitive or not available. If you are not using Peter's block list (http://www.bsdly.net/~peter/traplist.shtml, also see this earlier story)... you definitely should. Another good list is the OpenBL list (previously known as sshbl), http://www.openbl.org/. Originally the server they hosted their list on wasn't too reliable, but it has been getting better.