OpenBSD Journal

spamd statistics..

Contributed by mbalmer on from the bob's-hammer-hammers-spammers dept.

Bob Beck reports some quite interesting spam statistics he gathered at UofA, using his spamd, of course:


Since somebody asked me (I often don't look because I seem
to think spam is lessening - really it's not.... :)

in the past *36 hours*...

smtp.srv.ualberta.ca (the mail host for @ualberta.ca) has received:

212598 real smtp connections to it (roughly real pieces of email)
coming from 29217 distinct hosts - these would be from hosts
in our whitelist - currently 138793 hosts, which are the hosts we
have exchanged mail with legitimately (inbound or outbound) in the
last 30 days.

During that time, there have been:

696269 connections to the spamd greylister in front of it (roughly
speaking, all of that is junk) coming from 177359 distinct hosts.
During this time, of all those connections and all those hosts, 3229
hosts retried according to spec and were whitelisted and allowed
through (all the rest never tried again :) in other words for all
those connections there were 3229 hosts added to the 30 day whitelist
above.

So, currently, our volume of what is assuredly junk to that of what
might be real mail is roughly a little less than 7 junk to 2 good :)
and of the hosts hitting the greylister it's roughly a 54 to one ratio
of junk (i.e. a virus infected machine) to *possibly* good. - needless
to say this makes a significant impact on the capacity of the mail system :)
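
The two ratios quoted above can be rechecked from the figures in the post. This is only a small sketch of the arithmetic; the variable names are made up for illustration, and the host ratio assumes "junk" means the greylisted hosts that never retried.

```python
# Figures quoted in the post for the 36-hour window.
real_connections = 212_598      # connections that reached the real mail host
greylist_connections = 696_269  # connections that hit the spamd greylister
greylist_hosts = 177_359        # distinct hosts seen by the greylister
whitelisted_hosts = 3_229       # hosts that retried and were whitelisted

# Junk-to-good ratio by connection volume.
volume_ratio = greylist_connections / real_connections

# Junk-to-possibly-good ratio by host: hosts that never retried
# versus hosts that did (assumed reading of the post's figure).
junk_hosts = greylist_hosts - whitelisted_hosts
host_ratio = junk_hosts / whitelisted_hosts

print(f"junk:good by volume        = {volume_ratio:.2f}:1 (7:2 would be 3.50:1)")
print(f"junk:possibly-good by host = {host_ratio:.1f}:1")
```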

Now some details:

Of those 696269 connections:

270119 of them disconnected in under 10 seconds, which means they
never attempted to deliver mail to us - because they were spam software that
thought we were tarpitting them. - we talk slowly to hosts on the greylist for
the first 10 seconds of a connection, because real software doesn't care, spam
generating robots do, and attempt to time out quickly, so we use this
against them, to make them go bother someone else.
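
The "talk slowly for the first 10 seconds" idea can be sketched roughly as below. This is not spamd's code, just an illustration: `stutter_send` is a hypothetical helper, the 10-second window comes from the post, and the one-second pause per byte is an assumption.

```python
import socket
import time

def stutter_send(conn, data, stutter_seconds=10.0, delay=1.0):
    """Send data one byte at a time, pausing after each byte, until
    stutter_seconds have elapsed; then send whatever is left at once."""
    start = time.monotonic()
    for i in range(len(data)):
        if time.monotonic() - start >= stutter_seconds:
            conn.sendall(data[i:])      # stutter window is over: flush the rest
            return
        conn.sendall(data[i:i + 1])     # one byte, then wait
        time.sleep(delay)

# Tiny demo over a local socket pair, with short timings so it finishes fast.
server_side, client_side = socket.socketpair()
stutter_send(server_side, b"220 mail.example.org ESMTP\r\n",
             stutter_seconds=0.05, delay=0.01)
server_side.close()

received = b""
while True:
    chunk = client_side.recv(1024)
    if not chunk:                       # peer closed: everything has arrived
        break
    received += chunk
client_side.close()
print(received)
```

A patient client sees nothing worse than a slow banner; a client with an aggressive timeout gives up before the window ends, which is the behaviour the post relies on.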

174219 of them were from (20305 distinct) hosts that connected after
hitting a spamtrap address and having future mail from the site
delayed for 24 hours. This is due to them mailing nonexistent or 10-year-old
old addresses from a site that has never exchanged mail with us before
(this is referred to as a "greytrap")
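
A rough sketch of how greylisting and greytrapping fit together follows. It is an in-memory illustration, not how spamd stores its state: the `Greylister` class and its method names are hypothetical, the 24-hour trap and 30-day whitelist periods come from the post, and the 25-minute minimum retry delay is an assumption.

```python
import time

TRAP_SECONDS = 24 * 3600         # from the post: trapped hosts are delayed 24 hours
WHITE_SECONDS = 30 * 24 * 3600   # from the post: whitelist entries cover 30 days
PASS_SECONDS = 25 * 60           # assumption: minimum wait before a retry counts

class Greylister:
    def __init__(self, trap_addresses):
        self.traps = {a.lower() for a in trap_addresses}
        self.first_seen = {}     # (ip, sender, rcpt) -> time of first attempt
        self.white = {}          # ip -> whitelist expiry time
        self.trapped = {}        # ip -> greytrap expiry time

    def check(self, ip, sender, rcpt, now=None):
        """Return 'pass', 'tempfail' or 'trapped' for one delivery attempt."""
        now = time.time() if now is None else now
        if self.white.get(ip, 0) > now:
            return "pass"                        # known-good host
        if self.trapped.get(ip, 0) > now:
            return "trapped"                     # still serving its 24 hours
        if rcpt.lower() in self.traps:
            # Unknown host mailing a spamtrap address: greytrap it.
            self.trapped[ip] = now + TRAP_SECONDS
            return "trapped"
        key = (ip, sender.lower(), rcpt.lower())
        first = self.first_seen.setdefault(key, now)
        if now - first >= PASS_SECONDS:
            # The host retried after a sensible delay: whitelist it.
            self.white[ip] = now + WHITE_SECONDS
            del self.first_seen[key]
            return "pass"
        return "tempfail"                        # first try, or retried too soon

g = Greylister(["old-user@example.org"])
print(g.check("192.0.2.1", "a@example.com", "staff@example.org", now=0))
print(g.check("192.0.2.1", "a@example.com", "staff@example.org", now=1800))
print(g.check("192.0.2.2", "b@example.com", "old-user@example.org", now=0))
```

Checking the whitelist first means a host that already exchanges mail legitimately is never trapped, which matches the post's "a site that has never exchanged mail with us before".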

The 3229 hosts whitelisted above would have come from the
remaining 696269 - 270119 - 174219 = 251931 connections which actually
had a chance to get through.
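
The breakdown above can be tabulated as shares of all greylister connections. A minimal sketch using only the counts from the post; the labels are made up for illustration.

```python
total = 696_269               # all connections to the greylister
quick_disconnects = 270_119   # dropped in under 10 seconds
greytrapped = 174_219         # from hosts that had hit a spamtrap address
remaining = total - quick_disconnects - greytrapped

rows = [
    ("disconnected in < 10 s", quick_disconnects),
    ("greytrapped", greytrapped),
    ("had a chance to get through", remaining),
]
for label, count in rows:
    # Print each count with its share of the total.
    print(f"{label:<30}{count:>8}  {count / total:6.1%}")
```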

Anyway, thought some of you might enjoy those stats.

-Bob



Comments
  1. By venture (217.22.88.123) venture37 # hotmail com on www.geeklan.co.uk

    neat!!!
    what's the spec of smtp.srv.ualberta.ca?

    Comments
    1. By Bob Beck (68.148.128.240) beck@ualberta.ca on


      the spamd box is a Dell PE 650 with 256 MB of ram.

      what's that, something like a gig and a bit pentium 4?

      It doesn't really breathe very hard doing that.

      The rest of the mail cluster is more extensive. see my talk from
      http://www.openbsd.org/papers/bsdcan05-spamd/ which outlines what
      we use.

  2. By Jim (68.250.26.213) on

    Are you using the standard spamd.conf included with OpenBSD? Just curious about the potential configuration delta between UoA's config and my personal config. BTW, the only change I've made was to add my own blacklist as generated from relaydb.

    Comments
    1. By Bob Beck (68.148.128.240) beck@ualberta.ca on


      Actually my config is weaker than standard - we use no external blacklists because of the possibility of throwing out the baby with the
      bathwater - too many blacklist maintainers just throw everybody in there.

      The only thing we do is greylisting, plus an extensive list of greytraps;
      we use no blacklists.


  3. By Anonymous Coward (128.171.90.200) on

    Where's the pie chart?

    ;OP

    Comments
    1. By Anonymous Coward (70.157.197.206) on

      Here's a pie chart, as requested.

      Comments
      1. By Anonymous Coward (24.84.108.32) on

        Sad to see Pecan so under-represented. Shame that more people haven't discovered this wonderful source of pure sugar. :)

  4. By Erik Carlseen (68.6.193.220) on

    Some less-interesting and even less-explained (because they're combined) stats from a group of domains I manage:

    6,673,000 e-mails rejected as spam
    88,000 viruses blocked
    91,000 e-mails tagged as possible spam and passed through
    1,269,000 e-mails presumed valid

    This gives a spam:valid ratio (excluding viruses and tagged messages) of roughly 5.25:1.

    There is one very interesting statistic: nearly all of our valid e-mails come through on the first mx servers listed for the domains. For the secondary, tertiary, etc. mx servers the spam:valid ratio is roughly 60:1. Yes, this has been manually confirmed - at first we thought the spam filters on those machines were malfunctioning. I'd be very interested in hearing if other people have experienced something similar.

    Comments
    1. By Otto Moerbeek (213.84.84.111) otto@drijf.net on

      Yes, I see this all the time. Some random thoughts why spammers would do this:

      They assume that secondary mx servers have no way to tell if a To: address is valid, and thus have to accept all, and they do not care if their stuff actually gets delivered to a recipient; they just want to dump as much mail on a server as possible. If the primary uses greylisting but the secondaries do not, it's also effective to use them instead of the primary.

      It could also be that the secondary mx servers typically have a lower load, so they can accept incoming mail faster. Now this is probably a self-defeating strategy.

      Comments
      1. By Erik Carlseen (68.6.193.220) on

        I hope your theories are valid, because our filters share information, have LDAP lookup capability, and rate-limit inbound connections. So they can take their ideas and shove them in a ... errr... black hole.

    2. By Ben (208.27.203.127) mouring@nospam.eviladmin.org on http://eviladmin.org

      Same here.. Started seeing it about 4 months ago. They are not even attempting the primary, and just randomly picking a higher MX value.

      Granted.. 90% of all my spam comes in from my backup. However that is still a lot less spam than I used to see.

      Like Bob, I don't rely on massive prebuilt blacklists. But I do blacklist the major offenders that are smart enough to use real mailservers instead of dumb mass mailers.

      Sadly, I wish I could ban all of hotmail.com since their spam reporting tool sucks, and for a while I was just getting massive floods from them (and yes I verified it was truly from them).

      - Ben

    3. By Anonymous Coward (63.217.255.130) on http://alec.restontech.com/

      Here are our combined stats for one day, for comparison purposes:

      ----------------------------------
      737 Mail from authenticated user
      4926 Other (usually bounces for unknown)
      12895 IP of sender on blacklist
      1175 Message relayed
      653 Invalid sender
      2601 Message sent successfully
      5552 Message identified as spam
      51 Message contained virus
      ----------------------------------
      Rejected: 24077 ( 86%)
      OK/Relay: 3776 ( 14%)
      ----------------------------------
      Total in: 27853
      out: 737

      Note-- we use Spamassassin, ClamAV, a few blacklists, and some tricks I added to sendmail.mc

    4. By Waldo Nova (216.240.2.95) on

      Many secondary MX servers do not have spam filtering!

      Many small businesses will have a mail server and some spam filtering in front of it. They will get the ISP's mail server set up as a secondary to cover downtime. The ISP will most likely not filter spam or will charge a premium for it.

      Comments
      1. By Brad (216.138.195.228) brad at comstyle dot com on

        That is exactly the point. That is a bad e-mail setup and that is what spammers are looking for.

    5. By Anonymous Coward (24.217.190.176) on

      I read an article in Sys Admin magazine (June 2005 Spam Supplement -- http://www.samag.com/articles/2005/0513/ -- but the article isn't online) about publishing low-priority MX hosts that always reply with a 451 try-again-later. It made perfect sense.

      For my domains I have two valid mail servers: 5 - primary mail, 10 - backup - has tighter spam restrictions, 500 - spamd, 1000 - spamd.

      My /etc/spamd.conf consists of

      all:\
      ::

      The 500 and 1000 MX records should never receive any legitimate mail. On the very off-chance that both of the legitimate mail servers are down, valid incoming mail would just be directed to try again later anyway. Guaranteed zero false positives.

      One spamd host has a one-second pause between character responses, and the other has no lag. I would be curious to know how the connection could be stuttered at the DATA stage, since I get a lot of 3-second disconnects on the slow host (but it's pure joy to watch 'connected for 649 seconds' in the logfiles). Has anyone noted _where_ in the message transfer stage that the spam software disconnects? Is it normal to allot a time for the entire message delivery, or to allot so much time to various stages of the message (connect, envelope, message, etc.)?

      I don't want the spammer to move along to somewhere else, because that 'somewhere else' just might be my valid servers -- I would prefer to keep them tied up as long as possible. That also gives more time for the primary servers to get updated info (RBL's and Brightmail [highly recommended]). In fact, I've considered outright blocking, at the firewall, connections from anyone connecting to the spamd boxes. For an hour or so anyway.....

      And did I mention what an absolute joy it is to watch the logfiles?

      Comments
      1. By Erik Carlseen (68.6.193.220) on

        Extremely interesting idea. It would also be interesting to harvest the IPs from those process logs and use them for building blacklists (or greylists). You'd have to filter them against the availability of your "real" hosts, but that's just a few lines of Perl/Python/Ruby/whatever.

        Comments
        1. By Anonymous Coward (24.217.190.176) on

          Use a machine that doesn't accept any incoming SMTP (mine are on DNS servers), and do something akin to:

          /etc/pf.conf
          rdr pass on $ext_if proto tcp from any to port smtp -> 127.0.0.1 port spamd

          /etc/rc.conf.local
          spamd_flags="-b127.0.0.1 -v -nBob"

          /etc/spamd.conf
          all:\
          ::

          When it's working, add it as your lowest priority MX.


          I reckon I should figure out how to change the default messages, but the default connection looks like:

          >telnet my.server.com 25
          Connected to my.server.com.
          Escape character is '^]'.
          220 my.server.com ESMTP Bob; Sat Feb 18 19:48:47 2006
          helo me.here.com
          250 Hello, spam sender. Pleased to be wasting your time.
          mail from:<me@you.com>
          250 You are about to try to deliver spam. Your time will be spent, for nothing.
          rcpt to:<yo@mama.com>
          250 This is hurting you more than it is hurting me.
          data
          354 Enter spam, end with "." on a line by itself
          waste waste waste waste waste
          .
          451 Temporary failure, please try again later.

          >tail -f /var/log/daemon | grep "spamd"

          Feb 18 19:15:12 spampit spamd[5410]: 194.90.246.244: disconnected after 351 seconds.

          ... and so on.

          Comments
          1. By Anonymous Coward (24.84.108.32) on

            What a great idea... I'll have to try that one out.
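
The log-harvesting idea raised in the thread above could start from something like the sketch below. It assumes log lines shaped like the single "disconnected after N seconds" example quoted in the comments; the `harvest` function and its `keep` hook are hypothetical, and the hook is where a check against the availability of the "real" hosts would go.

```python
import re
from collections import defaultdict

# Matches lines like the one quoted in the thread:
#   ... spamd[5410]: 194.90.246.244: disconnected after 351 seconds.
LINE_RE = re.compile(
    r"spamd\[\d+\]: (?P<ip>\d{1,3}(?:\.\d{1,3}){3}): "
    r"disconnected after (?P<secs>\d+) seconds"
)

def harvest(lines, keep=lambda line: True):
    """Return {ip: total seconds held} for hosts seen disconnecting from spamd.

    keep is a caller-supplied filter (hypothetical), e.g. one that drops
    lines logged while the real mail servers were unreachable.
    """
    held = defaultdict(int)
    for line in lines:
        match = LINE_RE.search(line)
        if match and keep(line):
            held[match.group("ip")] += int(match.group("secs"))
    return dict(held)

sample = [
    "Feb 18 19:15:12 spampit spamd[5410]: 194.90.246.244: "
    "disconnected after 351 seconds.",
]
for ip, secs in sorted(harvest(sample).items()):
    print(ip, secs)
```

In practice the input would be the lines of the daemon log rather than a hard-coded sample, and the resulting addresses would still need review before going into any blacklist.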

Credits

Copyright © - Daniel Hartmeier. All rights reserved. Articles and comments are copyright their respective authors, submission implies license to publish on this web site. Contents of the archive prior to as well as images and HTML templates were copied from the fabulous original deadly.org with Jose's and Jim's kind permission. This journal runs as CGI with httpd(8) on OpenBSD, the source code is BSD licensed. undeadly \Un*dead"ly\, a. Not subject to death; immortal. [Obs.]