OpenBSD Journal

active-active pf with pfsync

Contributed by dlg

A long time ago I started working toward supporting active-active stateful firewalls. I was fortunate enough to be able to do it as part of my studies, which had the unfortunate side effect that I had to write a paper about it rather than actually fixing the bugs in the code. However, I'm happy to say that I finally got it all working, ran it in production, and even committed it to the tree.

For those of you who remember the paper I wrote, there were two outstanding problems with the code which stopped it working.

The first problem was that the deferral of the first packet in a state always hit the timeout before the packet was sent on, rather than being released by an acknowledgement from a peer. This turned out to be pretty easy to fix. pf calls into pfsync twice when a state is created: once when the state is inserted into the state tree, and again soon after to see if the packet should be deferred or not. I originally thought these two calls were made in the reverse order (check for defer before handling the insert), which meant my handling of the state was incorrect. As well as fixing the code to handle the correct order of operations, I also added a knob so the packet deferral by pfsync can be turned on and off. By default it is off.
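As a rough illustration of the call ordering described above, here is a toy Python model of deferral. It is a sketch, not the kernel code: the class `ToyPfsync`, its method names, and the `DEFER_TIMEOUT` value are all invented for this example.

```python
# Toy model of pfsync packet deferral. Not the kernel implementation.
from dataclasses import dataclass, field

DEFER_TIMEOUT = 0.02  # hypothetical timeout in seconds, chosen for the example


@dataclass
class Deferral:
    packet: bytes
    deadline: float


@dataclass
class ToyPfsync:
    defer_enabled: bool = False  # the knob; off by default, as in the text
    known_states: set = field(default_factory=set)
    deferred: dict = field(default_factory=dict)
    sent: list = field(default_factory=list)

    def insert_state(self, state_id):
        """First call from pf: the new state goes into the state tree."""
        self.known_states.add(state_id)

    def defer(self, state_id, packet, now):
        """Second call from pf: return True if pfsync holds the packet."""
        if not self.defer_enabled:
            return False
        # The insert call always comes first, so the state must exist here.
        assert state_id in self.known_states
        self.deferred[state_id] = Deferral(packet, now + DEFER_TIMEOUT)
        return True

    def peer_ack(self, state_id):
        """A peer acknowledged the state: release the held packet at once."""
        d = self.deferred.pop(state_id, None)
        if d is not None:
            self.sent.append((d.packet, "ack"))

    def tick(self, now):
        """Release packets whose deferral expired without an ack."""
        expired = [s for s, d in self.deferred.items() if d.deadline <= now]
        for state_id in expired:
            d = self.deferred.pop(state_id)
            self.sent.append((d.packet, "timeout"))
```

In this model a packet leaves either because a peer acknowledged the state or because the timeout fired. The original bug corresponds to every packet leaving by the timeout path.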

The second problem was a lot harder to handle. When pfsync gets an update from its peers, it has to merge their details into the local state tree and figure out if there are some changes made to the local tree that the peers need to know about for that state. This code made my head hurt, but eventually through some guesswork and a lot of testing I think I've got it right.
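The shape of that merge problem can be sketched in a few lines of Python. This is a deliberately simplified model and not what pfsync actually does: it assumes a state is just a pair of per-direction sequence high-water marks, ignores sequence number wraparound, and uses invented names (`PeerView`, `merge_update`).

```python
from dataclasses import dataclass


@dataclass
class PeerView:
    """One firewall's view of a TCP state: highest sequence seen per direction."""
    src_seq: int
    dst_seq: int


def merge_update(local: PeerView, remote: PeerView):
    """Merge a peer's update into the local view of a state.

    Returns (merged, peer_is_stale). peer_is_stale is True when the local
    view was ahead in at least one direction, meaning the peer needs to be
    told about local progress it has not seen yet.
    """
    merged = PeerView(
        src_seq=max(local.src_seq, remote.src_seq),
        dst_seq=max(local.dst_seq, remote.dst_seq),
    )
    peer_is_stale = (
        local.src_seq > remote.src_seq or local.dst_seq > remote.dst_seq
    )
    return merged, peer_is_stale
```

When each firewall sees only one direction of a flow, each side is typically ahead in one field and behind in the other, so both halves of the result matter: the merge pulls in the peer's progress, and the stale flag says an update has to go back the other way.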

So pfsync now works if you run traffic over both legs of your firewalls. I'm doing this on my firewalls at work, and it works surprisingly well.

In my setup I have 30 vlans trunked over a single em(4) controller in each firewall. 29 of these vlans are considered internal networks. Routing for these internal networks is provided by carp interfaces on the firewalls. At the moment carp is set up so only one of the firewalls is the master on any particular vlan. The 30th vlan is the network I talk to the upstream provider on. We use OSPF on that interface to advertise the networks I host and for us to learn the default route and so on from our provider. ospfd is configured to only announce the networks on carp interfaces that are the master.

Because ospfd only announces the networks that the particular firewall is a carp master for, traffic in and out of the internal network tends to flow in and out over the same firewall. This helps localize the state updates for a significant portion of our traffic, reducing the need for pfsync to exchange information for those particular flows. In fact, if my networks only ever talked to hosts via the upstream routers, the previous version of pfsync would have worked fine for me simply because the state updates from actual traffic were always on the same firewall.

However, the new pfsync code is necessary when my internal networks talk to each other. The problem occurs if I have two vlans, e.g. vlan1 and vlan2, and the carp master for each of these networks is on different firewalls. Let's call them fw1 and fw2 for the sake of this discussion. If the first firewall is the master for vlan1, traffic from vlan1 to vlan2 will flow into fw1 and out of it again onto vlan2, but the replies from vlan2 will come into fw2 and out of it again. This is the split brain setup that pfsync previously could not cope with. In this situation pfsync will now detect that the traffic is flowing over both firewalls for this one state and will start to exchange updates more rapidly for it.
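The vlan1/vlan2 example can be written down as a small Python sketch. The master assignment below is hypothetical (vlan3 is added only for contrast), and the function names are made up for illustration; the one assumption carried over from the text is that a host sends traffic to the carp master on its own vlan.

```python
# Hypothetical carp master assignment, following the fw1/fw2 example.
CARP_MASTER = {"vlan1": "fw1", "vlan2": "fw2", "vlan3": "fw1"}


def forwarding_firewall(src_vlan: str) -> str:
    """Hosts use their gateway, i.e. the carp master on their own vlan."""
    return CARP_MASTER[src_vlan]


def firewalls_for_flow(vlan_a: str, vlan_b: str) -> set:
    """Firewalls that see packets of a flow between hosts on two vlans."""
    return {forwarding_firewall(vlan_a), forwarding_firewall(vlan_b)}


def is_split(vlan_a: str, vlan_b: str) -> bool:
    """True when requests and replies cross different firewalls."""
    return len(firewalls_for_flow(vlan_a, vlan_b)) > 1


if __name__ == "__main__":
    for a, b in (("vlan1", "vlan2"), ("vlan1", "vlan3")):
        kind = "split" if is_split(a, b) else "single firewall"
        print(f"{a} <-> {b}: {kind}")
```

Only flows where `is_split` is true depend on the new pfsync behaviour; the rest have both directions on one firewall.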

There are some limits on how fast the traffic in a split brain setup will move because of how pfsync traffic is mitigated now. The TCP windows in a state will only progress as fast as pfsync exchanges updates between your peers, which in turn limits how fast TCP can ramp up. In my particular environment this hasn't been a problem though. We just don't do enough high speed TCP transfers to be affected by this.
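To get a feel for that limit, here is a back-of-the-envelope Python sketch. It assumes a very crude model in which a split connection can move at most one TCP window of data per pfsync update interval. Both the model and the figures are assumptions for illustration, not measurements or pfsync internals, and `split_state_ceiling_bps` is an invented name.

```python
def split_state_ceiling_bps(window_bytes: int, update_interval_s: float) -> float:
    """Rough ceiling in bits per second: one window per pfsync update interval."""
    if window_bytes <= 0 or update_interval_s <= 0:
        raise ValueError("window and interval must be positive")
    return window_bytes * 8 / update_interval_s


if __name__ == "__main__":
    # Hypothetical figures: a 64 KB window and three update intervals.
    for interval in (0.001, 0.01, 0.1):
        mbps = split_state_ceiling_bps(65535, interval) / 1e6
        print(f"update every {interval * 1000:.0f} ms: about {mbps:.1f} Mbit/s")
```

Under this model the ceiling scales inversely with the update interval, which is why faster update exchange for split states matters.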

Anyway, to prove that I am doing active-active now, here are some graphs showing the traffic seen on the switch ports my 30 vlans are trunked over. The graph starts with the firewalls in active-passive. See if you can pick when I switched the master role, then switched to active-active and then chickened out. I manned up again shortly after though and it's been running active-active since then.

These changes are in current now, and hopefully in snapshots too. I'm extremely keen for people to try them out (don't forget to go ifconfig pfsync0 defer) and see how their setups behave. I'd love to see what interaction it has with carp load balancing, or setups with routing protocols and multiple routes. I'd especially love to know what performance limits people hit with active-active too.

Again, thanks must go to Ryan McBride for helping me figure this stuff out, and to Stuart Henderson for testing my changes.



Comments
  1. By Anonymous Coward (88.97.233.154)

    Brilliant news. Thank you very much.

  2. By Anonymous Coward (200.152.43.195)

    Just curious... those graphs were made by which software?

    Comments
    1. By Anonymous Coward (212.11.9.139)

      > Just curious... those graphs were made by which software?

      rrdtool - that's what it says right next to the graphics themselves...

      Comments
      1. By Peter van Oord van der Vlies (2001:888:102e:0:216:eaff:feb3:de6a) zork@cgg.nu

        > > Just curious... those graphs were made by which software?
        >
        > rrdtool - that's what it says right next to the graphics themselves...

        I think it is mrtg.

        Comments
        1. By Lennie (82.75.64.11)

          > > > Just curious... those graphs were made by which software?
          > >
          > > rrdtool - that's what it says right next to the graphics themselves...
          >
          > I think it is mrtg.

          modern (for the last 5 years or so?) mrtg uses rrdtool as a backend, but mrtg isn't the only one; cacti, for example, uses it as well.

          Comments
          1. By Lennie (82.75.64.11)

            > > > > Just curious... those graphs were made by which software?
            > > >
            > > > rrdtool - that's what it says right next to the graphics themselves...
            > >
            > > I think it is mrtg.
            >
            > modern (for the last 5 years or so?) mrtg uses rrdtool as a backend, but mrtg isn't the only one; cacti, for example, uses it as well.

            Just remembered, it could also be symon, which produces rrd-files.

            Comments
            1. By sthen (85.158.45.32)

              > > > > > Just curious... those graphs were made by which software?
              > > > >
              > > > > rrdtool - that's what it says right next to the graphics themselves...
              > > >
              > > > I think it is mrtg.
              > >
              > > modern (for the last 5 years or so?) mrtg uses rrdtool as a backend, but mrtg isn't the only one; cacti, for example, uses it as well.
              >
              > Just remembered, it could also be symon, which produces rrd-files.

              The actual _graphs_ are made by rrdtool, under control of some other software. I think this may be Cacti but I'm not certain. There are a couple of options in and out of the ports tree.

              In my own testing I've been using symon to send stats out from my firewalls, symux to record them, and syweb to generate graphs (with a custom config showing pfsync traffic and traffic on the vlandev interface stacked up on the same graph).

              symon is nice as it's geared up to fast sampling (e.g. the default is one sample per 5 seconds) with fewer overheads than SNMP and can record mbuf use, information about the pf state table, sensors (temperature etc) and queues, which are at best fiddly to obtain via SNMP.

  3. By Anonymous Coward (67.69.227.99)

    Is it possible yet to do DHCP on a CARP interface, or is anyone working on this? I do something similar, but had to script it...

    Comments
    1. By sthen (85.158.45.32)

      > Is it possible yet to do DHCP on a CARP interface, or is anyone working on this? I do something similar, but had to script it...

      I used to script it, but then beck@ added a useful little sync protocol to dhcpd, so now I just run it on both firewalls all the time (on the interface that carp runs over, not on the carp interface itself).

      (hint: if you want to use the multicast version of the sync protocol, set MULTICAST_HOST=Yes in rc.conf, and if you don't want to reboot to activate this you can "route delete 224/4").

  4. By Anonymous Coward (93.36.116.131)

    So, will it work from 4.6?

    I tried it on 4.5, on both amd64 and i386, and I was not able to make it work.

    Comments
    1. By sthen (2a01:348:108:100:230:18ff:fea0:6af6)

      > So, will it work from 4.6?
      >
      > I tried it on 4.5, on both amd64 and i386, and I was not able to make it work.

      It already works in -current. Just set the "defer" flag on the pfsync interface.

    2. By David Gwynne (dlg)

      > So, will it work from 4.6?

      yes, but to make sure of that you should definitely try -current as sthen suggests.

      > I tried it on 4.5, on both amd64 and i386, and I was not able to make it work.

      one of my firewalls is i386 and the other is amd64, and they seem to talk fine to each other.

      Comments
      1. By Anonymous Coward (93.36.132.7)

        > > So, will it work from 4.6?
        >
        > yes, but to make sure of that you should definitely try -current as sthen suggests.

        I will try, even if I have to wait for -stable for the production environment.

        > > I tried it on 4.5, on both amd64 and i386, and I was not able to make it work.
        >
        > one of my firewalls is i386 and the other is amd64, and they seem to talk fine to each other.

        Yes, with 4.5 pfsync works perfectly, but if you try to set up an active/active configuration, the traffic flows through only one firewall.

        Jason Dixon opened a bug about that:

        http://cvs.openbsd.org/cgi-bin/query-pr-wrapper?full=yes&numbers=6084

        Comments
        1. By Anonymous Coward (2a01:348:108:100:230:18ff:fea0:6af6)

          > Yes, with 4.5 pfsync works perfectly, but if you try to set up an active/active configuration, the traffic flows through only one firewall.
          >
          > Jason Dixon opened a bug about that:
          >
          > http://cvs.openbsd.org/cgi-bin/query-pr-wrapper?full=yes&numbers=6084

          That's a different problem.

  5. By Jean Aumont (70.81.56.237) jaumont@mediagrif.com

    First, thanks for the work and effort.

    I would like to know how an ftp session would work with this active-active configuration.

    In a Master-Backup configuration, you have an ftp-proxy daemon running on both the Master and the Backup firewall. The ftp-proxy daemon dynamically adds rules to an anchor, and those rules let the ftp session pass through the firewall.

    In the Master-Backup configuration, suppose the ftp command mget is running and fetching 5 files. If the firewalls switch roles during the third file, the transfer of the third file will complete, but files 4 and 5 will never transfer. This is a bug that has never been fixed, and I am not sure whether it is caused by the state not being replicated properly by pfsync or by the dynamic rules not being set on the backup firewall.

    What will happen with the active-active configuration regarding ftp sessions and the ftp-proxy daemon?

    Thanks,

    Jean Aumont
