OpenBGPd and route filters

Contributed by phessler on 2015-11-06 from the thats-it? dept.

Many moons ago, OpenBGPd was extensively used throughout the networking world as a Route Server. However, over the years, many have stopped using it and have migrated away to other implementations. Recently, I have been getting more involved with the networking community, so I decided to ask "why?"

Almost exclusively, they told me "filter performance."

Many IXP operators made a point to tell me that they liked the syntax and use of OpenBGPd. They were very concerned that there was apparently only a single piece of software that was now being used for Route Servers, and even some of them used multiple versions of it to "satisfy" their desire for divergent implementations. (insert 'eek' here)

So, I decided to look into why the filters were so slow.

Turns out, we were being very naive with the filters and checking every filter against every rule, even when we couldn't match. This is very common in the first implementation of a filter and even showed up in the initial version of PF. Obviously, filter performance in PF is rather critical to having a performant firewall so I looked at its implemetation. Turns out, it was rather simple and straight-forward.

PF's filter performance is known as "skip steps", which exploit the fact that some rules will never match. As an example, if you write a rule like "allow from AS 65111", this can never match if a peer has AS 65333. In such a case, we need to skip all of those rules.

There are two parts to skip-steps. First is when you parse and load the filters, you need to calculate where to skip to, if a group doesn't apply. This happens at load time. The second part is when we are iterating through the filters, is to detect when we get to a rule that can't apply, and skip to the next group. That happens while applying the rules.

Andy from LONAP (the 15th largest IX in the world, according to wikipedia), gave me the last configuration they used with OpenBGPd. This had 314 peers configured, and about 35,000 prefix filters. For additional torture, I added the IRR ruleset for a friend's network, which brought the total prefix rules up to around 600,000 prefixes filters. To this configuration, I sent 65,000 routes.

In OpenBSD 5.8, processing these routes would take approx 35 minutes on my laptop. With the change I committed today (Nov 6), the same processing takes only 30 seconds.

reposted from http://bad.network/openbgpd-and-route-filters.html with permission

CVSROOT:	/cvs
Module name:	src
Changes by:	phessler@cvs.openbsd.org	2015/11/06 09:23:26

Modified files:
	usr.sbin/bgpd  : bgpd.h rde.c rde.h rde_filter.c 

Log message:
Radically improve the performance of bgpd filters.  Based on PF's skip
steps (and uses much of the same code).

In a torture test of ~600k prefix filters and 65k prefixes, convergance
time goes from 35 minutes to 30 seconds.

Many thanks to LONAP for providing a base configuration for torture
testing.

many discussions with claudio@, benno@, sthen@ and the rest of the bgpd crowd

OK sthen@ benno@

(Comments are closed)

Comments

By Anonymous Coward (179.219.207.238) on 2015-11-06 20:36

No words for it, absolutely amazing. Good job Peter!
By Salim Shaw (eshaw) salimshaw@vfemail.net on 2015-11-06 21:27

Amazing indeed and thank you developers for the awesome work.
By Anonymous Coward (81.82.229.158) on 2015-11-06 21:58

This is very cool. There is an Apricot 2013 presentation that points out exactly this as a the reason why Equinix was moving away. You might want to ask them if they are interested to help test this too (the presenter's email address is in the presentation).
Comments
1. By phessler (phessler) on 2015-11-06 22:50 http://www.openbsdfoundation.org/donations.html
  
  > This is very cool. There is an Apricot 2013 presentation that points out exactly this as a the reason why Equinix was moving away. You might want to ask them if they are interested to help test this too (the presenter's email address is in the presentation).
  
  Thanks! I'll have to look for this presentation and contact them.
2. By gregbo (198.144.201.12) gds@gds.best.vwh.net on 2016-08-16 01:38 http://gregbo.livejournal.com/
  
  Also see the Equinix presentation on route servers from NANOG 51.
By Stuart Henderson (82.68.199.130) on 2015-11-06 22:13 https://dmtx.uk/
"Turns out, it was rather simple and straight-forward." - Henning pointed out a few times that it was only ever meant to be a quick first implementation, to be replaced with something better! Such a nice idea to use skip-steps here, with hindsight it's blindingly obvious, but that's often the way.
Since skip-steps come from PF, we can look in that direction for guidance about performance. An old Undeadly article about PF rule optimization talks about this ("Ordering rulesets to maximize skip steps") and the principles are the same here. You want to group the rules together to reduce the number of evaluations done.
Take these rules, for example:
```
    allow quick from AS 112 prefix { 192.31.196.0/24, 192.175.48.0/24, 2001:4:112::/48, 2620:4f:8000::/48 }
    allow quick to AS 112 prefix { $MY_PREFIXES }

    allow quick from AS 2818 prefix { 212.58.224.0/19, 212.58.224.0/20, 212.58.240.0/20, [...] }
    allow quick to AS 2818 prefix { $MY_PREFIXES }

    allow quick from AS 8330 prefix { 199.212.92.0/23, 199.212.90.0/23, 199.80.128.0/17 [...] }
    allow quick to AS 8330 prefix { $MY_PREFIXES }
```
And here are the skip-step groupings used in the code:
```
    #define RDE_FILTER_SKIP_DIR            0
    #define RDE_FILTER_SKIP_GROUPID        1
    #define RDE_FILTER_SKIP_REMOTE_AS      2
    #define RDE_FILTER_SKIP_PEERID         3
```
You can probably make a good guess at the meanings. So the rules above are not optimal - a simple reordering as below would put all of the "from" rules together and allow half of the lines to be skipped in one go. (Fewer than half of the *rules*, because the prefix lists are expanded to multiple rules, but hey there needs to be some room for borrowing something else from PF for future improvement ;-)
```
    allow quick from AS 112 prefix { 192.31.196.0/24, 192.175.48.0/24, 2001:4:112::/48, 2620:4f:8000::/48 }
    allow quick from AS 2818 prefix { 212.58.224.0/19, 212.58.224.0/20, 212.58.240.0/20, [...] }
    allow quick from AS 8330 prefix { 199.212.92.0/23, 199.212.90.0/23, 199.80.128.0/17 [...] }

    allow quick to AS 112 prefix { $MY_PREFIXES }
    allow quick to AS 2818 prefix { $MY_PREFIXES }
    allow quick to AS 8330 prefix { $MY_PREFIXES }
```
Nowadays PF has a ruleset optimizer that is aware of skip-steps and will reorder rules to take advantage of them automatically. (In fact, in some cases with PF it can still be better to disable the optimizer and do this by hand; you might know something about your traffic that pfctl doesn't). In the case of IXP route-server BGP filters, they're likely to be much larger than any PF ruleset you're ever going to see, so a ruleset optimizer would be relatively slow and take a lot of memory. Fortunately with BGP filters, they're already going to be machine-generated from routing registries or local databases, so it's a lot easier to order them correctly from the start.
By Blake (2a01:e34:ec06:8f90:cabc:c8ff:fedb:4d83) on 2015-11-07 23:51 2112.net

Danke Schoen Peter! The result is impressive. You may wish to talk to Job Snijders of NTT & Cololue if you haven't already. He's developed a pile of very useful IRR filter validation tools & BGP feeds to test them on:
https://github.com/job
http://irrexplorer.nlnog.net/search/AS22512
Comments
1. By phessler (phessler) on 2015-11-09 11:01 http://www.openbsdfoundation.org/donations.html
  
  > Danke Schoen Peter! The result is impressive. You may wish to talk to Job Snijders of NTT & Cololue if you haven't already. He's developed a pile of very useful IRR filter validation tools & BGP feeds to test them on:
  > https://github.com/job
  > http://irrexplorer.nlnog.net/search/AS22512
  >
  >
  
  Thanks! Yes, I have been in contact with Job already, we talked about some features he would like to see in the future.
By Ilyas Bakirov (82.200.241.50) on 2015-11-09 09:48

As far as I know, there is no OpenBGPD portable version :(

Would be nice if all latest updates of OpenBSD's OpenBGPD got by others.
Comments
1. By phessler (phessler) on 2015-11-09 10:38 http://www.openbsdfoundation.org/donations.html
  
  > As far as I know, there is no OpenBGPD portable version :(
  >
  > Would be nice if all latest updates of OpenBSD's OpenBGPD got by others.
  
  The last release of OpenBGPd portable was done from 4.6, 6 years ago. I plan on updating that in the near-future, but it will take some time.
  Comments
  1. By Ilyas Bakirov (147.30.27.64) on 2015-11-09 12:37
    
    > The last release of OpenBGPd portable was done from 4.6, 6 years ago. >I plan on updating that in the near-future, but it will take some time.
    Yeah, I know.
    
    Good news. Thanks. Public code like GitHub are welcome for -p(portable) version and porting will faster by many contributions and donations...

Latest Articles

Sat, Jun 28
- 05:57 Game of Trees 0.115 released (0)
Tue, Jun 24
- 07:48 Game of Trees 0.114 released (0)
- 07:23 Call for testing: bge/bnx/iavf/igc/ix/ixl/ngbe/pcn: ifq_restart() fix (0)
Mon, Jun 16
- 08:22 j2k25 hackathon report from kn@: installer, low battery, and more (0)
Fri, Jun 13
- 11:18 dhcpd(8): use UDP sockets instead of BPF (1)
Thu, Jun 12
- 12:32 clang(1)/llvm/lld(1) updated to version 19 (0)
Wed, Jun 11
- 12:22 Source code sandboxing (0)
Tue, Jun 10
- 06:50 TearFree option backported to modesetting(4) driver (0)
Mon, Jun 09
- 07:32 FFS optimizations with dirhash, as blogged by rsadowski@ (1)

Credits

Copyright © 2004-2008 Daniel Hartmeier. All rights reserved. Articles and comments are copyright their respective authors, submission implies license to publish on this web site. Contents of the archive prior to April 2nd 2004 as well as images and HTML templates were copied from the fabulous original deadly.org with Jose's and Jim's kind permission. This journal runs as CGI with httpd(8) on OpenBSD, the source code is BSD licensed. undeadly \Un*dead"ly\, a. Not subject to death; immortal. [Obs.]