OpenBSD Journal

Update to: Henning Brauer at NANOG 36 - OMG how did he get in!

Contributed by Jason Houx on from the bikini-party dept.

As posted earlier on undeadly, Henning Brauer was recently at NANOG 36 and spoke about OpenBGPD. The Real Media video is now available. Being able to watch this really made my day. Thanks, Henning and NANOG!


Comments
  1. By Nate (65.94.100.49) on

    A video is nice and all, but real media? I mean, ew man, ew.

    Comments
    1. By Anonymous Coward (196.192.96.28) on

      Is it just me, or is the framerate of the video just very bad?

      Comments
      1. By Anonymous Coward (199.212.90.20) on

        Dude, it's NANOG. Of course the video is bad.

        Comments
        1. By Anonymous Coward (196.192.96.28) on

          Hey Dude. Thanks

        2. By Anonymous Coward (71.243.45.163) on

          The audio is also kind of bad. Somehow, an abnormal amount of breathing was captured in the audio.

      2. By petard (66.93.101.100) on

        The framerate isn't actually that bad... I think the server's a bit slow though. Just use mplayer to dump the stream, give that a little bit to build up a good cache, then play the dump.

        Comments
        1. By Anonymous Coward (196.192.96.28) on

          Good advice. Thank you

      3. By Anonymous Coward (206.132.94.6) on

        No, the video quality is truly awful.

        The choice of Real Media is also boggling. But more than anything, better quality video would be nice. A group of network professionals should be able to see that, with Internet speeds increasing for the average user, you can make the recording larger and it will still be accessible to the majority of users.

  2. By Anonymous Coward (205.239.196.6) on

    Is it just the firewall here, or is the file password protected? I'm being asked for authentication info.

  3. By Anonymous Coward (134.58.253.131) on

    How does one play this video on OpenBSD/i386? Or do I need to go and hunt for a Windows machine? Mplayer (with win32codecs) doesn't seem to work.

    Comments
    1. By nathan (69.69.143.46) nathan@brainwerk.org on

      works fine for me... what are you trying to play? mplayer doesn't seem to play .ram "playlists" on its own, you might have to give it the path to the actual video file:

      mplayer `wget -O - http://www.nanog.org/mtg-0602/real/openbgpd.ram`

      I would recommend doing what petard (i believe that's who it was) said to do and dump it before attempting to play...

      mplayer -dumpstream -dumpfile openbgpd.rm `wget -O - http://www.nanog.org/mtg-0602/real/openbgpd.ram`

      then playing the rm...

      i guess you could also use -playlist instead of wget...

      mplayer -playlist http://www.nanog.org/mtg-0602/real/openbgpd.ram

      Comments
      1. By David T. H. (67.35.151.49) on

        I had the same problem. If you do a "pkg_info mplayer", it should tell you to make certain you have sysctl machdep.userldt=1 set. This setting can be found in /etc/sysctl.conf.

        A copy of the pkg_info mplayer output for 3.8 can be found here:
        http://www.openbsd.org/3.8_packages/i386/mplayer-1.0pre7p5-no_x11.tgz-long.html

        Information on why this is probably not 1 by default:
        http://www.securityfocus.com/bid/2739/discuss

        Good luck!

  4. By Jamyn (24.27.87.53) anonymous@wiretapped.us on http://www.wiretapped.us/

    This is a rough transcript of Henning's speech. It is not word for word, but I think it is pretty close. Sometimes his accent is a little heavy, so I apologize if I mistyped anything. My copy of the Real Media stream had horrible picture quality, so sadly, I really cannot use that as a reference.

    By the way - does anyone else find it ironic that a presentation by an OpenBSD and OpenBGP developer was distributed using the proprietary Real Media format? heheheeh. Anyway on to Henning's excellent presentation.

    Note that I have only copied the first 18 minutes from the 35 minute presentation. I need a break. :)

    Also note, this is cut into two comments, as it won't all fit into one.

    BGP: Why another implementation?
    ================================
    I started OpenBGP two years ago, after getting completely fed up with Zebra, which we were running before; there were lots of bugs, bad configuration language, and since I don't speak Japanese, I had problems understanding the documentation.

    Zebra makes heavy use of cooperative threads, which leads to its main problem: combined with the central event queues, Zebra can lose sessions while busy. This is because the keepalive events are way down in the queue, so if something else simultaneously consumes all the CPU power, Zebra just doesn't process the keepalives until the peer resets the session.
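
    The starvation described above can be illustrated with a toy simulation. This is only a sketch, not Zebra's code: the event names, the costs, and the 90 second hold time are assumptions chosen for illustration.

```python
from collections import deque

HOLD_TIME = 90  # assumed: seconds a peer waits for a keepalive before resetting


def run_single_queue(events):
    """Process (name, cost_in_seconds) events strictly in FIFO order.

    Returns the simulated time at which each event finished, the way a
    single cooperative loop with one central queue would behave.
    """
    clock = 0
    finished = {}
    queue = deque(events)
    while queue:
        name, cost = queue.popleft()
        clock += cost
        finished[name] = clock
    return finished


# A long-running job sits in front of the keepalive in the same queue.
events = [("process_full_table", 120), ("send_keepalive", 0)]
done = run_single_queue(events)
session_lost = done["send_keepalive"] > HOLD_TIME
print(done, session_lost)
```

    The keepalive only goes out once the expensive job has finished, which in this made-up case is after the peer has already given up on the session.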

    Zebra's successor, Quagga, caught up and apparently fixed many of the bugs - but they still use Zebra's design, which I think is wrong. So, they are kind of unfixable.

    Designing our BGP Daemon:
    =========================
    Turning a generic Unix machine into a BGP router requires way more than just adding a userland BGP-speaking process. We want 3 processes: a session engine (SE) that just manages BGP sessions, a route decision engine (RDE) that holds the BGP tables and makes routing decisions for best path selection, and a parent process that enters routes into the kernel and starts the SE and RDE.

    BGPd Session Engine:
    ====================
    The BGPd session engine maintains tcp sessions to BGP neighbors and control sessions to the bgpctl utility. Once a session is established, it's responsible for sending the keepalives out, and processing the keepalives from neighbors, but does not deal with routes at all. Update messages are passed on to the RDE. It's very lightweight - typically under 1MB of RAM on i386. Sometimes you'll see it getting bigger, but it only does that when it has to buffer a lot of events for a slow neighbor. It runs as an unprivileged user, and chroots to /var/empty, which is empty except for logging sockets.

    Route Decision Engine
    =====================
    The Route Decision Engine maintains the routing information base (RIB). The BGP filters run there. It calculates the best possible path per prefix, and generates the UPDATE messages as needed.

    The routing information base (RIB) itself is split into many tables that are heavily crosslinked. The goal was to avoid table walks. In fact, we almost never need table walks - we need table walks of course when a new peer comes up (where we need to send them the entire table), and when the BGP control utility requires display of the entire table. Otherwise, there are no table walks.

    It's very memory efficient - these numbers are from before we had soft reconfig enabled, but it didn't get that much worse, and you can still turn soft reconfig off. With that, one full view needs around 20MB of RAM on i386, and two full views just need 25MB of RAM. This is not even half of what you need with other implementations. It's very fast - it takes about 10 seconds to load a full view on a 1GHz P3, and less than 5 seconds to dump a full view to your peer. Just like the session engine, it runs as an unprivileged user, and chroots to /var/empty as well.

    The BGPd Decision Process
    =========================
    1) Check if the prefix is reachable at all.
    2) Check the local preference (bigger is better).
    3) Check the AS path length (shorter is better).
    4) Check origin.
    5) [SOMETHING] exit discriminator (which is only comparable between the same neighboring AS, but we can always compare if you need that)
    6) BGPd is cooler and is one of our extensions
    7) Weight - which is used to force traffic to your preferred uplink (a cheaper or faster one; whatever)
    8) Route [SOMETHING] This is disabled by default. This means older routes get preference

    The last two are here to make sure there's always a winner:

    9) lowest BGP id
    10) lowest peer address


    Weight can help a lot. More and more often, you'll see equally long AS paths from your uplinks, because they're at the same exchange points. For traffic engineering, we want the possibility to express a preference, and that isn't going to happen in localpref - because localpref comes before the AS path length. So, we added weight. We know it somewhat clashes with weight implementations by others, but unfortunately we asked others to help us come up with a better keyword and nobody came up with anything - so that's what it is now.
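
    To make the ordering concrete, here is a minimal sketch of a few of the steps above as a sort key. It is not bgpd's actual code. It covers only reachability, local preference, AS path length, weight, BGP id and peer address; the origin check, the exit discriminator, step 6 and the route step 8 are left out because the transcript does not pin them down. The `Route` fields are hypothetical, and "bigger weight wins" is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Route:
    name: str
    reachable: bool
    local_pref: int
    as_path_len: int
    weight: int      # assumed: bigger is better
    bgp_id: int
    peer_addr: int   # peer address as an integer, for easy comparison


def preference_key(route):
    """Smaller key wins; fields where bigger is better are negated."""
    return (
        -route.local_pref,
        route.as_path_len,
        -route.weight,
        route.bgp_id,
        route.peer_addr,
    )


def best_route(routes):
    """Return the preferred reachable route, or None if none is reachable."""
    candidates = [r for r in routes if r.reachable]
    return min(candidates, key=preference_key, default=None)


uplink_a = Route("uplink_a", True, 100, 3, 0, bgp_id=1, peer_addr=1)
uplink_b = Route("uplink_b", True, 100, 3, 10, bgp_id=2, peer_addr=2)
shorter = Route("shorter", True, 100, 2, 0, bgp_id=3, peer_addr=3)

print(best_route([uplink_a, uplink_b]).name)
print(best_route([uplink_a, uplink_b, shorter]).name)
```

    With equal local preference and equally long AS paths, the weight decides in favour of uplink_b. As soon as a route with a shorter AS path shows up, weight no longer matters, because the AS path length is compared first.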

    BGPD: Parent process, kernel interface
    ======================================
    The parent process is responsible for getting the routes into the kernel. It does nexthop validation, and maintains its own copy of the kernel routing table. For that, it has to fetch the kernel routing table and the interface list at startup. It listens to the routing socket, where all changes to the kernel routing table show up as messages. It keeps the internal view in sync. The way we coded it, it does notice if you manually fill the routing table, and we cope with it instead of overwriting your manual changes.

    We have an internal list of interfaces and their status, and that is kept in sync with the kernel as well. We do know about the interface link status - it's often said to be almost impossible in Unix to get a link state - actually, it's not that hard. We use that for next hop verification. An interface that doesn't have a cable plugged in probably doesn't lead to a usable next hop. Yes, that means we notice when you quickly pull the cable, and we invalidate the next hop.
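
    The idea of tying next hop validity to link state can be sketched as follows. This is an illustration only, not how bgpd stores things: the interface names, the dictionary layout and the documentation-range addresses are all made up.

```python
import ipaddress


def valid_nexthops(nexthops, interfaces):
    """Return the next hops that sit on a connected network whose link is up.

    interfaces maps a (hypothetical) interface name to a tuple of
    (connected network, link_up flag).
    """
    usable = []
    for nexthop in nexthops:
        addr = ipaddress.ip_address(nexthop)
        for network, link_up in interfaces.values():
            if link_up and addr in ipaddress.ip_network(network):
                usable.append(nexthop)
                break
    return usable


interfaces = {
    "em0": ("192.0.2.0/24", True),
    "em1": ("198.51.100.0/24", False),  # cable unplugged
}
nexthops = ["192.0.2.1", "198.51.100.1"]
before = valid_nexthops(nexthops, interfaces)

# Pull the cable on em0 as well: its next hop is invalidated right away.
interfaces["em0"] = ("192.0.2.0/24", False)
after = valid_nexthops(nexthops, interfaces)
print(before, after)
```

    Because the check is driven by the interface status rather than by a periodic walk, the next hop drops out as soon as the link state changes.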

    We do not need periodic next-hop table walks at all, like "a big vendor" and Zebra do. This means we react much faster to interface state changes - there's an up to 30 second delay in Cisco routers and Zebra installations.

    The internal view of the routing table can be coupled and decoupled from the kernel. This originally was a debugging feature, because we had a problem with one of my test machines that didn't have enough memory, but it turned out to be very useful. It's really fast - with a full table, it takes less than 3 seconds on a 750MHz P3. The parent process needs about 6 to 7MB total, with full view configurations.


    TCP MD5 Signatures
    ==================
    TCP sessions are [typically?] unauthenticated as we know, so we implemented TCP MD5 signatures as a security association in the IPSEC framework. They are really just a special form of an IPSEC authentication header. This means I had to code a pfkey interface in BGPd, which was not really fun, to interact with the IPSEC framework.

    TCP MD5 signatures are not a new attack vector. There are people spreading that, but it's pretty much FUD. Of course, by the time you hit the TCP MD5 code, you already have to (correctly) hit the sequence number, the port number, the right addresses - the chance to hit all that is pretty low, and even then - MD5 is really cheap. The conclusion from that is, it's kind of weak - but it's extremely easy to configure, and it works with almost everything out there, so why not go ahead and use it.

    Comments
    1. By Anonymous Coward (24.27.87.53) on

      (Part 2 of 2)

      IPSEC Integration
      =================
      Since we have the pfkey interface already, it was not too hard to do real IPSEC. BGPd loads the security associations into the kernel, and sets up the flows (routes) for IPSEC. Juniper can do static keyed IPSEC as well, and we're compatible with that. As far as I know, Cisco cannot - there might be some expert featureset that you can pay extra for, but I don't know.

      Instead of doing static keying, we can also use isakmpd to do the keying for us - which also means the keys are changed on a regular basis. BGPd asks the kernel for an unused pair of SPIs (identifiers), and uses them. BGPd loads the flows into the kernel - it's usually done by isakmpd, but in this case, it's done by BGPd, because BGPd already knows the endpoints. That means that isakmpd only needs to handle the keying - isakmpd needs very little configuration. Everyone who's ever had to write an isakmpd configuration file will value that. Here's a complete howto:


      1) copy the keyfiles, which are generated during the first boot of OpenBSD, over to your peer.
      2) Run isakmpd -Ka
      3) done. :)

      PF Integration
      ==============
      The BGP protocol is an efficient way to distribute lists of network prefixes - it doesn't necessarily need to be routes. BGPd can add prefixes learned from its neighbors into a pf table. The prefixes to add to the table are selected using the filter language. The tables in PF use a radix tree - which is the same code used by the kernel routing tables; it's very fast, even with a lot of entries. In turn, PF tables can be used for pretty much everything. You can do packet filtering based on that: you can redirect packets to, for example, a userland spam daemon -- which in turn means you're using BGP distributed spam blacklists instead of using the stupid DNS-based approach. Or, you can do QoS processing.
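
      A rough sketch of the "table of prefixes" idea: prefixes come in from BGP, and a packet's source address is checked against the table. The `PrefixTable` class is hypothetical, and it uses a plain linear scan for brevity where PF uses a radix tree, so it shows the behaviour but not the performance.

```python
import ipaddress


class PrefixTable:
    """Toy stand-in for a table of prefixes (linear scan, not a radix tree)."""

    def __init__(self):
        self.prefixes = set()

    def add(self, prefix):
        self.prefixes.add(ipaddress.ip_network(prefix))

    def delete(self, prefix):
        self.prefixes.discard(ipaddress.ip_network(prefix))

    def matches(self, address):
        addr = ipaddress.ip_address(address)
        return any(addr in net for net in self.prefixes)


# A prefix announced by a neighbor ends up in the table...
spammers = PrefixTable()
spammers.add("203.0.113.0/24")
hit = spammers.matches("203.0.113.25")
miss = spammers.matches("192.0.2.1")

# ...and disappears again when the announcement is withdrawn.
spammers.delete("203.0.113.0/24")
after_withdraw = spammers.matches("203.0.113.25")
print(hit, miss, after_withdraw)
```

      A filter rule would then act on the result of the lookup, for example by redirecting matching connections to a spam daemon.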

      Route Labels
      ============
      BGPd can attach labels to routes. Labels are basically 32 bytes of freetext information that can be attached to the route and stored with the route in the kernel table. Well, they're not stored directly, but who cares about the implementation details :P PF can then filter based on those labels, and you can then write rules to classify traffic for QoS. For example, you can pick all routes labeled as MCI and apply QoS - you can tell your customers that MCI is always very slow, and forget to mention that you play a part in that. :)

      Combining BGP information with pf capabilities is really very powerful. You can limit states per source address, depending on the source AS. Let's say you have your broadband ISP; you know where the hackers are, and can limit DDoS effects by limiting connections per IP address to 10, or something like that. You can also use the maximum source connection rate features in PF to fight off DDoS, and filter based on origin AS numbers.

      Integration with CARP
      =====================
      CARP is the Common Address Redundancy Protocol. This allows you to share an IP address in a master/backup scenario. It's much like VRRP, but unencumbered by patents. It's actually better because it's properly authenticated and faster. A typical case is exchange points - you get one IP address in the exchange point network - what about using two boxes there, and having the IP address shared using CARP? It works without special support from BGPd, but yeah - we can do better.

      If we make BGPd aware of the CARP master/backup state, we can control the sessions that depend on the CARP interface. We force them into state idle, so they don't even try to connect as long as we are not master. The moment we become master, all sessions depending on the CARP interface immediately try to connect to the neighbor, which in turn leads to way faster failover.
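
      The gating described above could look roughly like this. It is a sketch under assumptions, not bgpd's implementation: the `Session` class, the `depends_on` field and the `carp_state_changed` callback are hypothetical names, and only the state names Idle, Connect and Established are borrowed from BGP.

```python
class Session:
    def __init__(self, neighbor, depends_on=None, state="Idle"):
        self.neighbor = neighbor
        self.depends_on = depends_on  # interface this session is tied to
        self.state = state


def carp_state_changed(sessions, interface, is_master):
    """Hold dependent sessions in Idle while backup, connect when master."""
    for session in sessions:
        if session.depends_on != interface:
            continue  # sessions not tied to this interface are left alone
        session.state = "Connect" if is_master else "Idle"


sessions = [
    Session("192.0.2.1", depends_on="carp0"),
    Session("198.51.100.1", state="Established"),
]

carp_state_changed(sessions, "carp0", is_master=True)
states_as_master = [s.state for s in sessions]

carp_state_changed(sessions, "carp0", is_master=False)
states_as_backup = [s.state for s in sessions]
print(states_as_master, states_as_backup)
```

      The session tied to carp0 starts connecting the moment the box becomes master and is parked in Idle again when it falls back to backup, while the unrelated session is never touched.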

      IPv6 support has been implemented since WTH 2005. Almost everything "just works" like IPv4. Lots of testing needed, but so far it doesn't look bad.

      The config file is split into 5 sections. (1) macro definitions, just like in pf (no surprise - same code) (2) global settings, (3) neighbors to announce, (4) neighbor definitions, and (5) filters.

      (Here, ~ 18 minutes into the presentation, he reviews the configuration files. I cannot read them; the video from the real media stream is very poor. sorry.).

      There is ~ 16 minutes or so of presentation beyond this point.




      Comments
      1. By Anonymous Coward (64.229.134.21) on

        Wow, thanks for writing this all up!!! RealMedia sucks, why couldn't they just use mpg or avi instead? (Rhetorical question)

    2. By Anonymous Coward (64.229.134.21) on

      So, we added weight. We know it somewhat clashes with weight implementations by others, but unfortunately we asked others to help us come up with a better keyword and nobody came up with anything - so that's what it is now.

      Just a thought, but what about 'cost' like a 'cost metric'?

      Comments
      1. By Anonymous Coward (206.132.94.6) on

        Because cost is already used to refer to the overall "cost" of the route or AS path anyway. It would be confusing to try to use that as a term for an attribute due to double meaning.

    3. By Leen Besselink (82.75.30.141) on

      There are of course also the slides here:

      http://unduli.bsws.de/papers/nanog36/index.html

      They are much easier to read than the realvid.

  5. By Anonymous Coward (216.220.225.229) on

    I can't seem to get to the video. Are too many people trying to get to it? Is this a sign that OpenBSD is getting popular?

  6. By Anonymous Coward (134.58.253.131) on

    Actually the video is just showing the slides, with the audio of course. The slides are here too, but in much better quality so that you can actually read them:

    http://unduli.bsws.de/papers/nanog36/

Credits

Copyright © - Daniel Hartmeier. All rights reserved. Articles and comments are copyright their respective authors, submission implies license to publish on this web site. Contents of the archive prior to as well as images and HTML templates were copied from the fabulous original deadly.org with Jose's and Jim's kind permission. This journal runs as CGI with httpd(8) on OpenBSD, the source code is BSD licensed. undeadly \Un*dead"ly\, a. Not subject to death; immortal. [Obs.]