OpenBSD Journal

ospf6d -- Going Crazy with IPv6

Contributed by weerd from the routing-with-larger-addresses dept.

Claudio Jeker (claudio@), our favourite network hacker from Zürich, Switzerland, writes in with a story about his work on ospf6d:

A few days ago I decided it was time to enable ldpd(8) and ospf6d(8) in the builds since without additional attention they will never get finished. ldpd still needs a kernel with option MPLS enabled to be usable but this does not really matter here. This is about ospf6d and what drives me crazy about IPv6.

Check out the rest of Claudio's adventures in IPv6 land below.

ospf6d is the IPv6 counterpart of ospfd(8). As with the move from IPv4 to IPv6 itself, it is not just a change of addressing bits but a more or less new protocol that just shares a lot of code with ospfd. The OSPF Link State Database was extended with additional types and a lot more abstraction was added to the protocol. It seems to be common that everything that involves IPv6 has to be more abstracted and therefore an order of magnitude more complex. Some time ago Stefan Sperling did an awesome job implementing most of the IPv6-specific LSDB changes needed. So in the end basic SPF calculation was already working; only the nexthops caused some issues. This is where I jumped in right after enabling the daemon. I built a very simple test setup and tried to figure out why it did not work.

First of all I had to fix some bugs that came from copying ospfd code over. The initial routing table sync for the FIB was requesting AF_INET instead of AF_INET6 routes -- no wonder the FIB was empty all the time. Now the FIB started to look correct, but the routes calculated by the RDE were not present in the FIB.

So I looked at the "ospf6ctl show rib" output to figure out why:

# ospf6ctl show rib
Destination          Nexthop           Path Type    Type      Cost    Uptime
2001:4bf8:c0de:66::2   ::                Intra-Area   Router    10      00:00:03
2001:4bf8:c0de:1000::/64 ::                Intra-Area   Network   10      00:00:08

There was no nexthop information and in the log there was a note that no interface could be found for ifindex 4. Since this error was in calc_nexthop_lladdr() I knew where to start looking. After some head scratching it was clear that there was some confusion with the various interface ids passed around and fixing that was not too hard.
But still no luck: now the RIB looked correct, but the kernel was unhappy.

# ospf6ctl show rib
Destination          Nexthop           Path Type    Type      Cost    Uptime
2001:4bf8:c0de:66::2 fe80::20c:30ff:fe10:d6e0 Intra-Area   Router    10      00:00:07
2001:4bf8:c0de:1000::/64 ::                Intra-Area   Network   10      00:00:12

The log claimed:

send_rtmsg: action 1, prefix 2001:4bf8:c0de:66::2/128: Network is unreachable

and the FIB had this bit:

*O     ::ffff:0.0.0.0/0     ::1
*O     2001:4bf8:c0de:66::2/0 fe80::20c:30ff:fe10:d6e0
*O     2001:4bf8:c0de:1000::/0
*O     2002::/0             ::1

It looked fine but it wasn't. That link local address is missing the typical %sis0 that all these IPs have. Ugh. The scope was missing and since the IPv6 addresses stored in a struct in6_addr have no scope it was necessary to track another number -- the interface id -- outside of the address. Now struct sockaddr_in6 got this sin6_scope_id field added which is most probably the cause of the insane non-power of 2 size of that struct but that's another story. So after passing additional scope ids around it looked good but the error was still there:
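To make the scope problem concrete, here is a minimal sketch of how a link-local nexthop is normally described with a struct sockaddr_in6: the 16 address bytes go into sin6_addr and the interface index goes into sin6_scope_id next to it. The function name make_scoped_nexthop is made up for this illustration and is not taken from ospf6d, and the sketch leaves out the BSD-only sin6_len field to stay portable.

```c
#include <assert.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: describe a link-local nexthop the documented way.
 * The address text is parsed into sin6_addr and the interface index is
 * stored separately in sin6_scope_id.
 * Returns 0 on success, -1 if the text is not a valid IPv6 address.
 * (On the BSDs one would also set sin6_len; omitted here for portability.)
 */
int
make_scoped_nexthop(const char *text, unsigned int ifindex,
    struct sockaddr_in6 *sin6)
{
	memset(sin6, 0, sizeof(*sin6));
	sin6->sin6_family = AF_INET6;
	if (inet_pton(AF_INET6, text, &sin6->sin6_addr) != 1)
		return -1;

	/* struct in6_addr is only the 16 address bytes, no room for a scope */
	assert(sizeof(sin6->sin6_addr) == 16);

	/* so the interface index has to travel next to the address */
	sin6->sin6_scope_id = ifindex;
	return 0;
}
```

Called with "fe80::20c:30ff:fe10:d6e0" and an index of 1, this leaves bytes 2 and 3 of the address at zero and stores the 1 in sin6_scope_id.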

# ospf6ctl show rib
Destination          Nexthop           Path Type    Type      Cost    Uptime
2001:4bf8:c0de:66::2/128 fe80::20c:30ff:fe10:d6e0%sis0 Intra-Area   Network   10      00:00:07
2001:4bf8:c0de:1000::/64 ::                Intra-Area   Network   10      00:00:12

From the FIB:

*S     ::ffff:0.0.0.0/96    ::1
*O     2001:4bf8:c0de:66::2/128 fe80::20c:30ff:fe10:d6e0%sis0
*C     2001:4bf8:c0de:1000::/64 link#1
*S     2002::/24            ::1

but still

send_rtmsg: action 1, prefix 2001:4bf8:c0de:66::2/128: Network is unreachable

So let's look at the route message with route monitor:

RTM_ADD: Add Route: len 172, priority 32, table 0, pid: 18841, seq 1, errno 51
flags:
use:        0   mtu:        0    expire:        0
locks:  inits:
sockaddrs: 
 2001:4bf8:c0de:66::2 fe80::20c:30ff:fe10:d6e0%sis0 ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff

Looks good. Some flags are missing compared with adding the route by hand, but that's a quick fix:

route -n add -inet6 2001:4bf8:c0de:66::2 -prefixlen 128 fe80::20c:30ff:fe10:d6e0%sis0

RTM_ADD: Add Route: len 168, priority 8, table 0, pid: 11564, seq 1, errno 0
flags:
use:        0   mtu:        0    expire:        0
locks:  inits:
sockaddrs: 
 2001:4bf8:c0de:66::2 fe80::20c:30ff:fe10:d6e0%sis0 ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff

But even after fixing the flags, no luck. What the hell was going on? Maybe route -v monitor was not printing something, so let's ktrace it and look at the raw messages in the hexdump output.

So here is ospf6d's message:

 16955 route    GIO   fd 3 read 172 bytes
   0000:  ac 00 04 02 58 00 00 00 00 00 20 00 07 00 00 00  ....X..... .....
   0010:  07 40 00 00 00 00 00 00 b2 3f 00 00 02 00 00 00  .@.......?......
   0020:  03 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   0030:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   0040:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   0050:  00 00 00 00 00 00 00 00 1c 18 00 00 00 00 00 00  ................
   0060:  20 01 4b f8 c0 de 00 66 00 00 00 00 00 00 00 02   .K....f........
   0070:  00 00 00 00 1c 18 00 00 00 00 00 00 fe 80 00 00  ................
   0080:  00 00 00 00 02 0c 30 ff fe 10 d6 e0 01 00 00 00  ......0.........
   0090:  1c 18 00 00 00 00 00 00 ff ff ff ff ff ff ff ff  ................
   00a0:  ff ff ff ff ff ff ff ff 00 00 00 00              ............

and the one from route(8):

 23771 route    GIO   fd 3 read 168 bytes
   0000:  a8 00 04 01 58 00 01 00 00 00 08 00 07 00 00 00  ....X...........
   0010:  47 08 00 00 00 00 00 00 ad 4d 00 00 01 00 00 00  G........M......
   0020:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   0030:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   0040:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   0050:  00 00 00 00 00 00 00 00 1c 18 00 00 00 00 00 00  ................
   0060:  20 01 4b f8 c0 de 00 66 00 00 00 00 00 00 00 02   .K....f........
   0070:  00 00 00 00 1c 18 00 00 00 00 00 00 fe 80 00 01  ................
   0080:  00 00 00 00 02 0c 30 ff fe 10 d6 e0 00 00 00 00  ......0.........
   0090:  18 18 00 00 00 00 00 00 ff ff ff ff ff ff ff ff  ................
   00a0:  ff ff ff ff ff ff ff ff                          ........

A route message is a header -- which does not matter here -- followed by a series of struct sockaddrs of varying length. The additional zero bytes at the end of the first message were not a problem since the size field was correct, but further up was something strange:

1c 18 00 00 00 00 00 00 fe 80 00 00 00 00 00 00 02 0c 30 ff fe 10 d6 e0 01 00 00 00

that's the nexthop struct sockaddr_in6 with the link local address fe80::20c:30ff:fe10:d6e0 and a scope_id of 1 but the one from route(8) is different:

1c 18 00 00 00 00 00 00 fe 80 00 01 00 00 00 00 02 0c 30 ff fe 10 d6 e0 00 00 00 00

There is no scope_id set at the end of the address but instead it is fiddled into the address itself. But that's illegal, I'm not allowed to do that and normally the kernel complains loudly about that.
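The difference between the two nexthops can be pulled apart with a few lines of C. This is only a sketch working on the raw bytes quoted above. It assumes the BSD layout of struct sockaddr_in6 (length, family, port, flowinfo, 16 address bytes, scope id) and a little-endian host, and the names nexthop_info and decode_nexthop are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The two 28-byte nexthop sockaddrs, copied from the ktrace dumps. */
const uint8_t nexthop_ospf6d[28] = {
	0x1c, 0x18, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
	0xfe, 0x80, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
	0x02, 0x0c, 0x30, 0xff, 0xfe, 0x10, 0xd6, 0xe0,
	0x01, 0x00, 0x00, 0x00
};
const uint8_t nexthop_route[28] = {
	0x1c, 0x18, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
	0xfe, 0x80, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00,
	0x02, 0x0c, 0x30, 0xff, 0xfe, 0x10, 0xd6, 0xe0,
	0x00, 0x00, 0x00, 0x00
};

/* Hypothetical summary of the fields that matter here. */
struct nexthop_info {
	uint8_t  len;       /* first byte: sockaddr length (0x1c = 28) */
	uint8_t  family;    /* second byte: address family (0x18 in the dumps) */
	uint16_t embedded;  /* bytes 2-3 of the address, network byte order */
	uint32_t scope_id;  /* trailing 4 bytes, little-endian assumed */
};

/* Decode one 28-byte sockaddr; returns 0 on success, -1 on bad input. */
int
decode_nexthop(const uint8_t *buf, size_t buflen, struct nexthop_info *out)
{
	if (buflen < 28 || buf[0] != 28)
		return -1;
	out->len = buf[0];
	out->family = buf[1];
	/* the address starts at offset 8, so its bytes 2-3 are at 10-11 */
	out->embedded = (uint16_t)((buf[10] << 8) | buf[11]);
	/* the scope id occupies offsets 24-27 */
	out->scope_id = (uint32_t)buf[24] | (uint32_t)buf[25] << 8 |
	    (uint32_t)buf[26] << 16 | (uint32_t)buf[27] << 24;
	return 0;
}
```

Fed the ospf6d bytes, this reports an embedded index of 0 and a scope id of 1; fed the route(8) bytes, it reports an embedded index of 1 and a scope id of 0.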

So instead of doing it the way it is intended, I used what some people know as the KAME hack and fumbled the interface index into the struct in6_addr, and now it is working.
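The trick boils down to something like the following sketch. It is not the code that went into ospf6d: embed_scope_sketch is a hypothetical name, and it only looks at link-local unicast addresses, whereas a real implementation may need to cover other scoped addresses too. It moves the interface index from sin6_scope_id into bytes 2 and 3 of the address and clears the scope id, which matches the byte pattern route(8) sent.

```c
#include <assert.h>
#include <netinet/in.h>

/*
 * Hypothetical illustration of the KAME-style embedding:
 * for a link-local address, store the interface index in the second
 * 16-bit word of the address (network byte order) and zero sin6_scope_id.
 * Other addresses are left untouched.
 */
void
embed_scope_sketch(struct sockaddr_in6 *sin6)
{
	if (!IN6_IS_ADDR_LINKLOCAL(&sin6->sin6_addr))
		return;

	/* only 16 bits are available inside the address */
	assert(sin6->sin6_scope_id <= 0xffff);

	sin6->sin6_addr.s6_addr[2] = (sin6->sin6_scope_id >> 8) & 0xff;
	sin6->sin6_addr.s6_addr[3] = sin6->sin6_scope_id & 0xff;
	sin6->sin6_scope_id = 0;
}
```

For fe80::20c:30ff:fe10:d6e0 with a scope id of 1, the address bytes start fe 80 00 01 afterwards and the scope id is 0.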

The invention of link local addresses was one of the worst ideas ever. It should have been obvious at that moment that scope ids were needed, but for some reason nobody in the committee was willing to stand up and stop the crap. For me it is obvious that link local addressing causes more issues than the one it tried to solve in the first place.

We got screwed by committee and all I got is this rant.
--
:wq Claudio

PS: ospf6d is now working for simple stuff but there are still some open issues. If you would like to test, do so, but note there are no type 5 AS-ext LSAs at the moment and there are many minor things that need to be fixed.

Thanks, Claudio, for an insightful and interesting read. If you're in a position to test ospf6d, please take the latest snapshots out for a spin and give Claudio some feedback.



Comments
  1. By Denis (Denis) openbsd@ledeuns.net on

    Thank you Claudio for that very interesting reading :)

  2. By Mathieu Goessens (geb) gebura@poolp.org on http://gebura.eu.org

    Links to ldpd.conf(5) and ldpctl(8) on http://www.openbsd.org/cgi-bin/man.cgi?query=ldpd&apropos=0&sektion=0&manpath=OpenBSD+Current&arch=i386&format=html are down. It works when you select "All sections" but not any of the specific ones.

    Thanks for this interesting reading :)
