Contributed by Peter N. M. Hansteen on from the sec'd and routed dept.
In a message to the tech@ mailing list on July 4th, 2023, David Gwynne (dlg@) presented a diff that adds a new virtual network interface dubbed sec(4). The message reads,
Subject: sec(4): route based ipsec vpns
From: David Gwynne <david () gwynne ! id ! au>
Date: 2023-07-04 5:26:30

tl;dr: this adds sec(4) p2p ip interfaces. Traffic in and out of these interfaces is protected by IPsec security associations (SAs), but there's no flows (security policy database (SPD) entries) associated with these SAs. The policy for using the sec(4) interfaces and their SAs is route-based instead.

Longer version:

I was going to use "make ipsec great again^W" as the subject line, but thought better of it.

The reason I started on this was to better interoperate with "site-to-site" vpns, in particular AWS Site-to-Site VPNs, and the Auto-Discovery VPN (ADVPN) stuff on fortinet fortigate appliances. Both of these negotiate IPsec tunnels that can carry any traffic at the IPsec level, but use BGP and routes to direct traffic into those tunnels.
sec(4) is equivalent to a gif(4) interface with its encapsulated packets protected by ESP in transport mode. You route packets into the interface (sec or gif), and it gets encrypted and sent to the peer, which decapsulates the traffic.

The main difference is in how the SAs for these connections are negotiated. Neither of these things want to negotiate esp transport mode to protect gif(4) packets, they want to negotiate esp tunnel mode for 0.0.0.0/0 to 0.0.0.0/0. The fact that IPsec in tunnel mode and gif both use the same ip protocol number also causes a lot of confusion in the kernel in the SPD.

After trying a bunch of different configurations out, and then trying to hack up ipsecctl and isakmpd, and then talking to markus@, tobhe@, and sthen@, we came up with sec(4).

The idea isn't unique to us though. It has been mooted in RFC3884 section 4.1.1, Cisco has VTI, Juniper has st0, Linux has vti and xfrm interfaces, FreeBSD has ipsec_if, NetBSD has ipsecif...

The kernel has been modified so ike daemons can inject a SA with an iface extension message attached which specifies which sec(4) the SA is for, and which direction it should be processing traffic for. If a SA has this iface config on it, the ipsp code skips the SPD side of things and instead makes these SAs available to sec(4) for it to use.

I've tweaked isakmpd and ipsecctl so they support new config options that let you configure SAs for sec(4). Most of the changes in isakmpd are so it can continue to negotiate the right stuff with the peer, but then short circuits the kernel config so only the SAs with the iface extension are injected, none of the flows get inserted.

tobhe@ has done the same for iked, but he's reused the "iface" config and special cased the handling of sec interfaces.
For ipsecctl and isakmpd, config looks like this in ipsec.conf:

    h_self="130.102.96.46"
    h_s2s1="52.65.9.248"
    h_s2s1_key="one"
    h_s2s2="54.153.175.223"
    h_s2s2_key="two"

    ike interface sec0 local $h_self peer $h_s2s1 \
        main auth hmac-sha2-256 enc aes-256 group modp3072 lifetime 28800 \
        quick auth hmac-sha2-256 enc aes-256 group modp3072 lifetime 3600 \
        psk $h_s2s1_key

    ike interface sec1 local $h_self peer $h_s2s2 \
        main auth hmac-sha2-256 enc aes-256 group modp3072 lifetime 28800 \
        quick auth hmac-sha2-256 enc aes-256 group modp3072 lifetime 3600 \
        psk $h_s2s2_key

sec interface config:

    dlg@ix ~$ sudo cat /etc/hostname.sec0
    inet 169.254.64.94 255.255.255.252 169.254.64.93 up
    dlg@ix ~$ sudo cat /etc/hostname.sec1
    inet 169.254.105.134 255.255.255.252 169.254.105.133 up

aws s2s says we can then talk bgp:

    dlg@ix ~$ sudo cat /etc/bgpd.conf
    AS 65001
    router-id 130.102.96.46

    group aws {
        remote-as 64512
        neighbor 169.254.64.93
        neighbor 169.254.105.133
    }

with isakmpd running and ipsecctl having injected its config into it, it then sets up SAs:

    dlg@ix ~$ sudo ipsecctl -sa
    FLOWS:
    No flows

    SAD:
    esp tunnel from 54.153.175.223 to 130.102.96.46 spi 0x13ca145b auth hmac-sha2-256 enc aes-256
    esp tunnel from 52.65.9.248 to 130.102.96.46 spi 0x8e5fec4b auth hmac-sha2-256 enc aes-256
    esp tunnel from 130.102.96.46 to 54.153.175.223 spi 0xc9d2adc1 auth hmac-sha2-256 enc aes-256
    esp tunnel from 130.102.96.46 to 52.65.9.248 spi 0xca1adc30 auth hmac-sha2-256 enc aes-256

    dlg@ix ~$ sudo ipsecctl -sa -v
    FLOWS:
    No flows

    SAD:
    esp tunnel from 54.153.175.223 to 130.102.96.46 spi 0x13ca145b auth hmac-sha2-256 enc aes-256
        sa: spi 0x13ca145b auth hmac-sha2-256 enc aes
            state mature replay 16 flags 0x204<tunnel,udpencap>
        lifetime_cur: alloc 0 bytes 752 add 1684451878 first 1684451880
        lifetime_hard: alloc 0 bytes 0 add 3600 first 0
        lifetime_soft: alloc 0 bytes 0 add 3240 first 0
        address_src: 54.153.175.223
        address_dst: 130.102.96.46
        identity_src: type prefix id 0: 54.153.175.223/32
        identity_dst: type prefix id 0: 130.102.96.46/32
        src_mask: 0.0.0.0
        dst_mask: 0.0.0.0
        protocol: proto 0 flags 0
        flow_type: type use direction in
        src_flow: 0.0.0.0
        dst_flow: 0.0.0.0
        udpencap: udpencap port 4500
        lifetime_lastuse: alloc 0 bytes 0 add 0 first 1684451888
        counter:
            9 input packets
            2044 input bytes
            853 input bytes, decompressed
            9 packets dropped on input
        replay: rpl 9
        interface: sec1 direction in
    esp tunnel from 52.65.9.248 to 130.102.96.46 spi 0x8e5fec4b auth hmac-sha2-256 enc aes-256
        sa: spi 0x8e5fec4b auth hmac-sha2-256 enc aes
            state mature replay 16 flags 0x204<tunnel,udpencap>
        lifetime_cur: alloc 0 bytes 528 add 1684451878 first 1684451882
        lifetime_hard: alloc 0 bytes 0 add 3600 first 0
        lifetime_soft: alloc 0 bytes 0 add 3240 first 0
        address_src: 52.65.9.248
        address_dst: 130.102.96.46
        identity_src: type prefix id 0: 52.65.9.248/32
        identity_dst: type prefix id 0: 130.102.96.46/32
        src_mask: 0.0.0.0
        dst_mask: 0.0.0.0
        protocol: proto 0 flags 0
        flow_type: type use direction in
        src_flow: 0.0.0.0
        dst_flow: 0.0.0.0
        udpencap: udpencap port 4500
        lifetime_lastuse: alloc 0 bytes 0 add 0 first 1684451887
        counter:
            6 input packets
            1416 input bytes
            597 input bytes, decompressed
            6 packets dropped on input
        replay: rpl 6
        interface: sec0 direction in
    esp tunnel from 130.102.96.46 to 54.153.175.223 spi 0xc9d2adc1 auth hmac-sha2-256 enc aes-256
        sa: spi 0xc9d2adc1 auth hmac-sha2-256 enc aes
            state mature replay 16 flags 0x204<tunnel,udpencap>
        lifetime_cur: alloc 0 bytes 511 add 1684451878 first 1684451880
        lifetime_hard: alloc 0 bytes 0 add 3600 first 0
        lifetime_soft: alloc 0 bytes 0 add 3240 first 0
        address_src: 130.102.96.46
        address_dst: 54.153.175.223
        identity_src: type prefix id 0: 130.102.96.46/32
        identity_dst: type prefix id 0: 54.153.175.223/32
        src_mask: 0.0.0.0
        dst_mask: 0.0.0.0
        protocol: proto 0 flags 0
        flow_type: type use direction out
        src_flow: 0.0.0.0
        dst_flow: 0.0.0.0
        udpencap: udpencap port 4500
        lifetime_lastuse: alloc 0 bytes 0 add 0 first 1684451888
        counter:
            8 output packets
            1136 output bytes
            671 output bytes, uncompressed
        replay: rpl 9
        interface: sec1 direction out
    esp tunnel from 130.102.96.46 to 52.65.9.248 spi 0xca1adc30 auth hmac-sha2-256 enc aes-256
        sa: spi 0xca1adc30 auth hmac-sha2-256 enc aes
            state mature replay 16 flags 0x204<tunnel,udpencap>
        lifetime_cur: alloc 0 bytes 452 add 1684451878 first 1684451882
        lifetime_hard: alloc 0 bytes 0 add 3600 first 0
        lifetime_soft: alloc 0 bytes 0 add 3240 first 0
        address_src: 130.102.96.46
        address_dst: 52.65.9.248
        identity_src: type prefix id 0: 130.102.96.46/32
        identity_dst: type prefix id 0: 52.65.9.248/32
        src_mask: 0.0.0.0
        dst_mask: 0.0.0.0
        protocol: proto 0 flags 0
        flow_type: type use direction out
        src_flow: 0.0.0.0
        dst_flow: 0.0.0.0
        udpencap: udpencap port 4500
        lifetime_lastuse: alloc 0 bytes 0 add 0 first 1684451887
        counter:
            7 output packets
            1004 output bytes
            592 output bytes, uncompressed
        replay: rpl 8
        interface: sec0 direction out

    dlg@ix ~$ ifconfig sec
    sec0: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> mtu 1280
        index 14 priority 0 llprio 3
        groups: sec
        inet 169.254.64.94 --> 169.254.64.93 netmask 0xfffffffc
    sec1: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> mtu 1280
        index 15 priority 0 llprio 3
        groups: sec
        inet 169.254.105.134 --> 169.254.105.133 netmask 0xfffffffc

    dlg@ix ~$ ping -qc4 169.254.64.93
    PING 169.254.64.93 (169.254.64.93): 56 data bytes

    --- 169.254.64.93 ping statistics ---
    4 packets transmitted, 4 packets received, 0.0% packet loss
    round-trip min/avg/max/std-dev = 16.878/17.062/17.230/0.131 ms

    dlg@ix ~$ ping -qc4 169.254.105.133
    PING 169.254.105.133 (169.254.105.133): 56 data bytes

    --- 169.254.105.133 ping statistics ---
    4 packets transmitted, 4 packets received, 0.0% packet loss
    round-trip min/avg/max/std-dev = 15.110/15.690/16.538/0.524 ms

and bgp comes up:

    dlg@ix ~$ sudo bgpctl sh
    Neighbor            AS     MsgRcvd  MsgSent  OutQ  Up/Down   State/PrfRcvd
    169.254.64.93       64512  2534     2505     0     00:01:43  1
    169.254.105.133     64512  4140     4137     0     00:01:38  1

    dlg@ix ~$ sudo bgpctl sh rib in
    flags: * = Valid, > = Selected, I = via IBGP, A = Announced,
           S = Stale, E = Error
    origin validation state: N = not-found, V = valid, ! = invalid
    aspa validation state: ? = unknown, V = valid, ! = invalid
    origin: i = IGP, e = EGP, ? = Incomplete

    flags  vs   destination      gateway          lpref  med  aspath origin
           N-?  100.64.64.0/22   169.254.105.133  100    100  64512 i
           N-?  100.64.64.0/22   169.254.64.93    100    200  64512 i

ive got equivalent config with iked working, but tobhe@ wrote that so i don't think it's fair for me to steal his thunder.

thoughts? is it worth continuing with?
The message then goes on to the diff itself, which you can take in from a mailbox near you if you are subscribed to tech@ or from one of the mailing list archives, such as this one.
If you have the time, skill and resources to test and report back, please do!
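
To make the route-based idea concrete, here is a minimal Python sketch of the kind of lookup that decides which interface a packet leaves through. It is an illustration only, not how the kernel implements it: the `ROUTES` table, the `lookup` helper, the `em0` default interface, and the choice of sec1 for the 100.64.64.0/22 prefix are all assumptions made for the example. Only the addresses are taken from the message above.

```python
import ipaddress

# Hypothetical routing table: prefix -> outgoing interface.
# The two /30 networks are the point-to-point links from hostname.sec0
# and hostname.sec1; the /22 is the prefix learned over BGP. Which sec
# interface it ends up on, and the em0 default, are assumptions.
ROUTES = {
    ipaddress.ip_network("169.254.64.92/30"): "sec0",
    ipaddress.ip_network("169.254.105.132/30"): "sec1",
    ipaddress.ip_network("100.64.64.0/22"): "sec1",
    ipaddress.ip_network("0.0.0.0/0"): "em0",
}


def lookup(dst: str) -> str:
    """Return the interface for dst using longest-prefix match."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in ROUTES if addr in net]
    best = max(matches, key=lambda net: net.prefixlen)
    return ROUTES[best]


if __name__ == "__main__":
    for dst in ("169.254.64.93", "100.64.65.1", "192.0.2.1"):
        print(dst, "->", lookup(dst))
```

With a policy-based setup the decision would instead come from matching the packet against SPD flows. In the sketch, anything the routing table sends to a sec interface is handed to that interface's SAs, and nothing else is touched.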
By mxb (mxb) maxim@unixconn.com on
Looks promising. But why is the MTU only 1280?
A setup with IPsec-protected veb (etherip + vport) gives 1500.
Comments
By mxb (mxb) maxim@unixconn.com on
I think I understand why, but if nothing can be done about the MTU, then it is what it is. In the end it is better to have one interface for this than many in a bundle.
By David Gwynne (dlg) dlg@openbsd.org on
sec(4) is an IP tunnel like, and is largely compatible with, gif(4). gif(4) also defaults to an MTU of 1280. The lower MTU means that encapsulated packets are less likely to be fragmented between the endpoints.
etherip(4) is an Ethernet tunnel, and defaults to 1500 so it can be added to things like veb(4) or tpmr(4) and work, because L2 has no path MTU discovery mechanism. The cost of this compatibility with L2 networks is that the encapsulated packet is going to be fragmented, so 1 big Ethernet packet inside the tunnel will end up being 2 packets between the tunnel endpoints.
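
The fragmentation trade-off can be sketched with a little arithmetic. The Python below is a rough illustration, not a description of the actual code: the 80-byte encapsulation overhead and the 1500-byte path MTU between the endpoints are assumed values (real overhead depends on the cipher, UDP encapsulation and so on), and `outer_packets` is a hypothetical helper.

```python
import math

ASSUMED_OVERHEAD = 80   # assumption: outer IP header plus ESP/UDP framing, in bytes
ASSUMED_PATH_MTU = 1500 # assumption: MTU on the path between the tunnel endpoints
IPV4_HEADER = 20        # IPv4 header without options


def outer_packets(inner_len: int,
                  overhead: int = ASSUMED_OVERHEAD,
                  path_mtu: int = ASSUMED_PATH_MTU) -> int:
    """Number of IPv4 packets needed to carry one encapsulated packet."""
    outer_len = inner_len + overhead
    if outer_len <= path_mtu:
        return 1
    # Every fragment except the last carries a payload that is a
    # multiple of 8 bytes, and each fragment has its own IP header.
    per_fragment = (path_mtu - IPV4_HEADER) // 8 * 8
    payload = outer_len - IPV4_HEADER
    return math.ceil(payload / per_fragment)


if __name__ == "__main__":
    for inner in (1280, 1500):
        print(inner, "byte inner packet ->", outer_packets(inner), "outer packet(s)")
```

Under these assumptions a full-size packet on a 1280 MTU tunnel still fits in one outer packet, while a 1500-byte inner packet has to be split in two.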