Contributed by Peter N. M. Hansteen on from the sec'd and routed dept.
In a message to the tech@ mailing list on July 4th, 2023, David Gwynne (dlg@) presented a diff that adds a new virtual network interface dubbed sec(4). The message reads,
Subject: sec(4): route based ipsec vpns
From: David Gwynne <david () gwynne ! id ! au>
Date: 2023-07-04 5:26:30
tl;dr: this adds sec(4) p2p ip interfaces. Traffic in and out of these interfaces is protected by IPsec security associations (SAs), but there's no flows (security policy database (SPD) entries) associated with these SAs. The policy for using the sec(4) interfaces and their SAs is route-based instead.
Longer version:
I was going to use "make ipsec great again^W" as the subject line, but thought better of it.
The reason I started on this was to better interoperate with "site-to-site" vpns, in particular AWS Site-to-Site VPNs, and the Auto-Discovery VPN (ADVPN) stuff on fortinet fortigate appliances. Both of these negotiate IPsec tunnels that can carry any traffic at the IPsec level, but use BGP and routes to direct traffic into those tunnels.
sec(4) is equivalent to a gif(4) interface with its encapsulated
packets protected by ESP in transport mode. You route packets into the
interface (sec or gif), and it gets encrypted and sent to the peer,
which decapsulates the traffic. The main difference is in how the
SAs for these connections are negotiated.
Neither of these things want to negotiate esp transport mode to protect
gif(4) packets, they want to negotiate esp tunnel mode for 0.0.0.0/0 to
0.0.0.0/0. The fact that IPsec in tunnel mode and gif both use the same
ip protocol number also causes a lot of confusion in the kernel in the
SPD.
After trying a bunch of different configurations out, and then trying to
hack up ipsecctl and isakmpd, and then talking to markus@, tobhe@, and
sthen@, we came up with sec(4). The idea isn't unique to us though. It
has been mooted in RFC3884 section 4.1.1, Cisco has VTI, Juniper has
st0, Linux has vti and xfrm interfaces, FreeBSD has ipsec_if, NetBSD has
ipsecif...
The kernel has been modified so ike daemons can inject a SA with
an iface extension message attached which specifies which sec(4)
the SA is for, and which direction it should be processing traffic
for. If a SA has this iface config on it, the ipsp code skips the
SPD side of things and instead makes these SAs available to sec(4)
for it to use.
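As an illustration (not part of the quoted message), the dispatch described above can be modelled in a few lines of Python. This is a toy sketch only: the class and field names (`IfaceExt`, `ToyIpsecState`, `sec_sas`, and so on) are hypothetical and do not correspond to the kernel's actual data structures. It only shows the idea that an SA carrying an interface extension bypasses the SPD path and is instead filed under a sec(4) unit and direction.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IfaceExt:
    """Hypothetical stand-in for the iface extension attached to an SA."""
    unit: int        # which sec(4) interface, e.g. 0 for sec0
    direction: str   # "in" or "out"


@dataclass(frozen=True)
class SA:
    spi: int
    src: str
    dst: str
    iface: Optional[IfaceExt] = None


class ToyIpsecState:
    """Toy model: SAs are either policy-based (SPD) or bound to a sec unit."""

    def __init__(self):
        self.spd_sas = []   # SAs that would be matched through flows
        self.sec_sas = {}   # (unit, direction) -> SA, used by sec(4)

    def add_sa(self, sa):
        if sa.iface is None:
            # No interface extension: regular flow/SPD handling.
            self.spd_sas.append(sa)
        else:
            # Interface extension present: skip the SPD, hand it to sec(4).
            self.sec_sas[(sa.iface.unit, sa.iface.direction)] = sa

    def sa_for_output(self, unit):
        """SA that a packet routed into sec<unit> would be sent with."""
        return self.sec_sas.get((unit, "out"))


state = ToyIpsecState()
state.add_sa(SA(0xCA1ADC30, "130.102.96.46", "52.65.9.248", IfaceExt(0, "out")))
state.add_sa(SA(0x8E5FEC4B, "52.65.9.248", "130.102.96.46", IfaceExt(0, "in")))
```

In this model the routing table alone decides which packets reach `sa_for_output(0)`, which is the sense in which the policy is route-based.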
I've tweaked isakmpd and ipsecctl so they support new config options
that let you configure SAs for sec(4). Most of the changes in isakmpd
are so it can continue to negotiate the right stuff with the peer,
but then short circuits the kernel config so only the SAs with the
iface extension are injected, none of the flows get inserted.
tobhe@ has done the same for iked, but he's reused the "iface"
config and special cased the handling of sec interfaces.
For ipsecctl and isakmpd, config looks like this in ipsec.conf:
h_self="130.102.96.46"
h_s2s1="52.65.9.248"
h_s2s1_key="one"
h_s2s2="54.153.175.223"
h_s2s2_key="two"
ike interface sec0 local $h_self peer $h_s2s1 \
main auth hmac-sha2-256 enc aes-256 group modp3072 lifetime 28800 \
quick auth hmac-sha2-256 enc aes-256 group modp3072 lifetime 3600 \
psk $h_s2s1_key
ike interface sec1 local $h_self peer $h_s2s2 \
main auth hmac-sha2-256 enc aes-256 group modp3072 lifetime 28800 \
quick auth hmac-sha2-256 enc aes-256 group modp3072 lifetime 3600 \
psk $h_s2s2_key
sec interface config:
dlg@ix ~$ sudo cat /etc/hostname.sec0
inet 169.254.64.94 255.255.255.252 169.254.64.93
up
dlg@ix ~$ sudo cat /etc/hostname.sec1
inet 169.254.105.134 255.255.255.252 169.254.105.133
up
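As an illustration (not part of the quoted message), the addresses in the two hostname files can be checked with Python's standard `ipaddress` module. The helper name `p2p_pair_ok` is made up for this sketch. It confirms that the local and peer addresses on each sec interface are the two usable hosts of a single /30.

```python
import ipaddress


def p2p_pair_ok(local, netmask, peer):
    """True if local and peer are distinct usable hosts of the same subnet."""
    iface = ipaddress.ip_interface(f"{local}/{netmask}")
    hosts = set(iface.network.hosts())
    peer_ip = ipaddress.ip_address(peer)
    return iface.ip in hosts and peer_ip in hosts and iface.ip != peer_ip


# Values copied from the hostname.sec0 and hostname.sec1 files above.
sec0_net = ipaddress.ip_interface("169.254.64.94/255.255.255.252").network
sec1_net = ipaddress.ip_interface("169.254.105.134/255.255.255.252").network

print(sec0_net, p2p_pair_ok("169.254.64.94", "255.255.255.252", "169.254.64.93"))
print(sec1_net, p2p_pair_ok("169.254.105.134", "255.255.255.252", "169.254.105.133"))
```

These peer addresses are the same ones listed as BGP neighbors in bgpd.conf.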
aws s2s says we can then talk bgp:
dlg@ix ~$ sudo cat /etc/bgpd.conf
AS 65001
router-id 130.102.96.46
group aws {
remote-as 64512
neighbor 169.254.64.93
neighbor 169.254.105.133
}
with isakmpd running and ipsecctl having injected its config into
it, it then sets up SAs:
dlg@ix ~$ sudo ipsecctl -sa
FLOWS:
No flows
SAD:
esp tunnel from 54.153.175.223 to 130.102.96.46 spi 0x13ca145b auth hmac-sha2-256 enc aes-256
esp tunnel from 52.65.9.248 to 130.102.96.46 spi 0x8e5fec4b auth hmac-sha2-256 enc aes-256
esp tunnel from 130.102.96.46 to 54.153.175.223 spi 0xc9d2adc1 auth hmac-sha2-256 enc aes-256
esp tunnel from 130.102.96.46 to 52.65.9.248 spi 0xca1adc30 auth hmac-sha2-256 enc aes-256
dlg@ix ~$ sudo ipsecctl -sa -v
FLOWS:
No flows
SAD:
esp tunnel from 54.153.175.223 to 130.102.96.46 spi 0x13ca145b auth hmac-sha2-256 enc aes-256
sa: spi 0x13ca145b auth hmac-sha2-256 enc aes
state mature replay 16 flags 0x204<tunnel,udpencap>
lifetime_cur: alloc 0 bytes 752 add 1684451878 first 1684451880
lifetime_hard: alloc 0 bytes 0 add 3600 first 0
lifetime_soft: alloc 0 bytes 0 add 3240 first 0
address_src: 54.153.175.223
address_dst: 130.102.96.46
identity_src: type prefix id 0: 54.153.175.223/32
identity_dst: type prefix id 0: 130.102.96.46/32
src_mask: 0.0.0.0
dst_mask: 0.0.0.0
protocol: proto 0 flags 0
flow_type: type use direction in
src_flow: 0.0.0.0
dst_flow: 0.0.0.0
udpencap: udpencap port 4500
lifetime_lastuse: alloc 0 bytes 0 add 0 first 1684451888
counter:
9 input packets
2044 input bytes
853 input bytes, decompressed
9 packets dropped on input
replay: rpl 9
interface: sec1 direction in
esp tunnel from 52.65.9.248 to 130.102.96.46 spi 0x8e5fec4b auth hmac-sha2-256 enc aes-256
sa: spi 0x8e5fec4b auth hmac-sha2-256 enc aes
state mature replay 16 flags 0x204<tunnel,udpencap>
lifetime_cur: alloc 0 bytes 528 add 1684451878 first 1684451882
lifetime_hard: alloc 0 bytes 0 add 3600 first 0
lifetime_soft: alloc 0 bytes 0 add 3240 first 0
address_src: 52.65.9.248
address_dst: 130.102.96.46
identity_src: type prefix id 0: 52.65.9.248/32
identity_dst: type prefix id 0: 130.102.96.46/32
src_mask: 0.0.0.0
dst_mask: 0.0.0.0
protocol: proto 0 flags 0
flow_type: type use direction in
src_flow: 0.0.0.0
dst_flow: 0.0.0.0
udpencap: udpencap port 4500
lifetime_lastuse: alloc 0 bytes 0 add 0 first 1684451887
counter:
6 input packets
1416 input bytes
597 input bytes, decompressed
6 packets dropped on input
replay: rpl 6
interface: sec0 direction in
esp tunnel from 130.102.96.46 to 54.153.175.223 spi 0xc9d2adc1 auth hmac-sha2-256 enc aes-256
sa: spi 0xc9d2adc1 auth hmac-sha2-256 enc aes
state mature replay 16 flags 0x204<tunnel,udpencap>
lifetime_cur: alloc 0 bytes 511 add 1684451878 first 1684451880
lifetime_hard: alloc 0 bytes 0 add 3600 first 0
lifetime_soft: alloc 0 bytes 0 add 3240 first 0
address_src: 130.102.96.46
address_dst: 54.153.175.223
identity_src: type prefix id 0: 130.102.96.46/32
identity_dst: type prefix id 0: 54.153.175.223/32
src_mask: 0.0.0.0
dst_mask: 0.0.0.0
protocol: proto 0 flags 0
flow_type: type use direction out
src_flow: 0.0.0.0
dst_flow: 0.0.0.0
udpencap: udpencap port 4500
lifetime_lastuse: alloc 0 bytes 0 add 0 first 1684451888
counter:
8 output packets
1136 output bytes
671 output bytes, uncompressed
replay: rpl 9
interface: sec1 direction out
esp tunnel from 130.102.96.46 to 52.65.9.248 spi 0xca1adc30 auth hmac-sha2-256 enc aes-256
sa: spi 0xca1adc30 auth hmac-sha2-256 enc aes
state mature replay 16 flags 0x204<tunnel,udpencap>
lifetime_cur: alloc 0 bytes 452 add 1684451878 first 1684451882
lifetime_hard: alloc 0 bytes 0 add 3600 first 0
lifetime_soft: alloc 0 bytes 0 add 3240 first 0
address_src: 130.102.96.46
address_dst: 52.65.9.248
identity_src: type prefix id 0: 130.102.96.46/32
identity_dst: type prefix id 0: 52.65.9.248/32
src_mask: 0.0.0.0
dst_mask: 0.0.0.0
protocol: proto 0 flags 0
flow_type: type use direction out
src_flow: 0.0.0.0
dst_flow: 0.0.0.0
udpencap: udpencap port 4500
lifetime_lastuse: alloc 0 bytes 0 add 0 first 1684451887
counter:
7 output packets
1004 output bytes
592 output bytes, uncompressed
replay: rpl 8
interface: sec0 direction out
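As an illustration (not part of the quoted message), the verbose listing can be reduced to a table of which SA serves which sec interface and direction. The sketch below assumes the textual layout seen in the output above, with an `esp tunnel from ... spi ...` header line per SA followed later by an `interface: secN direction in|out` line. That layout is taken from this one listing and should not be treated as a stable format. The function name and the shortened `SAMPLE` text are made up for the example.

```python
import re

# Shortened excerpt of the listing above: one header line and the
# interface line for two of the four SAs.
SAMPLE = """\
esp tunnel from 54.153.175.223 to 130.102.96.46 spi 0x13ca145b auth hmac-sha2-256 enc aes-256
interface: sec1 direction in
esp tunnel from 130.102.96.46 to 52.65.9.248 spi 0xca1adc30 auth hmac-sha2-256 enc aes-256
interface: sec0 direction out
"""

SA_RE = re.compile(r"^esp tunnel from (\S+) to (\S+) spi (0x[0-9a-f]+)")
IF_RE = re.compile(r"^\s*interface: (\S+) direction (in|out)\s*$")


def sas_by_interface(text):
    """Map (interface, direction) -> (src, dst, spi) from the listing text."""
    result = {}
    current = None
    for line in text.splitlines():
        m = SA_RE.match(line)
        if m:
            current = m.groups()      # remember the SA this block describes
            continue
        m = IF_RE.match(line)
        if m and current is not None:
            result[(m.group(1), m.group(2))] = current
            current = None
    return result


for key, (src, dst, spi) in sorted(sas_by_interface(SAMPLE).items()):
    print(key, src, "->", dst, spi)
```

An SA without an `interface:` line is simply left out of the result, which matches the distinction between flow-based and interface-bound SAs.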
dlg@ix ~$ ifconfig sec
sec0: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> mtu 1280
index 14 priority 0 llprio 3
groups: sec
inet 169.254.64.94 --> 169.254.64.93 netmask 0xfffffffc
sec1: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> mtu 1280
index 15 priority 0 llprio 3
groups: sec
inet 169.254.105.134 --> 169.254.105.133 netmask 0xfffffffc
dlg@ix ~$ ping -qc4 169.254.64.93
PING 169.254.64.93 (169.254.64.93): 56 data bytes
--- 169.254.64.93 ping statistics ---
4 packets transmitted, 4 packets received, 0.0% packet loss
round-trip min/avg/max/std-dev = 16.878/17.062/17.230/0.131 ms
dlg@ix ~$ ping -qc4 169.254.105.133
PING 169.254.105.133 (169.254.105.133): 56 data bytes
--- 169.254.105.133 ping statistics ---
4 packets transmitted, 4 packets received, 0.0% packet loss
round-trip min/avg/max/std-dev = 15.110/15.690/16.538/0.524 ms
and bgp comes up:
dlg@ix ~$ sudo bgpctl sh
Neighbor AS MsgRcvd MsgSent OutQ Up/Down State/PrfRcvd
169.254.64.93 64512 2534 2505 0 00:01:43 1
169.254.105.133 64512 4140 4137 0 00:01:38 1
dlg@ix ~$ sudo bgpctl sh rib in
flags: * = Valid, > = Selected, I = via IBGP, A = Announced,
S = Stale, E = Error
origin validation state: N = not-found, V = valid, ! = invalid
aspa validation state: ? = unknown, V = valid, ! = invalid
origin: i = IGP, e = EGP, ? = Incomplete
flags vs destination gateway lpref med aspath origin
N-? 100.64.64.0/22 169.254.105.133 100 100 64512 i
N-? 100.64.64.0/22 169.254.64.93 100 200 64512 i
i've got equivalent config with iked working, but tobhe@ wrote that
so i don't think it's fair for me to steal his thunder.
thoughts? is it worth continuing with?
The message then goes on to the diff itself, which you can take in from a mailbox near you if you are subscribed to tech@ or from one of the mailing list archives, such as this one.
If you have the time, skill and resources to test and report back, please do!

By mxb (mxb) maxim@unixconn.com on
Looks promising. But why is the MTU only 1280?
A setup with IPsec-protected veb (etherip + vport) gives 1500.
By mxb (mxb) maxim@unixconn.com on
I think I understand why, but if nothing can be done about the MTU, then it is as it is. In the end it is better to have one interface for this than many in a bundle.
By David Gwynne (dlg) dlg@openbsd.org on
sec(4) is an IP tunnel like gif(4), and is largely compatible with it. gif(4) also defaults to an MTU of 1280. The lower MTU means that encapsulated packets are less likely to be fragmented between the endpoints.
etherip(4) is an Ethernet tunnel, and defaults to 1500 so it can be added to things like veb(4) or tpmr(4) and work, because L2 has no path MTU discovery mechanism. The cost of this compatibility with L2 networks is that the encapsulated packet is going to be fragmented, so 1 big Ethernet packet inside the tunnel will end up being 2 packets between the tunnel endpoints.
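As an illustration (not part of the comment above), the MTU trade-off can be put into numbers. The sketch below assumes a 1500-byte path MTU between the tunnel endpoints and an encapsulation overhead of 80 bytes. The overhead is a round placeholder, not a measured value for ESP or etherip, and the function name is made up for this example. The fragment count follows ordinary IPv4 fragmentation, where every fragment except the last carries a multiple of 8 payload bytes.

```python
import math

IPV4_HEADER = 20        # bytes, IPv4 header without options
PATH_MTU = 1500         # assumed MTU between the tunnel endpoints
ASSUMED_OVERHEAD = 80   # hypothetical per-packet encapsulation overhead


def ipv4_fragments(packet_len, path_mtu=PATH_MTU):
    """Number of IPv4 packets needed to carry packet_len bytes over path_mtu."""
    if packet_len <= path_mtu:
        return 1
    payload = packet_len - IPV4_HEADER
    # Each non-final fragment carries a multiple of 8 bytes of payload.
    per_fragment = (path_mtu - IPV4_HEADER) // 8 * 8
    return math.ceil(payload / per_fragment)


for inner_mtu in (1280, 1500):
    outer_len = inner_mtu + ASSUMED_OVERHEAD
    print(inner_mtu, outer_len, ipv4_fragments(outer_len))
```

Under these assumptions a full-size packet on a 1280 MTU interface still fits in one outer packet, while a full-size packet on a 1500 MTU interface needs two, which is the cost described above.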