Contributed by jcr from the easier-to-create-than-to-destroy dept.
Philipp Buehler ("double-p" or formerly pb@) wrote in to tell us about how he handles the problem of tearing down a stalled ipsec(4) connection when running tons of busy and important tunnels.
Since the early days of ipsec.conf(5) it has been rather easy to add IPsec connections throughout the networks, so ipsec.conf(5) keeps getting longer and longer. The isakmpd(8) daemon is playing nice with its (new) peers and the sun is shining - until it isn't. Think of five, ten or 25 tunnels humming with critical traffic, and this new peer is just not accepting proposals or misbehaving in so many other ways. One ends up with half-up Phase 1 or Phase 2 connections, where either peer is trying hard to get its proposals through and one can only watch.
Restart the whole thing? Eventually that will end with a working configuration for weirdo-peer, but it also earns angry customers who lose their tunnels until it is figured out. Additionally, since every tunnel is then re-established at once, the mighty default lifetimes will likely make them all rekey at the same time, ending in CPU spikes while the new keys are calculated.
What to do about it? Obviously, the answer is per-tunnel configuration, and especially per-tunnel bring-up and tear-down of individual working or not-so-working tunnels.
Starting per tunnel is easy: just make use of the 'include' directive in ipsec.conf(5) to keep your configurations separate:
    $ sudo cat /etc/ipsec.conf
    include "/etc/IPsec/central.conf"
    include "/etc/IPsec/customer-1.conf"
    include "/etc/IPsec/new-customer.conf"
    $ sudo ipsecctl -f /etc/IPsec/new-customer.conf
It's easy to add new tunnels, but the tricky part is getting rid of them. The problem compounds until it drives one (me) crazy. With ipsecctl(8) you can use the '-F' flag to flush all the SADs and SPDs, but you can't flush just a single one.
Back in the days before ipsecctl(8), the handling was done via a FIFO into isakmpd(8) and here it starts getting messy.
From the isakmpd(8) manpage:
    t [<phase>] <name>
        Tear down the named connection, if active. For name, the tag specified in isakmpd.conf(5) or the IP address of the remote host can be used. The optional parameter phase specifies whether to delete a phase 1 or phase 2 SA. The value `main' indicates a phase 1 connection; the value `quick' a phase 2 connection. If no phase is specified, `quick' will be assumed.
Note the "if active" part above. If for whatever reason phase 1 or 2 cannot complete, the tunnel does not become "active" and the above command won't work, so tear down becomes painful.
Let's assume the following new-customer.conf with 10.1.2.2 being the OpenBSD box we're working on:
    flow esp from 172.16.1.0/24 to 192.168.1.0/24 peer 10.2.1.1
    ike active esp from 172.16.1.0/24 to 192.168.1.0/24 peer 10.2.1.1 \
        main auth hmac-sha1 enc aes-256 group modp1536 \
        quick auth hmac-sha1 enc aes-256 group modp1536 \
        psk "what-a-mess"
Working well:
    $ sudo ipsecctl -sa | grep 10.2.1.1
    flow esp in from 172.19.9.35 to 172.16.1.0/24 peer 10.2.1.1 srcid 10.1.2.2/32 dstid 10.2.1.1/32 type use
    flow esp out from 172.16.1.0/24 to 172.19.9.35 peer 10.2.1.1 srcid 10.1.2.2/32 dstid 10.2.1.1/32 type require
    esp tunnel from 10.1.2.2 to 10.2.1.1 spi 0x60096bb7 auth hmac-sha1 enc aes-256
    esp tunnel from 10.2.1.1 to 10.1.2.2 spi 0xc0f6f3da auth hmac-sha1 enc aes-256
Recalling the manual, just use the peer's address to tear it down:
    $ sudo sh -c "echo 't 10.2.1.1' > /var/run/isakmpd.fifo"
Should be all good? Check again:
    $ sudo ipsecctl -sa | grep 10.2.1.1
    flow esp in from 172.19.9.35 to 172.16.1.0/24 peer 10.2.1.1 srcid 10.1.2.2/32 dstid 10.2.1.1/32 type use
    flow esp out from 172.16.1.0/24 to 172.19.9.35 peer 10.2.1.1 srcid 10.1.2.2/32 dstid 10.2.1.1/32 type require
    esp tunnel from 10.1.2.2 to 10.2.1.1 spi 0x60096bb7 auth hmac-sha1 enc aes-256
    esp tunnel from 10.2.1.1 to 10.1.2.2 spi 0xc0f6f3da auth hmac-sha1 enc aes-256
The isakmpd(8) FIFO 't' command is doing nothing! Retry, wait, check for IKE traffic trying to tear down - NOTHING. So what is going wrong? Dare to increase loglevel?
    $ sudo sh -c "echo 'D 10 99' > /var/run/isakmpd.fifo"
    $ sudo sh -c "echo 't 10.2.1.1' > /var/run/isakmpd.fifo"
Watching /var/log/daemon, one can notice:
    isakmpd[27380]: ui_teardown: teardown connection "10.2.1.1", phase 2
But the SA is still there, and adding even more debugging output is really scary on busy gateways!
    $ sudo sh -c "echo 'D A 99' > /var/run/isakmpd.fifo"
This time from /var/log/daemon:
    isakmpd[27380]: ui_teardown: teardown connection "10.2.1.1", phase 2
    isakmpd[27380]: sa_find: no SA matched query
Now, we have a peer 10.2.1.1 and the manual... let's have a look at what is in the SAD in a "raw" manner:
    $ sudo sh -c "echo S > /var/run/isakmpd.fifo"
    $ sudo less /var/run/isakmpd.result
    [..]
    SA name: peer-10.2.1.1 (Phase 1/Initiator)
    src: 10.1.2.2 dst: 10.2.1.1
    [..]
    SA name: from-172.16.1.0/24-to-192.168.1.0/24 (Phase 2)
    src: 10.1.2.2 dst: 10.2.1.1
It's using the generated tag-name! The generated name needs to be used in the isakmpd(8) FIFO 't' command:
    $ sudo sh -c "echo 't quick from-172.16.1.0/24-to-192.168.1.0/24' > /var/run/isakmpd.fifo"
    $ sudo sh -c "echo 't main peer-10.2.1.1' > /var/run/isakmpd.fifo"
Aaaaand it's gone!
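The names in the 'S' output appear to follow a fixed pattern: peer-<address> for Phase 1 and from-<src>-to-<dst> for Phase 2. Assuming that pattern holds for a plain configuration like new-customer.conf (it is inferred from the output above, not taken from the manual, and configurations with explicit srcid/dstid or tags may well produce other names), the FIFO lines can be predicted straight from the values in the config. The function name teardown_lines is made up for this sketch; it only prints the lines and does not touch the FIFO.

```shell
#!/bin/sh
# Sketch: print the FIFO teardown lines for one tunnel, assuming the
# generated-name pattern seen in the 'S' output above
# (peer-<ip> for Phase 1, from-<src>-to-<dst> for Phase 2).
# teardown_lines is a hypothetical helper, not part of isakmpd(8) or ipsecctl(8).
teardown_lines() {
    src=$1 dst=$2 peer=$3
    if [ -z "$src" ] || [ -z "$dst" ] || [ -z "$peer" ]; then
        echo "usage: teardown_lines src-net dst-net peer-ip" >&2
        return 1
    fi
    # Phase 2 (quick mode) first, then Phase 1 (main mode), as done by hand above.
    printf 't quick from-%s-to-%s\n' "$src" "$dst"
    printf 't main peer-%s\n' "$peer"
}

# Example: print the lines for the tunnel in new-customer.conf.
teardown_lines 172.16.1.0/24 192.168.1.0/24 10.2.1.1
```

Each printed line would still have to be written to /var/run/isakmpd.fifo as root, and it is worth comparing the names against a fresh 'S' dump before doing so.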
For the heck of it, I wrote this short snippet to print the lines needed for a tear down:
    # cd /var/run
    # echo S > isakmpd.fifo
    # grep -B1 ip-address-of-peer isakmpd.result | \
    > awk '/Phase 1/ { printf "echo t main %s > isakmpd.fifo\n", $3 }
           /Phase 2/ { printf "echo t quick %s > isakmpd.fifo\n", $3 }'
After troubleshooting all this, I found out a simple method already exists to tear down the partially configured tunnel:
    # ipsecctl -d -f new-customer.conf
The above does exactly what is needed.
Hope this saves someone from madness in the future. I wrote about this on tech@ last month, but it might be interesting for all the undeadly readers too.
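With one include file per tunnel, bring-up and tear-down reduce to the same ipsecctl(8) call with or without '-d'. A small wrapper could look like the sketch below. The function name tunnel_cmd and the CONF_DIR variable are made up for illustration, the directory is the one from the ipsec.conf example above, and the function only prints the command so it can be reviewed before being run as root.

```shell
#!/bin/sh
# Sketch: print the ipsecctl(8) command for one tunnel's config file.
# tunnel_cmd and CONF_DIR are hypothetical names chosen for this example.
CONF_DIR=/etc/IPsec

tunnel_cmd() {
    action=$1 name=$2
    # Refuse empty names and anything that looks like a path.
    case $name in
    ''|*/*) echo "tunnel_cmd: bad tunnel name" >&2; return 1 ;;
    esac
    case $action in
    up)   printf 'ipsecctl -f %s/%s.conf\n' "$CONF_DIR" "$name" ;;
    down) printf 'ipsecctl -d -f %s/%s.conf\n' "$CONF_DIR" "$name" ;;
    *)    echo "usage: tunnel_cmd up|down name" >&2; return 1 ;;
    esac
}

# Example: show the tear-down command for the troublesome peer.
tunnel_cmd down new-customer
```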
The next gory part is to get rid of retries in isakmpd(8) if a phase cannot complete for whatever weirdo reason (but it's out in the wilderness). Oh, and don't try to mess with the 'd' FIFO command on isakmpd(8); it won't work.
Basically, I am so mad about this interface, I'll start hacking C in a new push after slacking since 2005 :-)
Thanks Philipp for writing in and showing us one of those easily missed and tough to solve issues with IPsec.