Infrequent SA Key Mismatches Between strongSwan and RouterOS
Posted: Tue Aug 21, 2018 6:33 pm
Hi,
we have set up a site-to-site VPN connecting our main office (running strongSwan 5.5.1 on Debian Stretch) to a branch office (using a RB2011iL with RouterOS 6.40.8 ). It's a IKEv2 tunnel using a PSK and the default key lifetime of 1 hour.
Everything works great, but every now and then (maybe around once a week), the rekeying of the tunnel goes wrong: after the rekey, the two sides are no longer in agreement about which keys to use.
After such a rekey, the SAs on strongSwan side look like this:
While the Mikrotik side uses the following SAs:
From that output it appears that both sides use different auth and encryption keys for the same SPI. As a consequence, any ESP packets belonging to the connection are just being dropped on both ends.
Restarting the connection immediately fixes this until the next time that a key mismatch happens after a rekey.
The log entries of the rekeying on the strongSwan side don't look very suspicious to me:
Unfortunately I don't have any logs from RouterOS at the moment.
Has anyone experienced something similar to this before? As mentioned, the issue is somewhat hard to debug as it only occurs rarely and I haven't found a way to reproduce it. Any pointers are greatly appreciated.
we have set up a site-to-site VPN connecting our main office (running strongSwan 5.5.1 on Debian Stretch) to a branch office (using a RB2011iL with RouterOS 6.40.8 ). It's a IKEv2 tunnel using a PSK and the default key lifetime of 1 hour.
Everything works great, but every now and then (maybe around once a week), the rekeying of the tunnel goes wrong: after the rekey, the two sides are no longer in agreement about which keys to use.
After such a rekey, the SAs on strongSwan side look like this:
Code: Select all
src <STRONGSWAN_IP> dst <MIKROTIK_IP>
proto esp spi 0x09890c39 reqid 2 mode tunnel
replay-window 0 flag af-unspec
auth-trunc hmac(sha256) 0x511b4b8066dd9f91e028411ac41edd0a0c2ca686191d4201d20cf699a88fc4da 128
enc cbc(aes) 0x74a144af92c8b3be372a04dac3c19c8c
anti-replay context: seq 0x0, oseq 0x0, bitmap 0x00000000
src <MIKROTIK_IP> dst <STRONGSWAN_IP>
proto esp spi 0xcff9f231 reqid 2 mode tunnel
replay-window 32 flag af-unspec
auth-trunc hmac(sha256) 0x162e38a241b3c7fe6f18a61f9f8a45ffd005b4e4e801e3e502462b9f31e3c5f9 128
enc cbc(aes) 0x06ef2900e20f793e87b535e56c80a2a5
anti-replay context: seq 0x0, oseq 0x0, bitmap 0x00000000
While the Mikrotik side uses the following SAs:
Code: Select all
1 E spi=0x9890C39 src-address=<STRONGSWAN_IP> dst-address=<MIKROTIK_IP> state=mature
auth-algorithm=sha256 enc-algorithm=aes-cbc enc-key-size=128
auth-key="3d5138d833aa55ceaee9801707c48a4dfb80e29fdc899cba6e39b188a03bc83b"
enc-key="c30de4b999ba413af5da8d2d8e024001" add-lifetime=48m2s/1h3s replay=128
2 E spi=0xCFF9F231 src-address=<MIKROTIK_IP> dst-address=<STRONGSWAN_IP> state=mature
auth-algorithm=sha256 enc-algorithm=aes-cbc enc-key-size=128
auth-key="e9d1eefcb8250c7b5795c381e7c9eb7b44e67064fb1a90334335d372f5efadb3"
enc-key="a856120a0824bb2fc60aa565482d303e" addtime=aug/21/2018 10:18:10 expires-in=51m18s
add-lifetime=48m2s/1h3s current-bytes=35703 current-packets=144 replay=128
Restarting the connection immediately fixes this until the next time that a key mismatch happens after a rekey.
The log entries of the rekeying on the strongSwan side don't look very suspicious to me:
Code: Select all
2018-08-21T10:18:09.616057+02:00 09[IKE] <conn_name|1> queueing CHILD_REKEY task
2018-08-21T10:18:09.616674+02:00 09[IKE] <conn_name|1> activating new tasks
2018-08-21T10:18:09.617223+02:00 09[IKE] <conn_name|1> activating CHILD_REKEY task
2018-08-21T10:18:09.617782+02:00 09[IKE] <conn_name|1> establishing CHILD_SA conn_name{2}
2018-08-21T10:18:09.618394+02:00 09[CFG] <conn_name|1> proposing traffic selectors for us:
2018-08-21T10:18:09.618955+02:00 09[CFG] <conn_name|1> 10.11.0.0/16
2018-08-21T10:18:09.619505+02:00 09[CFG] <conn_name|1> 10.10.0.0/16
2018-08-21T10:18:09.620064+02:00 09[CFG] <conn_name|1> proposing traffic selectors for other:
2018-08-21T10:18:09.620697+02:00 09[CFG] <conn_name|1> 10.50.0.0/16
2018-08-21T10:18:09.621381+02:00 09[CFG] <conn_name|1> configured proposals: ESP:AES_CBC_128/HMAC_SHA2_256_128/MODP_2048/NO_EXT_SEQ
2018-08-21T10:18:09.636466+02:00 09[ENC] <conn_name|1> generating CREATE_CHILD_SA request 30 [ N(REKEY_SA) SA No KE TSi TSr ]
2018-08-21T10:18:09.637105+02:00 09[NET] <conn_name|1> sending packet: from <STRONGSWAN_IP>[4500] to <MIKROTIK_IP>[4500] (496 bytes)
2018-08-21T10:18:10.571371+02:00 12[NET] <conn_name|1> received packet: from <MIKROTIK_IP>[4500] to <STRONGSWAN_IP>[4500] (480 bytes)
2018-08-21T10:18:10.572847+02:00 12[ENC] <conn_name|1> parsed CREATE_CHILD_SA response 30 [ No KE TSi TSr SA ]
2018-08-21T10:18:10.590638+02:00 12[CFG] <conn_name|1> selecting proposal:
2018-08-21T10:18:10.591675+02:00 12[CFG] <conn_name|1> proposal matches
2018-08-21T10:18:10.592406+02:00 12[CFG] <conn_name|1> received proposals: ESP:AES_CBC_128/HMAC_SHA2_256_128/MODP_2048/NO_EXT_SEQ
2018-08-21T10:18:10.593170+02:00 12[CFG] <conn_name|1> configured proposals: ESP:AES_CBC_128/HMAC_SHA2_256_128/MODP_2048/NO_EXT_SEQ
2018-08-21T10:18:10.593884+02:00 12[CFG] <conn_name|1> selected proposal: ESP:AES_CBC_128/HMAC_SHA2_256_128/MODP_2048/NO_EXT_SEQ
2018-08-21T10:18:10.594612+02:00 12[CFG] <conn_name|1> selecting traffic selectors for us:
2018-08-21T10:18:10.595126+02:00 12[CFG] <conn_name|1> config: 10.11.0.0/16, received: 10.10.0.0/16 => no match
2018-08-21T10:18:10.595605+02:00 12[CFG] <conn_name|1> config: 10.10.0.0/16, received: 10.10.0.0/16 => match: 10.10.0.0/16
2018-08-21T10:18:10.596273+02:00 12[CFG] <conn_name|1> selecting traffic selectors for other:
2018-08-21T10:18:10.596780+02:00 12[CFG] <conn_name|1> config: 10.50.0.0/16, received: 10.50.0.0/16 => match: 10.50.0.0/16
2018-08-21T10:18:10.597258+02:00 12[CHD] <conn_name|1> using AES_CBC for encryption
2018-08-21T10:18:10.597731+02:00 12[CHD] <conn_name|1> using HMAC_SHA2_256_128 for integrity
2018-08-21T10:18:10.598220+02:00 12[CHD] <conn_name|1> adding inbound ESP SA
2018-08-21T10:18:10.598695+02:00 12[CHD] <conn_name|1> SPI 0xcff9f231, src <MIKROTIK_IP> dst <STRONGSWAN_IP>
2018-08-21T10:18:10.599168+02:00 12[CHD] <conn_name|1> adding outbound ESP SA
2018-08-21T10:18:10.599642+02:00 12[CHD] <conn_name|1> SPI 0x09890c39, src <STRONGSWAN_IP> dst <MIKROTIK_IP>
2018-08-21T10:18:10.600129+02:00 12[IKE] <conn_name|1> CHILD_SA conn_name{26} established with SPIs cff9f231_i 09890c39_o and TS 10.10.0.0/16 === 10.50.0.0/16
2018-08-21T10:18:10.600670+02:00 12[IKE] <conn_name|1> reinitiating already active tasks
2018-08-21T10:18:10.601294+02:00 12[IKE] <conn_name|1> CHILD_REKEY task
2018-08-21T10:18:10.601800+02:00 12[IKE] <conn_name|1> closing CHILD_SA conn_name{21} with SPIs cca08482_i (29988 bytes) 0f456b61_o (47112 bytes) and TS 10.10.0.0/16 === 10.50.0.0/16
2018-08-21T10:18:10.602274+02:00 12[IKE] <conn_name|1> sending DELETE for ESP CHILD_SA with SPI cca08482
2018-08-21T10:18:10.602744+02:00 12[ENC] <conn_name|1> generating INFORMATIONAL request 31 [ D ]
2018-08-21T10:18:10.603211+02:00 12[NET] <conn_name|1> sending packet: from <STRONGSWAN_IP>[4500] to <MIKROTIK_IP>[4500] (80 bytes)
2018-08-21T10:18:10.605572+02:00 08[NET] <conn_name|1> received packet: from <MIKROTIK_IP>[4500] to <STRONGSWAN_IP>[4500] (96 bytes)
2018-08-21T10:18:10.606048+02:00 08[ENC] <conn_name|1> parsed INFORMATIONAL response 31 [ ]
2018-08-21T10:18:10.606520+02:00 08[IKE] <conn_name|1> CHILD_SA closed
2018-08-21T10:18:10.607076+02:00 08[IKE] <conn_name|1> activating new tasks
2018-08-21T10:18:10.607624+02:00 08[IKE] <conn_name|1> nothing to initiate
Has anyone experienced something similar to this before? As mentioned, the issue is somewhat hard to debug as it only occurs rarely and I haven't found a way to reproduce it. Any pointers are greatly appreciated.