we have set up a site-to-site VPN connecting our main office (running strongSwan 5.5.1 on Debian Stretch) to a branch office (using a RB2011iL with RouterOS 6.40.8 ). It's a IKEv2 tunnel using a PSK and the default key lifetime of 1 hour.
Everything works great, but every now and then (maybe around once a week), the rekeying of the tunnel goes wrong: after the rekey, the two sides are no longer in agreement about which keys to use.
After such a rekey, the SAs on strongSwan side look like this:
Code: Select all
src <STRONGSWAN_IP> dst <MIKROTIK_IP>
proto esp spi 0x09890c39 reqid 2 mode tunnel
replay-window 0 flag af-unspec
auth-trunc hmac(sha256) 0x511b4b8066dd9f91e028411ac41edd0a0c2ca686191d4201d20cf699a88fc4da 128
enc cbc(aes) 0x74a144af92c8b3be372a04dac3c19c8c
anti-replay context: seq 0x0, oseq 0x0, bitmap 0x00000000
src <MIKROTIK_IP> dst <STRONGSWAN_IP>
proto esp spi 0xcff9f231 reqid 2 mode tunnel
replay-window 32 flag af-unspec
auth-trunc hmac(sha256) 0x162e38a241b3c7fe6f18a61f9f8a45ffd005b4e4e801e3e502462b9f31e3c5f9 128
enc cbc(aes) 0x06ef2900e20f793e87b535e56c80a2a5
anti-replay context: seq 0x0, oseq 0x0, bitmap 0x00000000
While the Mikrotik side uses the following SAs:
Code: Select all
1 E spi=0x9890C39 src-address=<STRONGSWAN_IP> dst-address=<MIKROTIK_IP> state=mature
auth-algorithm=sha256 enc-algorithm=aes-cbc enc-key-size=128
auth-key="3d5138d833aa55ceaee9801707c48a4dfb80e29fdc899cba6e39b188a03bc83b"
enc-key="c30de4b999ba413af5da8d2d8e024001" add-lifetime=48m2s/1h3s replay=128
2 E spi=0xCFF9F231 src-address=<MIKROTIK_IP> dst-address=<STRONGSWAN_IP> state=mature
auth-algorithm=sha256 enc-algorithm=aes-cbc enc-key-size=128
auth-key="e9d1eefcb8250c7b5795c381e7c9eb7b44e67064fb1a90334335d372f5efadb3"
enc-key="a856120a0824bb2fc60aa565482d303e" addtime=aug/21/2018 10:18:10 expires-in=51m18s
add-lifetime=48m2s/1h3s current-bytes=35703 current-packets=144 replay=128
Restarting the connection immediately fixes this until the next time that a key mismatch happens after a rekey.
The log entries of the rekeying on the strongSwan side don't look very suspicious to me:
Code: Select all
2018-08-21T10:18:09.616057+02:00 09[IKE] <conn_name|1> queueing CHILD_REKEY task
2018-08-21T10:18:09.616674+02:00 09[IKE] <conn_name|1> activating new tasks
2018-08-21T10:18:09.617223+02:00 09[IKE] <conn_name|1> activating CHILD_REKEY task
2018-08-21T10:18:09.617782+02:00 09[IKE] <conn_name|1> establishing CHILD_SA conn_name{2}
2018-08-21T10:18:09.618394+02:00 09[CFG] <conn_name|1> proposing traffic selectors for us:
2018-08-21T10:18:09.618955+02:00 09[CFG] <conn_name|1> 10.11.0.0/16
2018-08-21T10:18:09.619505+02:00 09[CFG] <conn_name|1> 10.10.0.0/16
2018-08-21T10:18:09.620064+02:00 09[CFG] <conn_name|1> proposing traffic selectors for other:
2018-08-21T10:18:09.620697+02:00 09[CFG] <conn_name|1> 10.50.0.0/16
2018-08-21T10:18:09.621381+02:00 09[CFG] <conn_name|1> configured proposals: ESP:AES_CBC_128/HMAC_SHA2_256_128/MODP_2048/NO_EXT_SEQ
2018-08-21T10:18:09.636466+02:00 09[ENC] <conn_name|1> generating CREATE_CHILD_SA request 30 [ N(REKEY_SA) SA No KE TSi TSr ]
2018-08-21T10:18:09.637105+02:00 09[NET] <conn_name|1> sending packet: from <STRONGSWAN_IP>[4500] to <MIKROTIK_IP>[4500] (496 bytes)
2018-08-21T10:18:10.571371+02:00 12[NET] <conn_name|1> received packet: from <MIKROTIK_IP>[4500] to <STRONGSWAN_IP>[4500] (480 bytes)
2018-08-21T10:18:10.572847+02:00 12[ENC] <conn_name|1> parsed CREATE_CHILD_SA response 30 [ No KE TSi TSr SA ]
2018-08-21T10:18:10.590638+02:00 12[CFG] <conn_name|1> selecting proposal:
2018-08-21T10:18:10.591675+02:00 12[CFG] <conn_name|1> proposal matches
2018-08-21T10:18:10.592406+02:00 12[CFG] <conn_name|1> received proposals: ESP:AES_CBC_128/HMAC_SHA2_256_128/MODP_2048/NO_EXT_SEQ
2018-08-21T10:18:10.593170+02:00 12[CFG] <conn_name|1> configured proposals: ESP:AES_CBC_128/HMAC_SHA2_256_128/MODP_2048/NO_EXT_SEQ
2018-08-21T10:18:10.593884+02:00 12[CFG] <conn_name|1> selected proposal: ESP:AES_CBC_128/HMAC_SHA2_256_128/MODP_2048/NO_EXT_SEQ
2018-08-21T10:18:10.594612+02:00 12[CFG] <conn_name|1> selecting traffic selectors for us:
2018-08-21T10:18:10.595126+02:00 12[CFG] <conn_name|1> config: 10.11.0.0/16, received: 10.10.0.0/16 => no match
2018-08-21T10:18:10.595605+02:00 12[CFG] <conn_name|1> config: 10.10.0.0/16, received: 10.10.0.0/16 => match: 10.10.0.0/16
2018-08-21T10:18:10.596273+02:00 12[CFG] <conn_name|1> selecting traffic selectors for other:
2018-08-21T10:18:10.596780+02:00 12[CFG] <conn_name|1> config: 10.50.0.0/16, received: 10.50.0.0/16 => match: 10.50.0.0/16
2018-08-21T10:18:10.597258+02:00 12[CHD] <conn_name|1> using AES_CBC for encryption
2018-08-21T10:18:10.597731+02:00 12[CHD] <conn_name|1> using HMAC_SHA2_256_128 for integrity
2018-08-21T10:18:10.598220+02:00 12[CHD] <conn_name|1> adding inbound ESP SA
2018-08-21T10:18:10.598695+02:00 12[CHD] <conn_name|1> SPI 0xcff9f231, src <MIKROTIK_IP> dst <STRONGSWAN_IP>
2018-08-21T10:18:10.599168+02:00 12[CHD] <conn_name|1> adding outbound ESP SA
2018-08-21T10:18:10.599642+02:00 12[CHD] <conn_name|1> SPI 0x09890c39, src <STRONGSWAN_IP> dst <MIKROTIK_IP>
2018-08-21T10:18:10.600129+02:00 12[IKE] <conn_name|1> CHILD_SA conn_name{26} established with SPIs cff9f231_i 09890c39_o and TS 10.10.0.0/16 === 10.50.0.0/16
2018-08-21T10:18:10.600670+02:00 12[IKE] <conn_name|1> reinitiating already active tasks
2018-08-21T10:18:10.601294+02:00 12[IKE] <conn_name|1> CHILD_REKEY task
2018-08-21T10:18:10.601800+02:00 12[IKE] <conn_name|1> closing CHILD_SA conn_name{21} with SPIs cca08482_i (29988 bytes) 0f456b61_o (47112 bytes) and TS 10.10.0.0/16 === 10.50.0.0/16
2018-08-21T10:18:10.602274+02:00 12[IKE] <conn_name|1> sending DELETE for ESP CHILD_SA with SPI cca08482
2018-08-21T10:18:10.602744+02:00 12[ENC] <conn_name|1> generating INFORMATIONAL request 31 [ D ]
2018-08-21T10:18:10.603211+02:00 12[NET] <conn_name|1> sending packet: from <STRONGSWAN_IP>[4500] to <MIKROTIK_IP>[4500] (80 bytes)
2018-08-21T10:18:10.605572+02:00 08[NET] <conn_name|1> received packet: from <MIKROTIK_IP>[4500] to <STRONGSWAN_IP>[4500] (96 bytes)
2018-08-21T10:18:10.606048+02:00 08[ENC] <conn_name|1> parsed INFORMATIONAL response 31 [ ]
2018-08-21T10:18:10.606520+02:00 08[IKE] <conn_name|1> CHILD_SA closed
2018-08-21T10:18:10.607076+02:00 08[IKE] <conn_name|1> activating new tasks
2018-08-21T10:18:10.607624+02:00 08[IKE] <conn_name|1> nothing to initiate
Has anyone experienced something similar to this before? As mentioned, the issue is somewhat hard to debug as it only occurs rarely and I haven't found a way to reproduce it. Any pointers are greatly appreciated.