I'm trying to make a fully redundant setup which includes sham links:
- 2x CE devices in HQ
- 2x PE MikroTiks in HQ
- 2x PE MikroTiks in a remote office: one connected via a DSL link, the other via a 4G link
- I configured 4x VPN tunnels between those MikroTiks: HQ1 - remote1; HQ1 - remote2; HQ2 - remote1; HQ2 - remote2
- The remote office also has a backdoor link via another remote office
The setup I made works as expected, but upon simulating various failure scenarios, I ran into some stability issues with the sham links:
- When I simulate a failure of the internet uplink on HQ1, the OSPF neighborship on HQ1 (with the sham link peer) remains in state 'Full' on both sides. In this scenario, traffic is blackholed because the OSPF process on the HQ CE routers does not know that the sham link is down. If HQ1 stopped announcing the sham link neighborship, the OSPF process on the HQ CE routers would reroute traffic via HQ2, which still has connectivity.
- When OSPF does detect the neighbor as down (for example when a VPN has a temporary issue), the OSPF neighborship is often not renegotiated properly once the communication path is restored. The neighbor state then remains 'Init' on one end and 'Down' on the other. Manual action (disabling and re-enabling the sham link) is needed to restore the neighborship. I would expect the neighborship to be restored automatically.
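To make the two cases concrete, the behaviour I would expect can be sketched roughly as the decision logic below. This is only an illustration in Python, not RouterOS syntax: `ShamLinkStatus`, `decide_action`, the `tunnel_reachable` check and the action names are all hypothetical names I made up for this sketch, and the assumption that the stuck states are exactly 'Init' and 'Down' comes from what I observed above.

```python
from dataclasses import dataclass


@dataclass
class ShamLinkStatus:
    # Hypothetical inputs: how they are obtained (ping, netwatch, ...) is not specified here.
    tunnel_reachable: bool  # can the sham link's remote endpoint be reached at all?
    neighbor_state: str     # OSPF neighbor state as reported, e.g. "Full", "Init", "Down"


# States in which the neighborship was observed to get stuck after the path came back.
STUCK_STATES = {"Init", "Down"}


def decide_action(status: ShamLinkStatus) -> str:
    """Return the action a watchdog would take: "disable", "reset" or "none"."""
    if not status.tunnel_reachable:
        # Case 1: the path is dead, so the sham link should go down
        # (even if the neighbor still shows 'Full') and the CE routers can reroute.
        return "disable"
    if status.neighbor_state in STUCK_STATES:
        # Case 2: the path is back but the neighborship is stuck,
        # so repeat the manual fix: disable and re-enable the sham link.
        return "reset"
    # Path is up and the neighbor is 'Full' or still converging: leave it alone.
    return "none"


if __name__ == "__main__":
    print(decide_action(ShamLinkStatus(tunnel_reachable=False, neighbor_state="Full")))
    print(decide_action(ShamLinkStatus(tunnel_reachable=True, neighbor_state="Init")))
```

Intermediate states such as 'ExStart' are deliberately left untouched in this sketch, so that a neighborship that is still coming up is not reset over and over.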
Is there some way to make sure that the OSPF neighborship over sham links is automatically broken and recovered in case of link failure?
Is there some kind of keep-alive/negotiation on the sham links that can be tuned?
Kind regards,
Bert