WireGuard - load balancing two different providers

pawlisko · Mon Jun 13, 2022 5:31 pm

All,

Quick question, probably complicated answer.

I have 4 WG tunnels in use at the same time. Traffic is sent to 4 destinations based on either the computer's IP or the destination IP address. The config was forged here: viewtopic.php?t=184487

Now I need to expand on this.

Let's say that I have VPN provider A, which is working. I also have VPN provider B, and their config works as well. But I can't get 2 concurrent WG tunnels that use the same table to work.

How to load balance it? So two would be active - let's say that provider A is preferred but if that link is not working it pushes traffic to provider B, when provider A is back it pushes route back to provider A.

I appreciate actual code - I don't need yet another Ph. D.

Thanks in advance

sindy · Mon Jun 13, 2022 6:01 pm

How to load balance it? So two would be active - let's say that provider A is preferred but if that link is not working it pushes traffic to provider B, when provider A is back it pushes route back to provider A.

The behaviour you describe is not load distribution but mere failover.

But leaving that aside, the task of failover/load distribution between two WireGuard (or other VPN except bare IPsec) tunnels is exactly the same as the task of failover/load distribution between two WANs with NAT, and for that there are multiple recipes here on the forum, like this one.

pawlisko · Mon Jun 13, 2022 7:02 pm

But leaving that aside, the task of failover/load distribution between two WireGuard (or other VPN except bare IPsec) tunnels is exactly the same as the task of failover/load distribution between two WANs with NAT, and for that there are multiple recipes here on the forum, like this one.

I went through the recipe and I am confused as hell. Some of them are calling for VRF, some are doing something else. As I use Mangle, that adds to the confusion.

And yes, you are right, maybe I need something else. So for sure I need failover, but also, if possible, some kind of speed prioritization or balancing. Provider A is usually much faster than B, but at certain speeds (>100 Mbps or so) it would be faster to use Provider B for new connections.

Any ideas, or links where it is shown more simply?

I have now 1 WAN, 1 6-in-4 tunnel, and 4 active WG.

sindy · Mon Jun 13, 2022 7:19 pm

Some of them are calling for VRF, some are doing something else. As I use Mangle, that adds to the confusion.
...
for sure I need failover, but also, if possible, some kind of speed prioritization or balancing. Provider A is usually much faster than B, but at certain speeds (>100 Mbps or so) it would be faster to use Provider B for new connections.
...
Any ideas, or links where it is shown more simply?

So all in all you ask for a link where someone describes a solution exactly for your situation which is fairly unique in multiple aspects. Such a link is not likely to exist, so you have to answer yourself the basic question - do I want to spend time learning that (answer yes inevitably leads to another PhD), or do I want to have it done once and I don't care that I don't understand how it works (answer yes leads to hiring a consultant to do it for you)?

Also, my Polish is not very strong, but did you actually want to say that you wanted to use Provider A for all connections while its load is below 100 Mbps, and once it gets above that threshold, start routing new connections via Provider B?

pawlisko · Mon Jun 13, 2022 7:35 pm

So all in all you ask for a link where someone describes a solution exactly for your situation which is fairly unique in multiple aspects. Such a link is not likely to exist, so you have to answer yourself the basic question - do I want to spend time learning that (answer yes inevitably leads to another PhD), or do I want to have it done once and I don't care that I don't understand how it works (answer yes leads to hiring a consultant to do it for you)?

Also, my Polish is not very strong, but did you actually want to say that you wanted to use Provider A for all connections while its load is below 100 Mbps, and once it gets above that threshold, start routing new connections via Provider B?

Second question - yes, I would love to switch new connections to provider B at around 100 Mbps.

First question - I don't need a situation that matches my scenario 100%; 10% would be fine. But I need help with mangle rules, as I use them but don't fully understand them. My current mangle setup was discussed here on the forum, which gave me much appreciated help.

sindy · Mon Jun 13, 2022 9:27 pm

So let's split the requirement into parts, and let's simplify it down to two WG tunnels for a start.

1. To implement load management integrated with failover, you need a pair of routing tables for different link load situations. In particular, one table will prefer WG A and use WG B only if WG A is not available, and the other one will prefer WG B and use WG A only if WG B is not available.
2. To only allow traffic to go via available (fully operating) WG tunnels, you need something that constantly monitors their transparency all the way to the internet and makes all routes via a given tunnel inactive whenever it is not transparent.
3. To switch between the two routing tables, you need to monitor traffic via WG A and, depending on its current volume, affect which routing table your mangle rules classifying the traffic will assign to new connections.
4. To keep a connection on the WG tunnel via which it has been initiated, no matter whether it was the preferred one or the backup one, you need to use connection marking.

Parts 1 and 2 are the subject of Implementation -> Basic Setup in @Chupaka's first post in the topic linked above, except that ISP1 will be WG A and ISP2 will be WG B. The only catch here is that the canary addresses (Host1, Host2) must not be used for anything other than monitoring the tunnel transparency, and any traffic to each of them must not be able to take any route other than the one via its corresponding tunnel. @Chupaka's description doesn't need to deal with this explicitly, because this requirement is automatically met where all default routes in routing table main are recursive ones, but that may not be your case - you may not want to push all traffic via the WG tunnels. Routes like dst-address=HostX distance=50 type=blackhole are a sufficient solution - if the WG interface goes down, the route to HostX via that interface becomes inactive, but the blackhole one is still the best match for HostX, so the packet to HostX won't take any other route.

Don't let the rest of the topic distract you - all you need is the initial post. VRF and other divergent topics are not applicable to your case.

Name the two tables prefer-WG-A and prefer-WG-B.
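
On RouterOS 7, parts 1 and 2 applied to the two tunnels might look roughly like the sketch below. It is untested and rests on assumptions: the interfaces are named WG-A and WG-B, 192.0.2.1 and 192.0.2.2 are only placeholders for the canary hosts Host1 and Host2 (replace them with public addresses that answer ping and that you use for nothing else), and the scope/target-scope values follow the usual recursive-route recipe. Note that v7 writes the blackhole route with the blackhole flag, whereas v6 used type=blackhole.

```
/routing/table
add name=prefer-WG-A fib
add name=prefer-WG-B fib

/ip/route
# Canary hosts, each reachable only through its own tunnel (placeholder addresses)
add dst-address=192.0.2.1/32 gateway=WG-A scope=10 comment="Host1 via WG-A"
add dst-address=192.0.2.2/32 gateway=WG-B scope=10 comment="Host2 via WG-B"
# Fallbacks: if a tunnel interface goes down, pings to its canary are dropped
# instead of leaking out via another route
add dst-address=192.0.2.1/32 blackhole distance=50
add dst-address=192.0.2.2/32 blackhole distance=50

# Table that prefers WG A and falls back to WG B
add dst-address=0.0.0.0/0 gateway=192.0.2.1 routing-table=prefer-WG-A distance=1 target-scope=11 check-gateway=ping
add dst-address=0.0.0.0/0 gateway=192.0.2.2 routing-table=prefer-WG-A distance=2 target-scope=11 check-gateway=ping
# Table that prefers WG B and falls back to WG A
add dst-address=0.0.0.0/0 gateway=192.0.2.2 routing-table=prefer-WG-B distance=1 target-scope=11 check-gateway=ping
add dst-address=0.0.0.0/0 gateway=192.0.2.1 routing-table=prefer-WG-B distance=2 target-scope=11 check-gateway=ping
```

The default routes are recursive: each one becomes inactive when its canary stops answering ping, so the lower-priority route in the same table takes over until the canary responds again.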

Part 3 - there is /tool/traffic-monitor, which executes a script whenever the traffic volume in a given direction on a given interface exceeds or falls below a threshold. I have never had high enough traffic for a long enough period of time to test it in practice, but the following should work:
/tool/traffic-monitor
add interface=WG-A name=WG-A-exceeds on-event=exceeded threshold=100000000 trigger=above
add interface=WG-A name=WG-A-falls-below on-event=fell-below threshold=80000000 trigger=below

To avoid drilling a hole into the flash by constantly updating configuration, the two scripts, exceeded and fell-below, will add and remove, respectively, a dynamic address-list item list=WG-A-full address=0.0.0.0/0:

/system/script
add name=exceeded source={/ip/firewall/address-list/add list=WG-A-full address=0.0.0.0/0 timeout=1w}
add name=fell-below source={/ip/firewall/address-list/remove [find list=WG-A-full]}
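
Since the traffic monitor is hard to trigger on demand, one way to check the two scripts is to run them by hand and look at the address list. This is just a sketch that assumes the script and list names used above:

```
# Simulate "WG-A load exceeded the threshold"
/system/script/run exceeded
/ip/firewall/address-list/print where list=WG-A-full

# Simulate "WG-A load fell below the threshold"; the list should be empty again
/system/script/run fell-below
/ip/firewall/address-list/print where list=WG-A-full
```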

The mangle rules assigning routing-mark values prefer-WG-A or prefer-WG-B to initial packets of connections that should be initiated via the WG tunnels will match on src-address-list=!WG-A-full and src-address-list=WG-A-full, respectively (or dst-address-list if you use src-address-list to choose what to send via WG tunnels and what not; if you use a combination of both, you'll have to cascade the rules using action=jump).

Part 4:

- Add two more routing tables, use-only-WG-A and use-only-WG-B, each consisting of a single default route via the corresponding gateway. Use the actual gateway, not the canary address: if a packet belonging to a connection that has been established via WG-A gets routed via WG-B, it is effectively lost anyway, so there is no point in tracking the tunnel transparency for these packets.
- To the initial LAN->WGx packet of each connection, assign routing-mark value prefer-WG-A or prefer-WG-B based on the static classification criteria (what should go via WG, and using which of the two routing tables in particular).
- To packets received via WG-A or via WG-B and not bearing any connection-mark yet, assign a connection-mark value ended-up-on-WG-A or ended-up-on-WG-B, depending on which WG interface they came in through.
- To those LAN->anywhere packets that bear a connection-mark, assign a routing-mark value use-only-WG-A or use-only-WG-B based on the corresponding connection-mark value.
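
Put together, parts 3 and 4 could be sketched for RouterOS 7 as follows. This is an untested illustration, not a drop-in config. It assumes an interface list named LAN, tunnel interfaces named WG-A and WG-B, routing tables prefer-WG-A and prefer-WG-B that already exist, and a hypothetical address list via-WG holding the destinations that should go through the tunnels (so that src-address-list stays free for the WG-A-full check).

```
/routing/table
add name=use-only-WG-A fib
add name=use-only-WG-B fib

/ip/route
# Plain default routes via the tunnel itself, no canary tracking
add dst-address=0.0.0.0/0 gateway=WG-A routing-table=use-only-WG-A
add dst-address=0.0.0.0/0 gateway=WG-B routing-table=use-only-WG-B

/ip/firewall/mangle
# Connections already pinned to a tunnel stay on that tunnel
add chain=prerouting in-interface-list=LAN connection-mark=ended-up-on-WG-A action=mark-routing new-routing-mark=use-only-WG-A passthrough=no
add chain=prerouting in-interface-list=LAN connection-mark=ended-up-on-WG-B action=mark-routing new-routing-mark=use-only-WG-B passthrough=no

# New connections: pick the table depending on whether WG-A is flagged as full
add chain=prerouting in-interface-list=LAN connection-state=new connection-mark=no-mark dst-address-list=via-WG src-address-list=!WG-A-full action=mark-routing new-routing-mark=prefer-WG-A passthrough=no
add chain=prerouting in-interface-list=LAN connection-state=new connection-mark=no-mark dst-address-list=via-WG src-address-list=WG-A-full action=mark-routing new-routing-mark=prefer-WG-B passthrough=no

# First packet coming back through a tunnel pins the connection to it
add chain=prerouting in-interface=WG-A connection-mark=no-mark action=mark-connection new-connection-mark=ended-up-on-WG-A passthrough=no
add chain=prerouting in-interface=WG-B connection-mark=no-mark action=mark-connection new-connection-mark=ended-up-on-WG-B passthrough=no
```

The connection mark is taken from the interface on which the response arrives rather than from the table chosen for the first packet, because the prefer-* tables may have sent that packet out via the backup tunnel.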

To prevent packets that should go via WG from leaking via plaintext WAN, add the following routing rules:
/routing/rule/
add routing-mark=prefer-WG-A action=lookup-only-in-table table=prefer-WG-A
add routing-mark=use-only-WG-A action=lookup-only-in-table table=use-only-WG-A
add routing-mark=prefer-WG-B action=lookup-only-in-table table=prefer-WG-B
add routing-mark=use-only-WG-B action=lookup-only-in-table table=use-only-WG-B
