
OVPN Server on RB1100AHx2 sporadically unresponsive

Posted: Thu Sep 07, 2017 9:55 pm
by jryanhill
I have an RB1100AHx2 that sporadically stops responding to inbound OVPN requests. Existing OVPN connections stay up and keep working, but new connections are not accepted and never appear in the Mikrotik's log. Rebooting the Mikrotik resolves the issue; disabling and re-enabling the OVPN server does not. This happens every 1 to 2 weeks and can affect the Mikrotiks at multiple locations.

I have multiple locations with the following setup: a Cisco Meraki MX appliance on the WAN, and an RB1100AHx2 as the default gateway on the LAN, which in turn uses the Meraki as its own default gateway. The Meraki's VPN capabilities do not cover certain NAT requirements, which is why we have the Mikrotik in place. We have port 1194 forwarded to the Mikrotik for OVPN, since L2TP/IPSec can't be forwarded in due to the Meraki's limitations. The setup works great as is, except for the OVPN server randomly failing. The failure can happen at any of the locations.

My first thought was that the Meraki is somehow blocking the inbound port forward. I doubt this, since a reboot of the Mikrotik alone fixes the issue and existing OVPN connections continue to work. I have not run a packet capture to confirm that the traffic is reaching the Mikrotik, but since the fix is a reboot of the Mikrotik, I have no reason to believe it isn't. I do plan to run a packet capture the next time the issue occurs. Because the failures are weeks apart, I don't recall for certain, but I believe I ran torch on the interface and saw incoming 1194 traffic with no response traffic going back to the public IP the new connection came from.

My next thought was too many connections. However, at any given time there are only about 10 connections, split between users and other Mikrotiks. (The issue affects both OVPN clients on computers and OVPN clients on Mikrotiks at other locations.)

The next question was the RouterOS version. We were previously running 6.34.3 and have been on 6.40.3 since Tuesday night (Sept 5th). As of today, the issue is still occurring.

I have 9+ years of networking experience, beginning with Mikrotiks, so I don't believe this will be a simple issue, but I've certainly been wrong before. My guess is that something in RouterOS or the hardware is causing the failure.

I believe I have provided all the information I can, but may have forgotten something. Any help is appreciated in advance, and I'll be happy to answer further questions.

Re: OVPN Server on RB1100AHx2 sporadically unresponsive

Posted: Sat Sep 09, 2017 6:19 pm
by Paternot
Did you look at the logs? Send them to an external server, and crank up the log level. With luck, something will come up.
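
If there is no syslog server at hand, a bare-bones receiver is easy to improvise. What follows is only a sketch in Python, assuming the router is set up to send its remote log as plain syslog over UDP to this machine (514 is the usual syslog port). The names `make_server`, `SyslogHandler`, and the `mikrotik.log` file are made up for illustration.

```python
import socketserver
from datetime import datetime, timezone


class SyslogHandler(socketserver.BaseRequestHandler):
    """Append every received UDP datagram to the server's log file."""

    def handle(self):
        # For UDP servers, self.request is a (payload_bytes, socket) pair.
        data, _sock = self.request
        line = data.decode("utf-8", errors="replace").rstrip()
        stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
        sender = self.client_address[0]
        with open(self.server.log_path, "a", encoding="utf-8") as fh:
            fh.write(f"{stamp} {sender} {line}\n")


def make_server(bind_addr="0.0.0.0", port=514, log_path="mikrotik.log"):
    """Build a UDP server that logs to log_path (hypothetical default name)."""
    server = socketserver.UDPServer((bind_addr, port), SyslogHandler)
    server.log_path = log_path  # read back by SyslogHandler.handle
    return server
```

Running `make_server().serve_forever()` keeps it listening; binding to port 514 normally needs elevated privileges, so a higher port may be more convenient if the router is pointed at it. Since the messages land on another machine, whatever the router logs right before new connections stop survives the reboot.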

I have the same RB1100AHx2, running as firewall/router/OpenVPN server. It has about 12 constant VPN connections, plus another 8 or 9 sporadic ones. Until recently it was running RouterOS 6.37.3, if memory serves me right. When I upgraded it to 6.40.3 the router had an uptime of 300 days. Now, with version 6.40.3, its uptime is only 3 days.

Never had a single glitch.

Have you considered electrical problems? Surges, fluctuations, noise on the line?

Re: OVPN Server on RB1100AHx2 sporadically unresponsive

Posted: Tue Sep 12, 2017 4:17 pm
by jryanhill
The logs did not show any connections coming in, nor any other major issues, even with the OVPN and error topics being written to disk.

As for electrical problems, they are possible, since these routers are at a client site that I am not at regularly. However, our monitoring shows no fluctuation on the Merakis, switches, or any other devices, and there is no downtime. If it were a surge or noise on the line, I would expect the device either to go completely offline or to return to normal after a few minutes.

Re: OVPN Server on RB1100AHx2 sporadically unresponsive

Posted: Tue Sep 12, 2017 4:52 pm
by Paternot
This is weird. The sniffer would be my next step. Just to make sure the packets are really getting to the Mikrotik, and it isn't some protection from the Meraki.
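
Alongside the sniffer, a quick connection probe can help narrow it down. This is only a sketch, assuming the OVPN server listens on TCP 1194 (as far as I know, OVPN on RouterOS 6.x is TCP-only); `probe_tcp` is just an illustrative name.

```python
import socket


def probe_tcp(host, port=1194, timeout=5.0):
    """Attempt one new TCP connection and classify what happened.

    Returns "open", "refused", "timeout", or "error: <detail>".
    """
    try:
        # The connection is closed again as soon as the handshake succeeds.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"   # something answered with a reset
    except socket.timeout:
        return "timeout"   # no answer at all within the timeout
    except OSError as exc:
        return f"error: {exc}"
```

Run it once from a machine on the LAN against the Mikrotik's LAN address, and once from outside against the public IP. If the LAN probe also times out while the server is in the failed state, the Meraki is out of the picture, because that traffic never passes through it. Note that "open" only proves the TCP handshake completed, not that the OpenVPN handshake would succeed.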

Speaking of which: couldn't it be something along these lines? Maybe the Meraki is overreacting to some rare use of the network. Something like heavy traffic from several nodes being interpreted as a DoS?

Re: OVPN Server on RB1100AHx2 sporadically unresponsive

Posted: Tue Sep 12, 2017 6:02 pm
by jryanhill
I don't think it would be the Meraki overreacting, since it would continue to overreact after the Mikrotik was rebooted. After rebooting the Mikrotik, it comes back up with no issues. Furthermore, existing connections continue to work with no issues.

Re: OVPN Server on RB1100AHx2 sporadically unresponsive

Posted: Wed Sep 13, 2017 5:20 pm
by Paternot
Found something that may be your problem: MTU size. From the documentation (I checked version 2.1 onward - applies to all):

"--tun-mtu n
Take the TUN device MTU to be n and derive the link MTU from it (default=1500). In most cases, you will probably want to leave this parameter set to its default value.

The MTU (Maximum Transmission Units) is the maximum datagram size in bytes that can be sent unfragmented over a particular network path. OpenVPN requires that packets on the control or data channels be sent unfragmented.

MTU problems often manifest themselves as connections which hang during periods of active usage.

It's best to use the --fragment and/or --mssfix options to deal with MTU sizing issues."

Re: OVPN Server on RB1100AHx2 sporadically unresponsive

Posted: Mon Sep 18, 2017 4:46 pm
by jryanhill
Existing connections do not hang. Only new connections do not work. Furthermore, the MTU is indeed set to 1500 on all connections.