RB4011 gradually stops accepting traffic on LAN Gateway bridge
Posted: Mon May 13, 2024 10:44 pm
We solved this issue after a 15-hour day for 2 engineers. The fix took 5 minutes, so I'm posting it in case it helps someone else. The fix was to change the IP address of the affected bridge, then change the gateway setting on the affected servers to match the new gateway IP.
SITUATION
- RB4011 with several LAN bridges
- Firewall rules for Input Chain and Forward Chain both end with a Drop All. All Dropped traffic is logged.
- A mix of IP \ Firewall \ Address Lists and Interface \ Lists is used to refine various (simple) IP Firewall Filter and IP Firewall NAT rules
- Been working fine for years.
- Saturday night, 10:30 pm, at approximately the peak of a massive solar storm, the RDS (Remote Desktop Server), one of 3 virtual servers on the LAN, stops reporting in to our monitoring software
- no worries, not in use over weekend.
- Sunday morning, a small investigation finds the server is up and running, just not responding over the internet
- a restart of the affected server resolves the issue; moderate testing shows all basic functions running clean
- Monday morning 7am, staff arrive and cannot sign in to RDS, even though all servers are running in the VMHost and all are showing as online in our monitoring system
- None of the traffic in question ever showed up in the RB4011 logs, so we repeatedly concluded "it's not the Mikrotik blocking traffic".
- initial investigation showed the RDS and other servers could SOMETIMES ping google.com but at same time COULD NOT PING 8.8.8.8
- after a while, none of the Virtual servers could ping each other, or the gateway, or any IP or URL upstream of the gateway
- MANY fixes tried, in Windows and ESET firewalls, NLAsvc configuration, Network Adapter configurations, etc.
- Found that VM BB8-DC at 192.168.0.17 can ping the Host VMswitch at 192.168.0.250 but cannot ping the Mikrotik Gateway at 192.168.0.1. Somehow the Mikrotik, or something on the VMHOST, is blocking traffic to 192.168.0.1. Tried but could not find what was blocking the traffic.
- moved one of the VMs to another VMHOST, exactly same issue.
- Fixed by crazy "jiggle it" fix: changed the Mikrotik bridge name from SERVER-VMS-bridge back to its old name LANbridge, and changed its address from 192.168.0.1 to 192.168.0.254.
- wtf?? Any ideas anyone why on earth that fix worked?
- anyone doing a PhD in the whole class of "jiggle it" fixes in IT?
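
For anyone retracing the ping tests above, here is a rough Python sketch of the same ladder of checks, run from one of the affected VMs. It is only an illustration, not what we actually ran on the day: the function names (ping, run_checks, first_failure) and the ordering of the checks are made up for this example, and the addresses are the ones from this post. The ping flags assume Windows or Linux; other systems may need different ones.

```python
import platform
import subprocess

# Targets from the post, ordered from nearest to farthest from the VM.
CHECKS = [
    ("Host VMswitch", "192.168.0.250"),
    ("Mikrotik gateway", "192.168.0.1"),
    ("upstream IP", "8.8.8.8"),
    ("upstream name", "google.com"),
]


def ping(target, timeout_s=2):
    """Send one ICMP echo and return True if the ping command reports success.

    The exit code is only a rough signal: some ping implementations can
    exit with 0 on an "unreachable" reply, so read the raw output if a
    result looks odd.
    """
    if platform.system() == "Windows":
        cmd = ["ping", "-n", "1", "-w", str(timeout_s * 1000), target]
    else:
        # Linux-style flags assumed (-W is a timeout in seconds there).
        cmd = ["ping", "-c", "1", "-W", str(timeout_s), target]
    try:
        done = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s + 3,
        )
        return done.returncode == 0
    except (OSError, subprocess.TimeoutExpired):
        return False


def run_checks(checks=CHECKS, probe=ping):
    """Probe every target in order and return a list of (label, ok) pairs."""
    return [(label, probe(target)) for label, target in checks]


def first_failure(results):
    """Return the label of the first failed check, or None if all passed."""
    for label, ok in results:
        if not ok:
            return label
    return None
```

Calling first_failure(run_checks()) on a VM in the state described above would be expected to name the Mikrotik gateway as the first hop that fails, which points at the gateway address itself rather than at anything upstream. The probe parameter is there so the logic can be exercised without touching the network.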