techmngr
just joined
Topic Author
Posts: 9
Joined: Wed Mar 08, 2017 9:38 am

OSPF and BGP Issues

Fri Jun 16, 2017 12:58 pm

Hi Sir/s,

I just would like to check if anyone has encountered the issue we're having now on our 1072s.

In the past 12 hours, something unusual has happened three times: Internet services suddenly stop working.

If we check the logs, we would just see OSPF and BGP going down.

We'd resolve it by rebooting one of the 1072s - both are connected to each other.

Could something be triggering this behavior? No changes whatsoever were made before the issue occurred. The configuration has been in place for years, and if this were a configuration issue, why would a reboot solve it rather than a configuration change?

Version is 6.33.2. CPU and memory utilization are low, so I guess that is not the issue.

Any suggestions on what else I can check?

Thanks in advance.
 
airbanduk
newbie
Posts: 45
Joined: Mon Jun 12, 2017 2:30 pm

Re: OSPF and BGP Issues

Fri Jun 16, 2017 1:09 pm

Have you tried a later firmware release? When was the last update and configuration change made?

I've been using 6.35 on the 1072 and they've been really stable. The only time I've seen OSPF play up without a config change is on wireless links when the signal degrades; the remote router then seems to need a reboot to reconnect, for some reason.
 
techmngr
just joined
Topic Author
Posts: 9
Joined: Wed Mar 08, 2017 9:38 am

Re: OSPF and BGP Issues

Fri Jun 16, 2017 1:24 pm

Here's an example of what I see in the logs. Unfortunately, it just says OSPF and BGP went down.

12:15:25 route,ospf,info OSPFv2 neighbor 10.0.0.1: state change from Full to Init
12:16:00 system,info,account user noc logged in from 103.25.176.2 via winbox
12:16:13 interface,info <customer> link down
12:17:16 route,bgp,error HoldTimer expired
12:17:16 route,bgp,error RemoteAddress=45.64.80.146
12:17:37 route,bgp,error Received notification
12:17:37 route,bgp,error Hold timer expired, subcode=0
12:18:16 route,bgp,info Failed to open TCP connection: No route to host
12:18:16 route,bgp,info RemoteAddress=45.64.80.146
12:18:17 route,bgp,error Received notification
12:18:17 route,bgp,error Hold timer expired, subcode=0
12:18:24 route,ospf,info OSPFv2 neighbor 10.0.0.1: state change from ExStart to 2-Way
12:18:28 route,bgp,info Failed to open TCP connection: Network is unreachable
12:18:28 route,bgp,info RemoteAddress=10.0.0.1
12:18:32 route,ospf,info OSPFv2 neighbor 10.0.0.1: state change from ExStart to Init
12:18:36 route,bgp,info Failed to open TCP connection: No route to host
12:18:36 route,bgp,info RemoteAddress=45.64.80.146
12:18:56 route,bgp,info Failed to open TCP connection: No route to host
12:18:56 route,bgp,info RemoteAddress=45.64.80.146
12:19:12 route,bgp,info Connection opened by remote host
12:19:12 route,bgp,info RemoteAddress=10.0.0.1
12:19:16 route,bgp,info Failed to open TCP connection: No route to host
12:19:16 route,bgp,info RemoteAddress=45.64.80.146
 
techmngr
just joined
Topic Author
Posts: 9
Joined: Wed Mar 08, 2017 9:38 am

Re: OSPF and BGP Issues

Fri Jun 16, 2017 1:29 pm

sir,

Just 6 hours ago I upgraded to the latest bugfix version, 6.37.5, and so far the issue hasn't resurfaced. As for configuration changes: none. No changes were made in the hours and days before the incident occurred. There is no wireless configuration on the 1072s either; we currently use them as edge routers, since we're an ISP. Is it safe to assume that this is not a configuration issue?

thanks for your response.
 
airbanduk
newbie
Posts: 45
Joined: Mon Jun 12, 2017 2:30 pm

Re: OSPF and BGP Issues

Fri Jun 16, 2017 1:49 pm

Those errors are exactly what I see on CCR1009/1016 in the access network when the wireless links cause the neighbours to drop. On one side the neighbour comes up in 'Full' state, but the other cycles through the OSPF FSM in the way you've shown. I have to reboot the one that thinks it's Full to bring the neighbours back up correctly. As you don't have wireless links I can't say why it might be happening, but the symptoms seem identical.

Again, no config changes on our CCRs before this happens. If the wireless signals are tuned to a strong level, the problem disappears. That suggests to me the cause is a bad link, but the CCR must have a bug somewhere that stops OSPF from forming correctly again. I've tried using different OSPF link types (broadcast, nbma, ptp, ptmp) and none of them have solved the issue. I've resorted to a script to automatically reboot the router that thinks it's 'Full', but in the core/edge I don't see how you could do this.
 
techmngr
just joined
Topic Author
Posts: 9
Joined: Wed Mar 08, 2017 9:38 am

Re: OSPF and BGP Issues

Fri Jun 16, 2017 2:06 pm

sir,

In Cisco there is a "sh tech" command whose output we can analyze. Does MikroTik have a similar command? I'm a newbie with MikroTik, and I was hoping to find something beyond the normal log files that would give me a clue as to what is triggering the sudden, random loss of OSPF and BGP.

thanks.
 
StubArea51
Trainer
Posts: 1742
Joined: Fri Aug 10, 2012 6:46 am
Location: stubarea51.net

Re: OSPF and BGP Issues

Fri Jun 16, 2017 3:07 pm

supout.rif is the equivalent of a show tech in the Cisco world. You can log into your account and view the contents as well as send it into MikroTik with a ticket.

https://wiki.mikrotik.com/wiki/Manual:S ... utput_File
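
For reference, the file can also be generated from the terminal while the fault is still present. A minimal sketch, assuming RouterOS v6; the file name is only an example:

```
# Generate a support output file before rebooting, so it captures the fault state
/system sup-output name=ospf-bgp-incident.rif
# Confirm the file was written, then download it (for example via Winbox) for the viewer or a ticket
/file print where name~"incident"
```

Taking the file before the reboot matters here, because the reboot clears the state that the file is meant to capture.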
 
StubArea51
Trainer
Posts: 1742
Joined: Fri Aug 10, 2012 6:46 am
Location: stubarea51.net

Re: OSPF and BGP Issues

Fri Jun 16, 2017 3:10 pm

As far as RouterOS version goes, I advise all of my clients to run bugfix code, as it is much more stable in production. One other practice that can contribute to OSPF/BGP instability is running a lot of mismatched versions on the routers. 6.37.5 bugfix has worked well for a lot of our clients that depend on BGP/OSPF.
 
techmngr
just joined
Topic Author
Posts: 9
Joined: Wed Mar 08, 2017 9:38 am

Re: OSPF and BGP Issues

Mon Jun 19, 2017 1:55 pm

Hi Sir, thanks for your response. So far I've upgraded both my 1072s to 6.37.5 and will try to upgrade 2 x 1036s this weekend to the same bugfix version. I was able to generate a supout.rif file and open it in the supout viewer. My question is: when is the best time to generate the file, right after the unusual behavior is encountered? The log files are deleted every time the router is rebooted, and in my case a reboot is what resolves the issue, temporarily at least. I was hoping for a way to find out what triggers the behavior. Thanks again.
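
On the point about logs being lost at reboot, one option is to send the routing topics to the built-in disk logging action instead of memory. A minimal sketch, assuming RouterOS v6 with the default logging action named disk; the file name and line counts are arbitrary examples:

```
# Write OSPF and BGP messages to a file on disk so they survive a reboot
/system logging action set [find name=disk] disk-file-name=routing-log disk-lines-per-file=2000 disk-file-count=2
/system logging add topics=ospf action=disk
/system logging add topics=bgp action=disk
```

Disk logging writes to the router's storage, so it is probably worth keeping the topic list narrow and removing the rules once the trigger has been found.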
 
techmngr
just joined
Topic Author
Posts: 9
Joined: Wed Mar 08, 2017 9:38 am

Re: OSPF and BGP Issues

Mon Jun 19, 2017 2:00 pm

Thank you, sir, though we don't have any wireless features enabled on either 1072. The last thing I did was delete files from my HDD, since I noticed it had reached 80% utilization, which left only 20% free space. Could that be a factor? Would 80% utilization on my HDD cause the router to hang or stop working? As I've noticed, every time it happens the uptime doesn't reset, so technically the router is still up; only my BGP and OSPF neighbors break, and they recover after the reboot. :-(
 
Kevo
Frequent Visitor
Posts: 67
Joined: Wed Oct 12, 2011 1:38 am

Re: OSPF and BGP Issues

Mon Nov 20, 2017 11:56 am

We've seen this problem a couple of times now on our 1072. It looks like something happens with OSPF and then a little while after we get the hold timer error with BGP and the routing fails. After some minutes bgp will come back up. We've only run bugfix releases and this has happened before on 6.37.5 and now on 6.39.3.

Log shows

862 Nov/19/2017 20:42:59 memory route, ospf, info OSPFv2 neighbor 172.17.2.11: state change from ExStart to Down
863 Nov/19/2017 20:43:17 memory route, ospf, info OSPFv2 neighbor 172.17.2.11: state change from ExStart to Down
864 Nov/19/2017 20:43:52 memory route, ospf, info OSPFv2 neighbor 172.17.2.11: state change from Exchange to Down
865 Nov/19/2017 20:44:57 memory route, ospf, info OSPFv2 neighbor 172.17.2.11: state change from ExStart to Down
866 Nov/19/2017 20:45:53 memory route, ospf, info OSPFv2 neighbor 172.17.2.11: state change from Init to Down
867 Nov/19/2017 20:46:55 memory route, bgp, error HoldTimer expired
868 Nov/19/2017 20:46:55 memory route, bgp, error RemoteAddress=111.222.111.123
869 Nov/19/2017 20:47:26 memory route, bgp, info Connection opened by remote host
870 Nov/19/2017 20:47:26 memory route, bgp, info RemoteAddress=111.222.111.123


Is there any way to troubleshoot this when it happens again? I think it's been a few months since it last happened.
 
jmatuska
newbie
Posts: 34
Joined: Tue Aug 24, 2010 12:50 am

Re: OSPF and BGP Issues

Thu Apr 30, 2020 1:24 am

Did anyone find a fix to this issue? We are having the same problem with OSPF neighbors dropping approximately every 12 hours which then causes our BGP peer hold timer to expire and then shortly the OSPF neighbors and BGP peer come back up. This is occurring on a CCR1072-1G-8S+ with version 6.45.8.

Any thoughts?
 
Leonardorortizm
just joined
Posts: 14
Joined: Thu Mar 17, 2022 12:34 am

Re: OSPF and BGP Issues

Wed Jun 15, 2022 7:31 pm

Same problem here, on the same router (1072). Any updates?
 
Leonardorortizm
just joined
Posts: 14
Joined: Thu Mar 17, 2022 12:34 am

Re: OSPF and BGP Issues

Tue Jun 21, 2022 3:19 pm

I have solved the issue. The problem was that the router was receiving a very large number of BGP updates (about 2 million). The router couldn't handle this and killed the BGP process. BGP and OSPF run under the same process, which is why everything goes down when this occurs.

The solution was to talk to the Internet provider and request that they send me only the default route.

I'm not convinced that the MikroTik 1072 can handle full BGP routing. Support has said that updating to RouterOS v7 could solve the initial issue.
 
mrz
MikroTik Support
Posts: 7187
Joined: Wed Feb 07, 2007 12:45 pm
Location: Latvia

Re: OSPF and BGP Issues

Tue Jun 21, 2022 3:47 pm

In ROS v7 BGP and OSPF are separate processes, so that may improve things a bit.
 
StubArea51
Trainer
Posts: 1742
Joined: Fri Aug 10, 2012 6:46 am
Location: stubarea51.net

Re: OSPF and BGP Issues

Tue Jun 21, 2022 5:01 pm

Is there any internal prioritization for the routing processes in ROSv7 so they remain stable under high CPU load?

Cisco and Juniper apply a DSCP marking to routing protocols to keep them prioritized as well as giving them process priority in the control plane.
 
mrz
MikroTik Support
Posts: 7187
Joined: Wed Feb 07, 2007 12:45 pm
Location: Latvia

Re: OSPF and BGP Issues

Tue Jun 21, 2022 5:20 pm

RouterOS does not have a specific traffic prioritisation scheme by default; it is up to you to apply DSCP and set up queues.

As for routing process prioritisation, there are no user-configurable options except the affinity setting to divide protocols into multiple processes. The process itself will always try to "jump" to the core with the lowest load.
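
As an illustration of the "apply DSCP" part, locally generated routing packets could be marked in the mangle output chain. A minimal sketch, assuming IPv4 and the common CS6 (DSCP 48) convention for network control traffic; the comments and the choice of value are assumptions, and RouterOS may already set a precedence value on some of these packets, so a packet capture is worth checking first:

```
# Mark OSPF (IP protocol 89) packets originated by this router as CS6
/ip firewall mangle add chain=output protocol=ospf action=change-dscp new-dscp=48 comment="OSPF to CS6"
# Mark BGP (TCP port 179, either direction) packets originated by this router as CS6
/ip firewall mangle add chain=output protocol=tcp port=179 action=change-dscp new-dscp=48 comment="BGP to CS6"
```

Marking alone does not prioritise anything; queues on this router or on downstream devices still have to act on the DSCP value.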
 
peakwifi
just joined
Posts: 1
Joined: Mon Mar 27, 2023 9:26 pm

Re: OSPF and BGP Issues

Mon Mar 27, 2023 9:45 pm

Good day all,

We have several CCR1072 units connected to ATT fiber using BGP. We have asked them for default route only yet suspect they occasionally blast us with more routes causing a router reboot every 5 minutes. We are running 6.48.6 with the following filters setup, is there any way to further protect the routers and prevent these reboots?

/routing filter
add action=accept chain=ATT-IN prefix=0.0.0.0/0
add action=discard chain=ATT-IN
add action=accept chain=ATT-OUT comment="Site x - Shared Outbound" prefix=34.165.21.0/24
add action=accept chain=ATT-OUT comment="Site x" prefix=34.165.20.0/24 set-bgp-communities=""
add action=accept chain=ATT-OUT comment="Site x" prefix=34.165.22.0/24 set-bgp-prepend=5
add action=discard chain=ATT-OUT

Thanks
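
One further safeguard sometimes used alongside filters like these is a per-peer prefix limit, which tears the session down when the peer sends more prefixes than expected. A minimal sketch, assuming RouterOS v6 and a peer named ATT (the peer name, the limit, and the restart time are placeholders):

```
# "ATT" is a placeholder; use the peer name shown by /routing bgp peer print
# Drop the session if more than 10 prefixes arrive, and wait 10 minutes before re-establishing
/routing bgp peer set [find name="ATT"] max-prefix-limit=10 max-prefix-restart-time=10m
```

Whether the limit is counted before or after the ATT-IN chain is applied should be checked against the RouterOS documentation for the version in use, since that affects how tight the limit can be.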
