RB1000 + Full BGP Table + OSPF Downlinks

Posted: Sat Oct 24, 2009 7:33 pm
by synologic
Hi,

we're using one RB1000 with one tier 1 upstream provider which advertises the full bgp table and one peering link which advertises about 7700 prefixes.
Downstream, there are routers that advertise and receive loopback addresses from the RB1000 with hello-interval 1s and dead-timer 3s.

When the tier 1 provider goes down and then comes back up, the OSPF sessions reset and the CPU stays at 100%, which I believe is what causes the OSPF sessions to reset.

On the tier 1 bgp neighbor, there's no route filtering other than a single rule which marks the prefixes received with a certain community.
The RB1000 is route-reflector for the rest of the network, all other peers have bgp sessions with this router and receive only the default route and internal prefixes.

Do you have any suggestions on what could cause this problem and how to work around it ?

Regards,
Viorel

Re: RB1000 + Full BGP Table + OSPF Downlinks

Posted: Sun Oct 25, 2009 4:04 pm
by Eising
Hi, this is unfortunately a known problem that has no real workaround. The only thing I can suggest to you is, if you are not running 4.1, you should try and upgrade. If this doesn't help, consider using an x86 system instead. Some people have reported more stable BGP sessions by using a powerful server instead of the RB1000.

Re: RB1000 + Full BGP Table + OSPF Downlinks

Posted: Mon Oct 26, 2009 1:20 am
by synologic
Yes, I am using 4.1, unfortunately.
Needless to say, this is a big disappointment for me, as I had high hopes for the RB1000.

I would use a more powerful x86 system, but first I need to understand the reason for the high CPU load, as it could happen on the x86 system too. Unfortunately I cannot test before deploying an x86 server to replace the RB1000 (which I'm currently running without the global prefix table), so I need to know whether this will resolve the situation, or whether simply having the full BGP table is a CPU hog.

Thanks,
Viorel

Re: RB1000 + Full BGP Table + OSPF Downlinks

Posted: Mon Oct 26, 2009 12:29 pm
by Eising
I'm using RB1000s too, and to a certain extent have similar issues.
I can only suggest that you contact support@mikrotik.com and raise this as an issue.
The more pressure that can be put on MikroTik to further strengthen their BGP implementation, the better.

Re: RB1000 + Full BGP Table + OSPF Downlinks

Posted: Mon Oct 26, 2009 2:30 pm
by eagle
Just curious: does this happen only when BGP is combined with OSPF? Or does the CPU load also show up when BGP sessions drop and the internal network is just switched, with no OSPF?

Re: RB1000 + Full BGP Table + OSPF Downlinks

Posted: Mon Oct 26, 2009 8:55 pm
by Eising
I'm actually not running OSPF on my border routers yet, and I see it as session resets when I apply filters. This has actually improved with the latest versions.

Re: RB1000 + Full BGP Table + OSPF Downlinks

Posted: Mon Oct 26, 2009 9:13 pm
by changeip
Your hold timer shouldn't be 3s... and the hello timer is probably way too short in reality. Try switching to higher values and see what happens. 180s hold and 30s hello are the norm, I believe.

Re: RB1000 + Full BGP Table + OSPF Downlinks

Posted: Mon Oct 26, 2009 9:24 pm
by gmsmstr
I would assume receiving all of your routes would be the issue rather than the OSPF.

I.e., when your BGP peer comes up, it consumes 100% CPU while receiving the routes. This is something that the PowerRouter (multi-core) does much better. It's not really an issue with RouterOS, but more a matter of a single CPU and/or no limit on CPU usage while receiving those routes.

On our PowerRouter 732, as well as some other x86 boxes, we have multiple cores. When the peer comes up and the full routing table comes in, instead of 100% CPU it's either 50% plus normal traffic or, as on our 2282, around 12-15% CPU to load the routing table.

Maybe I missed this, but just an FYI. So yes, an x86 box with multi-CPU enabled would help.

Re: RB1000 + Full BGP Table + OSPF Downlinks

Posted: Tue Oct 27, 2009 7:19 pm
by synologic
Changeip, indeed it is: hello is at 1 second and the dead timer is at 3 seconds. But again, what's the use of running BGP over OSPF if not fast reconvergence?

Gmsmstr, your PowerRouter would do nicely with a CF/SD option; I chose the RB1000 because of its lack of moving parts (aside from the fans) :)

In case this won't be solved, should I dare to ask for IS-IS support? :)

Regards,
Viorel

Re: RB1000 + Full BGP Table + OSPF Downlinks

Posted: Tue Oct 27, 2009 7:20 pm
by gmsmstr
The only moving parts are the fans in the unit. ;) We use a specialized CF card for the OS, and if you want a hard drive option for caching, you can choose SSD drives as well :)

Re: RB1000 + Full BGP Table + OSPF Downlinks

Posted: Thu Oct 29, 2009 5:51 pm
by vlada1
Change the router: buy a Cisco 7200 or something else (an ASR1002 is OK), and problem solved.

Re: RB1000 + Full BGP Table + OSPF Downlinks

Posted: Thu Oct 29, 2009 6:04 pm
by gmsmstr
You could, but why! No need!

Re: RB1000 + Full BGP Table + OSPF Downlinks

Posted: Thu Oct 29, 2009 7:21 pm
by changeip
BGP isn't made for fast convergence... Try setting it back to default and see if you still have problems (other than routes that stick too long). Then you can use check-gateway=arp or ping to disable those routes when things are down.

Re: RB1000 + Full BGP Table + OSPF Downlinks

Posted: Thu Oct 29, 2009 10:21 pm
by nz_monkey
An RB1000 based on the new MPC8572E multi-core PPC processors from Freescale would possibly fix the problem, hint hint ;)

Re: RB1000 + Full BGP Table + OSPF Downlinks

Posted: Thu Oct 29, 2009 10:24 pm
by synologic
BGP isn't for fast convergence; that's why I'm using OSPF to distribute loopbacks via multiple links. If one link goes down, BGP won't reset, because the loopbacks are always reachable. So the BGP process taking all the CPU from the system is rather bad.

I would rather see this problem fixed than change routers, since I will need the global table on downstream routers and this problem is preventing me from doing the proper setup :)

Thanks,
Viorel

Re: RB1000 + Full BGP Table + OSPF Downlinks

Posted: Thu Oct 29, 2009 10:29 pm
by gmsmstr
7200s do the same thing when they first receive full BGP tables.

Re: RB1000 + Full BGP Table + OSPF Downlinks

Posted: Wed Nov 18, 2009 9:27 pm
by hedele
7604s with Sup32 Engines also tend to break down a lot when getting full BGP feed. Even the Sup720 massively suffers in those situations.

Re: RB1000 + Full BGP Table + OSPF Downlinks

Posted: Wed Nov 18, 2009 9:35 pm
by gmsmstr
Correct, the CPU gets maxed out while receiving and bringing in those routes.

Re: RB1000 + Full BGP Table + OSPF Downlinks

Posted: Wed Nov 25, 2009 8:36 pm
by synologic
The Sup32 is not really intended to hold full tables; the Sup720/RSP720 are. However, I still think this has to be solved by MikroTik. It's just stupid to have BGP take all the CPU and not leave anything for other time-sensitive tasks.
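
To illustrate why the 1s hello / 3s dead timers discussed in this thread are so sensitive to CPU starvation, here is a rough Python sketch. It is only a toy model and not how RouterOS actually schedules OSPF: it assumes hellos are due at fixed whole-second intervals, that every hello falling inside a CPU stall is simply never sent, and that the neighbor drops the adjacency once the silence exceeds the dead interval. The function name, the 5-second stall and the 120-second horizon are made up for the example. The 10s/40s pair is the common OSPF default on broadcast links.

```python
def adjacency_survives(hello_interval, dead_interval, stall_start, stall_length, horizon=120):
    """Toy model: True if the neighbor never waits longer than dead_interval for a hello.

    All values are whole seconds. Hellos are scheduled every hello_interval
    seconds, and any hello that falls inside the stall window is never sent.
    """
    stall_end = stall_start + stall_length
    sent = [t for t in range(0, horizon + 1, hello_interval)
            if not (stall_start <= t < stall_end)]
    # Silence the neighbor observes between each pair of consecutive hellos.
    gaps = [later - earlier for earlier, later in zip(sent, sent[1:])]
    return all(gap <= dead_interval for gap in gaps)


if __name__ == "__main__":
    # Hypothetical 5-second stall starting at t=10 (the thread gives no figure).
    print(adjacency_survives(1, 3, stall_start=10, stall_length=5))
    print(adjacency_survives(10, 40, stall_start=10, stall_length=5))
```

In this model the same 5-second stall produces a 6-second silence with 1s/3s timers, which is past the dead interval, but only a 20-second silence with 10s/40s timers, which is well inside it. That is the trade-off being argued here: the aggressive timers buy fast reconvergence only as long as the OSPF process keeps getting CPU time.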