Hi everyone
Hoping that some of you will share your experiences of big routing tables on Mikrotik and the issues that they cause. I am trying to make a decision as to whether to stick with the MT platform (upgrade a bunch of routers) or find something else.
So for me the two biggest limitations of the platform are the poor BGP performance and the slow routing table updates. I assume the two are interlinked. MT have said that BGP is multithreaded in ROS7 but we still have no actual release date and even when I questioned staff at MUM in London they had no idea either. I think we have to assume that ROS7 is vaporware at the moment.
On the worst-affected router (a CCR1036) it takes around 10 minutes from typing an /ip route add... command to the route being inserted into the forwarding table. Likewise, querying the routing table takes an age: 15 minutes to run '/ip route print where static=yes' is pretty common. To give an idea, this router has three BGP feeds of the full global routing table, circa 600k v4 routes each at present, plus roughly another 600k v4 routes learnt from 600 peers. There are also v6 peers on the same router. We don't do anything fancy on the device - no MPLS, no VPLS, nothing that needs connection tracking, etc - it is literally just being used as a BGP router.
This makes working with the router very difficult, but it is the 'good vs fast vs cheap' trade-off that we accept in order to get several gigs of routing ability into 1U, 36 watts and a <£1k purchase price.
However, we are at a crossroads now in that we desperately need to get more 10G ports onto the network, which involves spending significant amounts to replace several 1036s with 1072s. I really want to stick with Mikrotik, but I am scared that it will turn out to be a poor investment and that these issues will never be resolved.
Why am I telling you this? Well I am hoping others with similar setups will share their experiences, particularly of:
1. Has anything improved in the latest ROS versions? (Admittedly that router does run a fairly old version). I checked through the changelogs and nothing is listed.
2. Has anyone gone from a 1036 to a 1072 and seen this problem get better or worse? My fear here is due to the lack of multithreading: the 1036 is clocked at 1.2GHz, whereas the 1072 has more cores but they are clocked at 1GHz. In theory, if the routing engine is single threaded, it will run slower on the 1072.
3. Has anybody tried implementing OpenFlow in a real-world environment? As I understand it, this can hand off the BGP and routing table decisions to an x86 box so that the CCR just becomes the packet-pushing device.
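To put a rough number on the worry in point 2, here is a back-of-envelope sketch in Python. It assumes the routing process is single threaded and purely clock bound, so that run time scales with the ratio of per-core clock speeds. That is an assumption, not a measurement: cache, memory and architecture differences between the two models are ignored, and the function name is just something made up for illustration.

```python
# Back-of-envelope estimate only. ASSUMPTION: the routing process is single
# threaded and purely clock bound, so duration scales with the clock ratio.

def scaled_duration(minutes_on_old: float, old_ghz: float, new_ghz: float) -> float:
    """Scale a task duration by the ratio of per-core clock speeds."""
    return minutes_on_old * old_ghz / new_ghz

CCR1036_GHZ = 1.2  # per-core clock quoted in the post
CCR1072_GHZ = 1.0  # per-core clock quoted in the post

if __name__ == "__main__":
    # Durations are the ones observed on the CCR1036 described above.
    for label, minutes in [("route add", 10.0), ("route print", 15.0)]:
        estimate = scaled_duration(minutes, CCR1036_GHZ, CCR1072_GHZ)
        print(f"{label}: {minutes:.0f} min on 1036 -> ~{estimate:.0f} min on 1072")
```

Under that assumption the 1072 would be about 20% slower for these operations, i.e. the 10 minute route insert becomes roughly 12 minutes. Real-world numbers from anyone who has made the switch would be far more useful than this estimate.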
As above, I would very much appreciate any experiences you are prepared to share.
Thanks, Chris