Routing-Test

Fri Jun 09, 2006 1:50 pm

new routing test, which supposedly fixes some BGP issues:
http://www.mikrotik.com/download/routin ... 25_new.zip

spirosco · Fri Jun 09, 2006 8:13 pm

Good work guys, unfortunately the problem with the BGP session crashes still exists.
Apart from the problem with BGP route removal in 2.9.24, the crashes arrived with 2.9.25.
So I guess it has something to do with the latest changes you have made.

I have sent the supout.

Thanks

Beccara · Sun Jun 11, 2006 1:51 am

When can we expect to see this in the routeros-x86 package? Some of us use the single-file upgrade path.

nikhil · Sun Jun 11, 2006 9:34 am

Whoever uses this, please post back with your experience.

heimdal · Mon Jun 12, 2006 2:32 am

With the standard routing-test on 2.9.25 I have a problem with hold time expiring. If I stay with only one ISP, no problems.

If I stay with two ISPs, the BGP sessions go somewhere faaaaar away... and keep resetting again and again... sounds like my "dream" is to stay without internet connections.

Thanks again for the NEW STABLE 2.9 with the suggested routing-test.

I'm still waiting for other problems.

Another night I can't sleep with MikroTik in trouble...
Thank you guys!! Maybe next time you will want me to pay for these "boogie nights".

nikhil · Mon Jun 12, 2006 7:15 am

You mean the new routing-test is very good ?

Mon Jun 12, 2006 9:20 am

appears so, when mentioning problems, he says `standard test`.

nikhil · Mon Jun 12, 2006 9:30 am

normis, can you shed some light on [Ticket#2006060516000369]? I don't know what to do now. It's definitely not my machines or the NICs; it's something internal to MT. What to do? The interfaces stop working.

Mon Jun 12, 2006 9:42 am

I just sent you an email

nikhil · Mon Jun 12, 2006 10:01 am

got it. replied back could you reply back ?

cmit · Mon Jun 12, 2006 10:37 am

Interesting way to do e-mail communication

SCNR

Christian Meis

nikhil · Mon Jun 12, 2006 10:45 am

How else can i notify them ?

cmit · Mon Jun 12, 2006 11:00 am

No offense intended. I myself did something similar using the chat room several times

Christian Meis

Mon Jun 12, 2006 11:46 am

how else? email itself is a notification, isn't it? i read email more often than forum

nikhil · Mon Jun 12, 2006 12:17 pm

I did reply back with my problem but no response to that ticket yet .

Mon Jun 12, 2006 1:15 pm

you have to understand that we have a queuing mechanism, plus we are people too. problem solving is not as fast as writing emails. we will write as soon as we have an answer

nikhil · Mon Jun 12, 2006 1:51 pm

I totally appreciate your hard work. I just switched back to routing-test on an identical secondary machine. I don't see any real advantage: it's still at 100% CPU. I have 2 peers sending me 180k routes each. routing-test is slow.

Mon Jun 12, 2006 1:59 pm

did you use routing-test from this topic? we updated it you know ...

heimdal · Mon Jun 12, 2006 2:01 pm

i use package from topic - for last 12 hours everything is OK!!!

4 BGP sessions work great for now.

Thanks from me for fix and for "network" command too.

nikhil, for testing purposes I have another router with different hardware (Intel motherboard, Intel network adapters, Intel CPU), which I call "Trinity" :)

What kind of hardware do you use for your routers?

I have a working router with 2.9.25 and only the new routing-test from this topic.
And this variant works for me.
Think twice and then go ahead again... the latest routing-test must work.

Mon Jun 12, 2006 2:11 pm

I checked nikhil's supout.rif and he is using the standard routing package, not test

nikhil · Mon Jun 12, 2006 2:12 pm

im using exactly that 2.9.25 new but im seeing 100% cpu while its learning the routes

nikhil · Mon Jun 12, 2006 2:14 pm

im using supermicro P4SCT+ (all on board intel CSA nic) + PCIX Intel pro 1000 MT dual port.

P4 3ghz
1gig ram
2.9.25 with routing-test new
It is running right now, learning 90k routes (50k from one peer and 40k from the other), but it's at 100% CPU all right.

Mon Jun 12, 2006 2:19 pm

new supout.rif please. last one was with incorrect package

heimdal · Mon Jun 12, 2006 2:31 pm

Sorry for this post, but I must explain to nikhil.
When I saw this:

I checked nikhil's supout.rif and he is using the standard routing package, not test

I downloaded all_packages, next...

unzip it and go to the directory containing the *.npk files.

Delete routing.npk (this is the normal package that comes with all_packages).

Then download routing-test-2.9.25.npk into that directory (link provided by normis in this topic).

Then FTP to your router, set binary transfer type, upload the packages and reboot.

The working routing-test is then up. The next step is to configure BGP.

P.S. My working router has only PCI-X slots and one dual-port Intel gigabit card. The other has 2 x single-port Intel gigabit cards and works...
If your link to other hardware is 100 Mbit/s, set the gigabit adapters to a fixed 100 Mbit/s full duplex (don't use auto-negotiated links).

This useful tip is from my experience with 100 Mbit/s media converters...

Cheers

DON'T USE the WINBOX routing menu. Telnet or SSH to the router and go...
If you have trouble with the configuration, I'm around... I may be able to help you.

That's my way... and it works.
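
The upload steps above could be scripted roughly like this. It is only a sketch using Python's standard ftplib, not a MikroTik tool: the host, credentials and directory are placeholders, and it assumes the router accepts FTP logins. `storbinary` always transfers in binary mode, which matches the "set binary transfer type" advice. The reboot still has to be done on the router afterwards.

```python
from ftplib import FTP
from pathlib import Path


def packages_to_upload(directory):
    """Return the .npk files to send, skipping the standard routing.npk."""
    return sorted(
        p for p in Path(directory).glob("*.npk") if p.name != "routing.npk"
    )


def upload_packages(host, user, password, directory):
    """Upload every selected package over FTP in binary mode.

    host, user and password are placeholders for your own router.
    """
    with FTP(host) as ftp:
        ftp.login(user, password)
        for package in packages_to_upload(directory):
            with package.open("rb") as handle:
                # storbinary switches the connection to binary (TYPE I) itself.
                ftp.storbinary(f"STOR {package.name}", handle)
```

Calling `packages_to_upload` first on the unzipped directory is a cheap way to confirm that routing.npk is really left out before anything is sent.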

nikhil · Mon Jun 12, 2006 2:43 pm

Trying to dish it out but its 31 minutes and 190k routes (100 from one and 90k from other peer) still 100% cpu

supout.rif taking a lot of time !

Normis can you come on chat ill give you access to the system

heimdal · Mon Jun 12, 2006 2:46 pm

nikhil, I saw this kind of behaviour on my 2.9.25 with the first routing-test that was provided.

I think you need to install the routing-test-2.9.25.npk from this topic!!!

I have 2 x 180,000+ routes, each table loaded in 30-60 seconds,
without this 100% CPU load.

My CPU is 3 GHz too, with 1 GB RAM...

nikhil · Mon Jun 12, 2006 2:52 pm

Thanks heimdal

Well, here's what I have:
All packages were running 2.9.25.
Uploaded the new routing-test npk file via FTP.
Disabled regular routing, enabled routing-test, rebooted.

routing-test is enabled and has my peers in it; only my prefixes are gone. I made filters. I'm advertising connected routes now, though earlier I was using network. I saw 100% CPU for some time, so I rebooted.
I have a GigE PCI-X card connected to a Cisco 2970 all-GigE switch. I've been seeing 100% CPU utilization for the last 40 minutes, since the 2nd reboot.

I wish it would work the way it works for you.

heimdal · Mon Jun 12, 2006 2:55 pm

nikhil, did you get routing-test from this topic???

In this version from normis the network command is back again (that makes me happy)

nikhil · Mon Jun 12, 2006 2:56 pm

After 48 minutes CPU usage finally went down, once it had learnt everything (185k and 155k from the two peers). Got a supout.rif.
I'm going to send it to normis.

nikhil · Mon Jun 12, 2006 2:58 pm

Yes I am using that only
[admin@2] routing bgp network> print
Flags: X - disabled
# NETWORK
0 x
1
2
3
4
5
[admin@2] routing bgp network>

I have taken the numbers out of the list; it did show my network numbers there.

heimdal · Mon Jun 12, 2006 3:03 pm

nikhil, in the terminal:

when you press Tab twice you get completion suggestions for commands.
That means you must run

enable 0

add network=xxx.xxx.xxx.xxx/xx disabled=no

and then the network is advertised from your router

nikhil · Mon Jun 12, 2006 3:46 pm

Heimdal

I did an export in routing bgp network

/ routing filter
add chain=bgp-in invert-match=no action=discard comment="" disabled=no
add chain=bgp-out prefix=a.a.a.a prefix-length=24 invert-match=no action=accept comment="" disabled=no
add chain=bgp-out prefix=e.e.e.e prefix-length=24 invert-match=no action=accept comment="" disabled=no
add chain=bgp-out prefix=d.d.d.d prefix-length=24 invert-match=no action=accept comment="" disabled=no
add chain=bgp-out prefix=c.c.c.c prefix-length=23 invert-match=no action=accept comment="" disabled=no
add chain=bgp-out prefix=b.b.b.b prefix-length=24 invert-match=no action=accept comment="" disabled=no
add chain=bgp-out prefix=f.f.f.f prefix-length=26 invert-match=no action=accept comment="" disabled=no
add chain=bgp-out prefix=c.c.c.c prefix-length=24 invert-match=no action=accept comment="" disabled=no
add chain=bgp-out prefix=c.c.c.c prefix-length=24 invert-match=no action=accept comment="" disabled=no
add chain=bgp-out invert-match=no action=discard comment="" disabled=no
/ routing bgp instance
set default name="default" as=11111 router-id=IP.IP.IP.IP redistribute-static=no redistribute-connected=no \
redistribute-rip=no redistribute-ospf=no redistribute-other-bgp=no out-filter="" client-to-client-reflection=yes \
comment="" disabled=no
/ routing bgp peer
add instance=default remote-address=IP1 remote-as=AS1 tcp-md5-key="" multihop=no route-reflect=no hold-time=1m30s \
ttl=1 in-filter= out-filter=bgp-out comment="" disabled=no
add instance=default remote-address=IP2 remote-as=AS2 tcp-md5-key="" multihop=no route-reflect=no hold-time=0s \
ttl=1 in-filter= out-filter=bgp-out comment="" disabled=no
/ routing bgp network
add network=b.b.b.b/24 disabled=no
add network=f.f.f.f/26 disabled=no
add network=c.c.c.c/23 disabled=no
add network=d.d.d.d/24 disabled=no
add network=e.e.e.e/24 disabled=no
add network=a.a.a.a/24 disabled=no

The ip data has been replaced
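
A minimal sketch of how the bgp-out chain in this export would be evaluated, assuming rules are checked top to bottom, the first match decides, and a rule without a prefix matches everything. The prefixes are documentation-range stand-ins for the replaced addresses, and exact prefix matching plus the accept-by-default fallback are simplifying assumptions; the real matching rules in routing-test may differ.

```python
import ipaddress

# Hypothetical stand-ins for the anonymised prefixes in the export.
BGP_OUT = [
    ("192.0.2.0/24", "accept"),
    ("198.51.100.0/24", "accept"),
    (None, "discard"),  # final rule without a prefix: matches everything
]


def evaluate(chain, route):
    """Return the action of the first rule in the chain that matches the route."""
    route = ipaddress.ip_network(route)
    for prefix, action in chain:
        if prefix is None or ipaddress.ip_network(prefix) == route:
            return action
    return "accept"  # assumed behaviour when no rule matches
```

With a chain like this only the listed prefixes are announced to the peers; everything else falls through to the final discard rule.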

grzesjan · Tue Jun 13, 2006 12:13 am

I had everything OK for the whole weekend and today my router went crazy and I had to disable all BGP peers and leave only uplink to announce my prefixes

I have sent supout files and logs, but MikroTik stays silent.

And for the network command: if I setup prefixes in network I don't see them in /routing bgp advertise print X.X.X.X - is it normal? Can it be corrected?

Gregor

advantz · Tue Jun 13, 2006 4:16 am

Please test it for 7days or so
thank you

nikhil · Tue Jun 13, 2006 7:00 am

Mine has been up for 24 hours, but I am rejecting all incoming routes, as accepting them causes 100% CPU and keeps it there. I've tried everything except installing ALL packages instead of only the ones needed, like routing-test, system and a few more. I don't see a reason why I should install wireless, arlan etc. when I don't need them.

changeip · Tue Jun 13, 2006 8:07 am

I am still seeing 100% cpu after sync sometimes as well, but not all the time. We have a development lab setup with 2 bgp peers, 2 mt routers, and the 2 routeros boxes using iBGP between them. The routers were configured directly after a /system reset. The latest routing-test works 99% better but still has a lingering cpu problem. I am sure it will be fixed (I hope) and we will go into production when it is. I saw a huge improvement in the last 2-3 updates that came out.

Nikhil, are you doing iBGP peers as well?

Sam

nikhil · Tue Jun 13, 2006 8:40 am

I have 2 MT routers taking full BGP tables from 2 peers in production. I used to use the standard BGP package. Right now I'm rejecting all incoming routes and using static routing to keep our network running, because as soon as I let the router accept routes it goes to 100% CPU utilization. Not using iBGP. I hope they fix this pretty quickly because I have another POP to set up in TX. I prefer MT over FreeBSD + Quagga only because of Winbox, torch etc.; on a regular Unix OS I would need to scroll through tcpdump, which would be too much to handle for 2-300 Mbps of traffic in/out.

heimdal · Tue Jun 13, 2006 9:32 am

For the last 30 hours I got only one error:

Hold timer expired 40 minutes after midnight, then all BGP peers were restarted; for 30 seconds the router had these messages in the log:
Failed to open TCP connection. Operation now in progress
RemoteAddr=peer1 ip
RemotePort=179
Failed to open TCP connection. Operation now in progress
RemoteAddr=peer2 ip
RemotePort=179
Failed to open TCP connection. Operation now in progress
RemoteAddr=peer3 ip
RemotePort=179
Failed to open TCP connection. Operation now in progress
RemoteAddr=peer4 ip
RemotePort=179

then the full BGP tables loaded and everything is OK...
Nobody knows about this hold timer incident; I saw it this morning during my regular log check at the start of the work day...
The router is in a production environment, which means:
connected to 2 ISPs, 4 BGP peers over 4 VLANs,
networks are advertised, everything is fully operational.
For now...

and let the force be with you...

nikhil · Tue Jun 13, 2006 9:50 am

So your network must have gone down when those sessions failed?

nikhil · Tue Jun 13, 2006 2:20 pm

Waiting for something to go down ....

Im rejecting all routes to see how long it actually holds with my peers.

believewireless · Mon Jun 19, 2006 10:40 pm

We have OSPF crashing with 2.9.26. We have to disable all networks and reenable them to get it back up.

Eugene · Tue Jun 20, 2006 12:16 pm

Send a support-output file to support@mikrotik.com

advantz · Tue Jun 20, 2006 1:46 pm

Yeah right, I already sent the supout.rif 2.9.26 w/r-t BGP problems
Thank You

Beccara · Thu Jun 22, 2006 9:28 am

Routing Test in 2.9.26 is fine. We are using it for iBGP on our network with about a dozen nodes running it and haven't had a single issue.

heimdal · Thu Jun 22, 2006 9:45 am

I'm using it for eBGP, 2 ISPs x 185,000 routes; that's my point of view...

Glad to see iBGP is OK... I'll wait for other comments because my router is in production... I don't want to make mistakes.

Beccara · Thu Jun 22, 2006 10:02 am

Why wait for other DIFFERENT network setups to report in? Load it up on a trial box and peer; that's the only way to make sure .26 will work in YOUR setup.

advantz · Thu Jun 22, 2006 12:13 pm

I'm using routing test for eBGP multihop, ospf, rip

My system is :
P4 2.66ghz dual core
2x512mb ddr2
intel 945gnt mobos
3x rtl8139c
1x e1000
1x dom 64mb

Maybe this is a dual-core problem with MikroTik?
MikroTik requires a BGP & debug log; I'll provide this log for 2.9.24 w/ routing-test only.
I don't want to test 2.9.25 and 2.9.26 anymore, they didn't work!

believewireless · Fri Jun 23, 2006 2:28 pm

Our OSPF crashed again and I sent off the supout.rif.

changeip · Fri Jun 23, 2006 7:04 pm

How many people actually have BGP working in production with > 2.9.14 routing-test? Of those how many are using more than 1 peer? We've been testing for months and can't get anything working reliably and I'm close to giving up on it.

Does it have to do with a dual-core cpu being used? I assume not, but who knows. Does it have to do with the quad intel nic that someone else posted about having problems? I assume not, but who knows. I am at my wits end with BGP/MT and am trying to determine if anyone else has a simple BGP setup working in production with > 2.9.14 & routing-test.

We've been running somewhat smooth with 2.9.6 routing-test with 2 bgp peers on the same router, but it locks up every few weeks. We are splitting the bgp functions out to 2 separate border routers (1 per peering) but so far I'm left with 2 fancy supermicro boxes that are giving me nothing in return.

Supouts have been sent as always. I think they are getting tired of me emailing them : )

Short list of experienced problems:
iBGP peer sometimes announces routes that no longer exist in its routing table.
100% cpu forever problems when changing routing filters or disabled/enabling peers.
iBGP peers establish connection but do not exchange routes on a fresh reboot, must disable/enable instance.

(Frustrated thats for sure)

Sam

grzesjan · Fri Jun 23, 2006 7:30 pm

I have similar problems. I use Mikrotik in production (currently 2.9.26) but I have only one BGP full feed and some peerings. When I have discarded all routes from uplink and used default route, situation is rather stable.

I had sometimes "action timeout", have sent them supout, but till now no reply...

Gregor

changeip · Fri Jun 23, 2006 7:38 pm

That's the only way I got 2.9.6 stable: I had to discard most routes. I think we have about 15,000 and it's working okay on that production router. I know that BGP with no incoming routes is working decently, but with 2 separate border routers and trying to get them to distribute the load to the right peer I somewhat need the incoming routes : )

When you say 'some peerings' do you mean to internal routers using ibgp or other outside providers?

Sam

grzesjan · Fri Jun 23, 2006 8:05 pm

[quote="changeip"]When you say 'some peerings' do you mean to internal routers using ibgp or other outside providers?[/quote]

Both - but less than 100 routes each.

Gregor.

advantz · Sat Jun 24, 2006 3:16 am

@changeip
The same things happened to me: same logs, can't re-establish BGP if peers refresh, etc., for 2.9.22-2.9.26 w/routing-test.

I always send supout.rif to MikroTik support; I don't know whether they actually find anything in it.
I will send the BGP + debug log after BGP crashes.
thx

advantz · Tue Jul 11, 2006 4:33 am

I have already tried 2.9.27 routing-test.
I have a problem with Ethernet: pings to one of the BGP peer IPs sometimes time out, causing the BGP hold timer to expire (I think!)
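
For reference, a rough sketch of the standard BGP hold timer rule (RFC 4271): the session is dropped if no KEEPALIVE or UPDATE arrives within the negotiated hold time, and a hold time of 0 disables the timer. The function name and the timings used with it are made up for illustration; keepalives every 30 seconds for a 90 second hold time follow the usual one-third convention.

```python
def first_hold_timer_expiry(message_times, hold_time, observed_until):
    """Return the time (seconds) at which the hold timer expires, or None.

    message_times: sorted arrival times of KEEPALIVE/UPDATE messages,
        counted from session establishment at time 0.
    hold_time: negotiated hold time in seconds; 0 means the timer is not used.
    observed_until: end of the observation window in seconds.
    """
    if hold_time == 0:
        return None
    last_heard = 0
    for arrival in list(message_times) + [observed_until]:
        if arrival - last_heard > hold_time:
            # Nothing was heard for longer than the hold time.
            return last_heard + hold_time
        last_heard = arrival
    return None
```

With a 90 second hold time, losing three keepalives in a row (for example messages at 30 and 60 seconds, then nothing until 180) is already enough to drop the session, so short bursts of packet loss on the link to a peer can produce exactly this symptom.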

Also, the routing table is randomly buggy, e.g.:

state  dst-address    gateway   distance  interface
Db     66.66.66.0/24  10.0.0.1  20        GW1
ADb    66.66.66.0/24  10.0.0.2  20        GW2

but traceroute always goes through GW1=10.0.0.1, damn it!
Note: ADb = active dynamic BGP. This does not affect all routes, only random ones.

changeip · Tue Jul 11, 2006 4:40 am

We have 2 routers in production with 2.9.27 BGP routing-test but cannot accept many incoming routes. We are doing iBGP between the routers and announcing our routes, but that's all... incoming routes from the peers cause things to go haywire. I think iBGP between the 2 is where things get hairy. So far 10 days of uptime though, when filtering out all routes from peers.

BelWave · Thu Jul 13, 2006 6:02 am

Yes, MikroTik BGP is not ready for "prime time" yet. This past weekend we upgraded four core routers to v2.9.27 with routing-test. I can only describe the experience as the most unpleasant one I have ever had with MikroTik.

This was a planned upgrade and we flew in a well known and respected MikroTik Certified Consultant as well as had a seasoned BGP expert on hand for the weekend long upgrades. No amount of planning could have prevented the insurmountable problems we encountered with v2.9.27 and the new BGP module. The icing on the cake was MikroTik's decision to remove support for the Intel Fiber GigE Interfaces we were running on one of our routers. Fortunately we had spare fiber cards with a different chipset that is still supported by MikroTik.

After chasing routes inserting and removing themselves over and over for two days we decided to drop one upstream peer and only take a default route from our other upstream peer. Since then we've been stable.

Apparently MikroTik is unable to handle a full table from two peers without some sort of odd route "churning". All four routers are running 3GHz or faster Pentium 4 CPUs, yet CPU usage was frequently noted at 100%. All routers have 512MB RAM or more. WinBox would basically become worthless as the routers hung at 100% CPU… only rebooting would clear the CPU usage.

Has anybody heard any response from MikroTik regarding the BGP problems?

Best,

Brad

Eugene · Thu Jul 13, 2006 8:13 am

You would help a lot by sending support-output file to support@mikrotik.com

Eugene

changeip · Thu Jul 13, 2006 4:51 pm

Eugene,

Are you still using those 2 test peerings here? If so, maybe I can script something up to add/remove routes similar to what happens on the internet. Maybe taking in 180,000 routes on 2 routers and syncing them between the two using iBGP works fine until there are new announcements during that sync. We had the same problems as BelWave and had to filter the incoming routes to get stable. Now that it's in production it's hard to do any testing with those boxes : )

Sam

changeip · Fri Jul 14, 2006 5:37 pm

I hope you took a supout during this time and sent it to support... they're never going to fix it if they can't tell what's wrong. If you didn't take a supout then please do next time before you reboot.

The problem we had with established connections but not prefixes was to do with the remote-id of the router, it had to be explicitly set on both sides. However, i think the release notes for .27 said this was fixed or something.

Sam

heimdal · Sat Jul 15, 2006 10:28 pm

Why send them a supout??

They have and provide links to Cisco documentation! (But they don't have their implementation of BGP!!!)

And... I said this:

There is an RFC for BGP, so why doesn't BGP work?

Finally, I think I'll leave MikroTik, the "friendly" but not working OS.

For now I'm looking at the Cisco 2xxx series or something from Juniper.

The losses from these MT problems are over...

In this case I completely understand why Cisco and Juniper rule the world...

not "routing to 127.0.0.1"... RULE THE WORLD!!!

Farewell MT people; I'm completely disappointed by the MT OS bugs

advantz · Sun Jul 16, 2006 12:22 am

@heimdal
I'm moving to zebra 0.94, and it is stable and fast enough!! Prefixes are received in less than 10 sec...

Why don't you guys use code from quagga/bird/openbgpd please...
It's under GPL/BSD license
I really love mikrotik...

BelWave · Mon Jul 17, 2006 11:25 pm

Hello Eugene,

Yes, I know we should have thought to generate a support-output file. Even with three people working on the problem nobody was thinking of sending supout files...instead our pants were on fire and we were only thinking of getting the network running smoothly again!

Do you believe we were seeing problems because we were trying to take two full tables? Does MikroTik have any BGP recommendations? e.g. take full routes or only default & connected, hold timer settings, etc. etc...

Any input as to why it takes so long for Winbox to show the routes? Is this a CPU problem? Should we be looking at faster CPUs? Is 512MB RAM enough? How much memory does MikroTik recommend?

Best,

Brad

heimdal · Tue Jul 18, 2006 9:47 am

advantz, the MikroTik people must think before giving us new features...

I'm really disappointed by the repeating problems... or they fix something and break something else...

Can I ask you to send me your conf files for quagga, without your sensitive information? I have some PCs here for testing purposes, and I'm thinking of installing BSD and quagga. Can you help? With confs?

NetTraptor · Tue Jul 18, 2006 2:36 pm

I believe that we have come to a stage where troubleshooting MT routing-test has evolved into a full-time job and a really exciting knowledge quest that you get for a few bobs…

More to come, ghost routes resulting to invalid AS_PATH attributes, million of routes on a peer, filters for every attribute, BGP cpu utilization with timers 1/3, routers to a stand still with bgp debug mode ON, so on and so forth…

As soon as we have all the info collected we will try and post to support…

BelWave · Wed Jul 19, 2006 6:31 am

The "ghost routes" you mention have piqued my interest.

Just tonight I got a call from a client saying he couldn't reach his servers. I tried pinging his servers and got an expired in transit response. I logged into the router giving me the error and tried pinging directly from the router, but got a redirect error from another MT router. Looked at the routing table and clearly the static route was there and correct in showing AS. Only after disabling and then re-enabling the route was I then able to reach the affected subnet.

This is what we were seeing a week ago, only on a MUCH larger scale, while taking two full BGP tables. For the past week we have been taking only a default route from one upstream, and until now we haven't seen any problems.

This time I generated a supout file immediately after re-enabling the AS route. Sent it off to MikroTik Support...I'll be interested in hearing what they have to say.

Best,

Brad

changeip · Wed Jul 19, 2006 7:25 am

routers to a stand still with bgp debug mode ON

This is a problem that can't really be solved: pumping out that much debug in a production environment would kill most routers. I would never leave BGP or related debug rules on while taking in 180k routes; there is too much to log and stay productive. I would LOVE to see a burst option on debugging rules to help with this.

Sam

heimdal · Wed Jul 19, 2006 8:29 am

Help with WHAT, Sam?

With this practice of "send supout.rif"... "send"... "send"?
Those are the answers from MT; why not just get those features working?
Support costs a lot and takes time, people and other expensive resources, when the product ends up working like a jalopy... a wreck...
It costs a lot of time... that's what happens.
Just look at what they say:

"New MikroTik release, but we don't know what we wrote, so please send bugs to us..."

I have a "lab" with 3 quagga routers: from the core I send full tables to 2 of them, and on the 3rd I receive these 2 full BGP tables with a private AS... for now it works fast without problems!!!
And it has the "network" command, and announces networks properly...

advantz · Thu Jul 20, 2006 6:56 am

advantz, mikrotik people must think before to give us some features...

i`m really disappointed by the repeating problems ... or they fix something and break something else ...

Can i ask you to send me conf files for quagga without your sensitive information? I have some PCs here for testing purposes, and am thinking of installing bsd and quagga. Can you help ? With confs ?

quagga documentation is easy to follow given their sample configuration bgpd.conf.sample

please let me know if you still need these confs. Where the hell is the "pm button"!!
OOT

IPVgg · Fri Jul 21, 2006 3:35 pm

I have some problems with the "multihop" option.
My scenario: 2 ASes running zebra 0.94 and 1 AS running MT on an RB352A.
MT receives routes from its neighbors, but doesn't place them in the routing table, and in the routing table (in the interface row) they show an unknown interface.
If I put a filter on incoming BGP packets (passthrough, with nexthop set to the IP of the neighbor), MT puts all routes into the routing table and the interface becomes correct.
I don't think this is correct behaviour... because I've put the MT in place of another FreeBSD router with zebra 0.94.
And the reverse situation: when zebra receives routes from MT (zebra has the multihop option enabled), "she" says that the routes are inaccessible...

changeip · Fri Jul 21, 2006 4:58 pm

Is the next-hop that you are receiving from the peer accessible / routable from MT ? If you receive a nexthop that you're not sitting on you will need to 'set-nexthop' to the next hop that you can reach.

Sam

IPVgg · Mon Jul 24, 2006 10:03 am

Yes, of course. At first we added a static route for that next hop and tested it by ping, but MT didn't install the routing information without filtering incoming BGP with the 'set-nexthop' option.
In other BGP routing daemons the "multihop" option comes with a 'hop count' field (numeric), for when two routers have more than one route to each other. Do you plan to add this option field?

Eugene · Mon Jul 24, 2006 10:42 am

You would help a lot by sending support-output file to support@mikrotik.com

Eugene
Hello Eugene,

Yes, I know we should have thought to generate a support-output file. Even with three people working on the problem nobody was thinking of sending supout files...instead our pants were on fire and we were only thinking of getting the network running smoothly again!

Do you believe we were seeing problems because we were trying to take two full tables? Does MikroTik have any BGP recommendations? e.g. take full routes or only default & connected, hold timer settings, etc. etc...

Any input as to why it takes so long for Winbox to show the routes? Is this a CPU problem? Should we be looking at faster CPUs? Is 512MB RAM enough? How much memory does MikroTik recommend?

Best,

Brad

Personally, I use default hold timer settings. To be on the safe side, I limit incoming prefixes to providers' routes only (as-path-length=2), although I am not implying that the full table (or multiple full tables) do not work.

Winbox is inherently unable to show many things at once. We are going to change this in 2.10. Regarding the memory requirements, look at what /system resource monitor shows you. 512MB should be enough in most cases (multiple tables, filters, etc.)

Eugene

Eugene · Mon Jul 24, 2006 10:44 am

Eugene,

Are you still using those 2 test peerings here? If so, maybe I can script something up to add/remove routes similar to what happens on the internet - maybe taking in 180,000 routes on 2 routers and syncing them between using ibgp works fine until there are new announcements during that sync. We had the same problems as BelWave and had to filter the incoming routes to get stable. Now that it's in production it's hard to do any testing with those boxes : )

Sam

Yes, Sam, we are using these 2 peers (great thanks, btw). No need to do anything additional on your side as we are now able to do pretty extensive testing ourselves.

Eugene · Mon Jul 24, 2006 10:49 am

Yes, of course. At first we added a static route for that next hop and tested it by ping, but MT didn't install the routing information without filtering incoming BGP with the 'set-nexthop' option.
In other BGP routing daemons the "multihop" option comes with a 'hop count' field (numeric), for when two routers have more than one route to each other. Do you plan to add this option field?

You have to consider that by default next hops for bgp routes are not looked up through static routes. Search the manual for "scope" and "target-scope" parameters of routes.

IPVgg · Mon Jul 24, 2006 12:37 pm

Please explain: what is the purpose of looking up the next hop if packets from the device (that is behind it) are already being received?
I've added a static route through WinBox and can’t see the scope option there.
(Only through telnet)...

Eugene · Mon Jul 24, 2006 1:02 pm

Speaking of next hops, a next hop does not always have to be directly reachable in order to route packets over a particular route. Instead, a next hop can be recursively looked up via other routes.

The router does not know anything about how your network is organized. It obeys the rules written in the routing table.
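To make the recursive lookup a little more concrete, here is a minimal Python sketch of the idea, not RouterOS code. It assumes the rule mentioned in this thread (a route can be used to resolve a next hop only if its scope is less than or equal to the target-scope of the route being resolved) and picks the longest matching prefix. The `Route` class, the `resolve_next_hop` helper, and the addresses and scope values are illustrative assumptions.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Route:
    dst: str                 # destination prefix, e.g. "10.1.0.0/16"
    gateway: Optional[str]   # next-hop address, or None for a connected route
    scope: int
    target_scope: int


def resolve_next_hop(route: Route, table: List[Route]) -> Optional[Route]:
    """Return the route used to reach route.gateway, or None if unresolved.

    Assumption: only routes with scope <= route.target_scope are eligible,
    and among those the longest matching prefix wins.
    """
    if route.gateway is None:
        return route  # connected routes need no lookup
    gw = ipaddress.ip_address(route.gateway)
    candidates = [
        r for r in table
        if r is not route
        and r.scope <= route.target_scope
        and gw in ipaddress.ip_network(r.dst)
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda r: ipaddress.ip_network(r.dst).prefixlen)


# Hypothetical table: a connected network, a static route towards the
# multihop neighbor, and a BGP route whose next hop is that neighbor.
connected = Route("192.168.22.0/24", None, scope=10, target_scope=10)
static = Route("192.168.25.0/24", "192.168.22.1", scope=255, target_scope=10)
bgp = Route("10.1.0.0/16", "192.168.25.201", scope=255, target_scope=10)
table = [connected, static, bgp]

unresolved = resolve_next_hop(bgp, table)   # static scope 255 > target-scope 10
static.scope = 10                           # lower the static route's scope
resolved = resolve_next_hop(bgp, table)     # now the static route qualifies
```

With the static route at scope 255 the BGP next hop stays unresolved even though a ping would work; once its scope is no larger than the BGP route's target-scope, the lookup succeeds through it.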

changeip · Mon Jul 24, 2006 5:03 pm

Yes, Sam, we are using these 2 peers (great thanks, btw). No need to do anything additional on your side as we are now able to do pretty extensive testing ourselves.

Eugene,

It has been 100+ degrees here and those 2 quagga test routers are in my garage, which is now 130+ degrees. We are also under power restrictions so I have turned those off for now. Once I move them into the office here tomorrow I will turn them back on so we can do more testing.

IPVgg · Mon Jul 24, 2006 6:26 pm

Excuse my other stupid question… I’ve added a static route with scope<=target-scope and the next hop resolved; all routes from the multihop neighbor were put in the routing table without any filtering.
But now I have another problem: MT redistributes my RIP networks through BGP. The multihop neighbor receives routes from MT, but in that list the next-hop isn’t MT. As next-hop in this list I see the RIP routers… I tried to put an outgoing passthrough filter on that peer with next-hop set to the MT IP, but nothing happened…
When I put a route-map with next-hop set to the MT IP on the multihop neighbor, “zebra” rebooted immediately. Maybe filters that set next-hop do not work with multihop, or I can’t set next-hop to the remote IP address of a BGP router?

patagonia · Mon Jul 24, 2006 8:49 pm

Interesting way to do e-mail communication
SCNR

Christian Meis

Eugene · Tue Jul 25, 2006 7:50 am

Excuse my other stupid question… I’ve added a static route with scope<=target-scope and the next hop resolved; all routes from the multihop neighbor were put in the routing table without any filtering.
But now I have another problem: MT redistributes my RIP networks through BGP. The multihop neighbor receives routes from MT, but in that list the next-hop isn’t MT. As next-hop in this list I see the RIP routers… I tried to put an outgoing passthrough filter on that peer with next-hop set to the MT IP, but nothing happened…
When I put a route-map with next-hop set to the MT IP on the multihop neighbor, “zebra” rebooted immediately. Maybe filters that set next-hop do not work with multihop, or I can’t set next-hop to the remote IP address of a BGP router?

/routing filters export
/routing bgp peer export

IPVgg · Tue Jul 25, 2006 12:47 pm

192.168.22.93 is the IP address of the MT interface; 192.168.25.201 is the remote multihop BGP router. The other peers have no filters.
All routes (that were redistributed by RIP) received by AS 65507
have AS 65506 as next-hop AS, but don't have 192.168.22.93 as next-hop IP address..... In the IP fields I see the IP addresses of the RIP routers.

/ routing bgp peer
add name="peer1" instance=default remote-address=192.168.22.68 remote-as=65512 \
    tcp-md5-key="" multihop=no route-reflect=no hold-time=3m ttl=1 \
    in-filter="" out-filter="" comment="" disabled=no
add name="peer3" instance=default remote-address=192.168.22.84 remote-as=65515 \
    tcp-md5-key="" multihop=no route-reflect=no hold-time=3m ttl=1 \
    in-filter="" out-filter="" comment="" disabled=no
add name="peer2" instance=default remote-address=192.168.25.201 \
    remote-as=65507 tcp-md5-key="" multihop=yes route-reflect=no hold-time=3m \
    ttl=1 in-filter="" out-filter=Last-hope comment="" disabled=no

/ routing filter
add chain=Last-hope invert-match=no action=passthrough \
    set-nexthop=192.168.22.93 comment="" disabled=no

changeip · Tue Jul 25, 2006 10:01 pm

It would be nice to be able to specify a LE or GE statement on the prefix.

The way this works now, I have to know exactly which prefix lengths I want to accept; I can't just say anything >16.

Thx,
Sam

eflanery · Tue Jul 25, 2006 10:19 pm

Prefix length takes a range, i.e.:

/routing filter add chain=foo prefix-length=17-32

Works in Winbox too.
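As a rough illustration of what a range like 17-32 expresses (the equivalent of "GE 17 LE 32" in other daemons), here is a small Python model. It is only a sketch of the matching logic, not of how RouterOS implements it; the helper name `matches_prefix_length` and the treatment of a single number as an exact length are assumptions.

```python
import ipaddress


def matches_prefix_length(prefix: str, length_range: str) -> bool:
    """Return True if the prefix's mask length falls inside length_range.

    length_range is "low-high" (inclusive), or a single number, which this
    sketch treats as an exact length.
    """
    low, _, high = length_range.partition("-")
    high = high or low
    prefix_len = ipaddress.ip_network(prefix).prefixlen
    return int(low) <= prefix_len <= int(high)


# Prefixes taken from route prints elsewhere in this thread.
accepts_24 = matches_prefix_length("10.40.4.0/24", "17-32")   # longer than /16
accepts_16 = matches_prefix_length("10.10.0.0/16", "17-32")   # exactly /16
```

Here the /24 falls inside the range and the /16 does not, which is the "anything >16" behaviour asked for above.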

--Eric

changeip · Tue Jul 25, 2006 10:31 pm

ah, sweet... thank you!

Sam

changeip · Wed Jul 26, 2006 5:43 am

I am troubleshooting multiple routers all peering with each other, using BGP. It seems like AS path length is being somewhat ignored when choosing routes.

The 2 routes below look right, item 7 is active because the AS path length is 1.

 7 ADb dst-address=10.10.0.0/16 gateway=10.0.0.61 
interface=l2tp-to-cip-office gateway-state=reachable
distance=20 scope=255 target-scope=10
bgp-as-path=65505 bgp-origin=incomplete 
 8  Db dst-address=10.10.0.0/16 gateway=10.0.0.65
interface=l2tp-amistad-to-delmar gateway-state=reachable
distance=20 scope=255 target-scope=10
bgp-as-path=65507,65505 bgp-origin=incomplete

But 2 other routes in the same table look wrong: item 19 should be active since its AS path length is 1, but the longer route is being preferred.

 19  Db dst-address=10.40.4.0/24 gateway=10.0.0.61 
interface=l2tp-to-cip-office gateway-state=reachable
distance=20 scope=255 target-scope=10
bgp-as-path=65505 bgp-origin=incomplete 
 20 ADb dst-address=10.40.4.0/24 gateway=10.0.0.65
interface=l2tp-amistad-to-delmar gateway-state=reachable
distance=20 scope=255 target-scope=10
bgp-as-path=65507,65505 bgp-origin=incomplete

Possibly this is behind all the problems we are having when using ibgp to exchange routes with each other - the wrong routes are being chosen? If I disable and re-enable a bgp peer, the last one to come up seems to be the preferred route... doesn't seem right.

2.9.27 routing-test

Eugene · Wed Jul 26, 2006 12:31 pm

Do you have /routing bgp instance <number> ignore-as-path-len set to "no"?

Hammy · Wed Jul 26, 2006 3:45 pm

I just read an article in PC Magazine about 802.11N devices and a paragraph in there made me think of the Mikrotik world.

Software upgrades may help performance, but expecting customers to perform multiple firmware or driver updates to reach minimal functionality is completely unacceptable. So is releasing immature products just to be early to market and treating purchasers as your quality-assurance department. In the end, that hurts both consumers and vendors.

changeip · Wed Jul 26, 2006 4:27 pm

Do you have /routing bgp instance <number> ignore-as-path-len set to "no"?

It is not set on any instances ...

2 name="instance-amistad" as=65505 router-id=10.0.0.61
redistribute-static=yes redistribute-connected=yes redistribute-rip=no
redistribute-ospf=no redistribute-other-bgp=yes out-filter=""
client-to-client-reflection=no

3 name="instance-delmar" as=65505 router-id=10.0.0.1 redistribute-static=yes
redistribute-connected=yes redistribute-rip=no redistribute-ospf=no
redistribute-other-bgp=yes out-filter="" client-to-client-reflection=no

Sam

Russ · Fri Jul 28, 2006 12:55 pm

I'm using the latest 2.9.27 routing test package on 9 routerboard 500's. The routers are connected by fast ethernet and wireless backbone links, all point to point connections run /30’s and all /30’s are in the backbone area.

My clients connect in via PPPoE to wireless access interfaces (on each of the 7 access routers). Each wireless interface has been allocated one /27 ip pool (and an equivalent ppp profile). I have configured the /27’s to route into the OSPF AS as separate areas (i.e. area 1, 2, 3 etc).

At the moment every time a user connects his or her /32 is redistributed to the entire OSPF AS. What I am after is some form of route summarization to route just the /27 as opposed to each /32 that connects. I have searched the forums, and docs to try and find this solution, to no avail.

On cisco I can simply

Router ospf 100
summary-address 10.12.1.0 255.255.255.224

My other issue with OSPF has been mentioned in the forums previously. I have struck this twice now, where the routers on each side of a /30 get stuck in an Init state.

Turning on some debugging reveals that the packets are being sent out on the multicast but from the wrong address. I.e. the /30 was 10.0.0.1/30, the OSPF router id was 10.0.1.254 (interface br0), yet the remote router log reports the packet arriving on 10.0.0.2 from 172.19.0.1 (another interface of the router).

The only way to fix this was to remove the 10.0.0.1 and 10.0.0.2 addresses from both routers and re-apply them, then re-add the OSPF network of 10.0.0.0/30.

Eugene · Fri Jul 28, 2006 12:58 pm

2Sam: If these 2 routes are from different instances, they are not compared by BGP code (AS_PATH length does not matter).
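A small Python model of that statement, using the two 10.40.4.0/24 routes posted above. This is only a sketch of the described behaviour, not MikroTik's code: the function name `best_by_as_path` is made up, and which instance learned which route is an assumption, since the route print does not show it.

```python
def best_by_as_path(routes):
    """Pick one route per (instance, prefix), preferring the shortest AS path.

    Routes learned by different instances never meet in this comparison,
    so AS path length cannot decide between them.
    """
    best = {}
    for route in routes:
        key = (route["instance"], route["dst"])
        current = best.get(key)
        if current is None or len(route["as_path"]) < len(current["as_path"]):
            best[key] = route
    return best


# Assumption: each route was learned by a different instance.
routes = [
    {"instance": "instance-amistad", "dst": "10.40.4.0/24",
     "gateway": "10.0.0.61", "as_path": [65505]},
    {"instance": "instance-delmar", "dst": "10.40.4.0/24",
     "gateway": "10.0.0.65", "as_path": [65507, 65505]},
]

# Two instances: both routes survive the BGP comparison, and something
# other than AS path length has to choose between them.
separate = best_by_as_path(routes)

# Same routes under a single (hypothetical) instance: the shorter path wins.
single = best_by_as_path([dict(r, instance="default") for r in routes])
```

In the two-instance case both candidates are handed on with the same distance, which would be consistent with the "last peer to come up wins" behaviour reported earlier.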

Eugene · Fri Jul 28, 2006 1:21 pm

toRuss: Add a static /27 route to the router. Redistribute static routes instead of connected.

changeip · Fri Jul 28, 2006 4:42 pm

2Sam: If these 2 routes are from different instances, they are not compared by BGP code (AS_PATH length does not matter).

I don't think this is right. The RIB should be 100% independent of which instance it learned it from I believe. So routing table is storing instance somewhere ?

Sam

Russ · Sat Jul 29, 2006 7:20 am

toRuss: Add a static /27 route to the router. Redistribute static routes instead of connected.

That seems like an odd way to do things, what to use as the gateway address? localhost?

Eugene · Mon Jul 31, 2006 9:33 am

Russ: exactly.
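For anyone checking the arithmetic behind this workaround, the following Python sketch shows that one static /27 covers every /32 a PPPoE client in that pool could get, so redistributing the single static route advertises the same address space. The pool is the one from the Cisco example above; the client host numbers are made up for illustration.

```python
import ipaddress

# The summary from the Cisco example: 10.12.1.0 255.255.255.224 is a /27.
pool = ipaddress.ip_network("10.12.1.0/27")

# Hypothetical /32 routes created as PPPoE clients connect.
clients = [ipaddress.ip_network(f"10.12.1.{host}/32") for host in (2, 7, 30)]

# Every client /32 is inside the pool, so the /27 alone is enough.
all_covered = all(client.subnet_of(pool) for client in clients)

# An address from the next pool is not covered by this summary.
outside = ipaddress.ip_network("10.12.1.33/32").subnet_of(pool)

netmask = str(pool.netmask)
pool_size = pool.num_addresses
```

The trade-off is that the /27 is advertised whether or not any client is connected, which is usually acceptable for an access pool.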

changeip · Mon Jul 31, 2006 6:57 pm

Eugene,

I am taking a BGP feed from cymru.com which is a list of bogon addresses. I am setting the routing-mark on these as they come in using routing filters, but for some reason I cannot figure out how to make these routes 'Active' ... I saw your other post about how target scope had something to do with it, tried that, but no luck. Can you look at the following and tell me how I can make these routes active?

[user@cip-office] ip route> print routing-mark=bogons terse
Flags: X - disabled, A - active, D - dynamic, 
C - connect, S - static, r - rip, b - bgp, o - ospf 
 0  Db dst-address=1.0.0.0/8 gateway=10.40.0.1 interface=9-loopback
          gateway-state=reachable distance=20 scope=255 target-scope=10
          routing-mark=bogons bgp-as-path=65333 bgp-med=0 bgp-origin=igp
 1  Db dst-address=2.0.0.0/8 gateway=10.40.0.1 interface=9-loopback
          gateway-state=reachable distance=20 scope=255 target-scope=10
          routing-mark=bogons bgp-as-path=65333 bgp-med=0 bgp-origin=igp
 2  Db dst-address=5.0.0.0/8 gateway=10.40.0.1 interface=9-loopback
          gateway-state=reachable distance=20 scope=255 target-scope=10
          routing-mark=bogons bgp-as-path=65333 bgp-med=0 bgp-origin=igp
 3  Db dst-address=7.0.0.0/8 gateway=10.40.0.1 interface=9-loopback
          gateway-state=reachable distance=20 scope=255 target-scope=10
          routing-mark=bogons bgp-as-path=65333 bgp-med=0 bgp-origin=igp
 4  Db dst-address=10.0.0.0/8 gateway=10.40.0.1 interface=9-loopback
          gateway-state=reachable distance=20 scope=255 target-scope=10
          routing-mark=bogons bgp-as-path=65333 bgp-med=0 bgp-origin=igp
 5   S dst-address=10.40.0.1/32 gateway=10.40.1.1 interface=0-inside
          gateway-state=reachable distance=1 scope=255 target-scope=10
          routing-mark=bogons
 6  Db dst-address=23.0.0.0/8 gateway=10.40.0.1 interface=9-loopback
          gateway-state=reachable distance=20 scope=255 target-scope=10
          routing-mark=bogons bgp-as-path=65333 bgp-med=0 bgp-origin=igp
 7  Db dst-address=27.0.0.0/8 gateway=10.40.0.1 interface=9-loopback
          gateway-state=reachable distance=20 scope=255 target-scope=10
          routing-mark=bogons bgp-as-path=65333 bgp-med=0 bgp-origin=igp
 8  Db dst-address=31.0.0.0/8 gateway=10.40.0.1 interface=9-loopback
          gateway-state=reachable distance=20 scope=255 target-scope=10
          routing-mark=bogons bgp-as-path=65333 bgp-med=0 bgp-origin=igp
 9  Db dst-address=36.0.0.0/8 gateway=10.40.0.1 interface=9-loopback
          gateway-state=reachable distance=20 scope=255 target-scope=10
          routing-mark=bogons bgp-as-path=65333 bgp-med=0 bgp-origin=igp

Item #5 is a static route I put in so that the next-hop is reachable, but it doesn't seem to become active, nor do the bgp routes... I am trying to redistribute these routes to another peer, but since they are not active they won't be redistributed. 10.40.0.1 is on a loopback interface.

Thx,
Sam

changeip · Tue Aug 01, 2006 1:08 am

Figured it out ... I had to add a 'rule' in the rules tab that referenced the routing-mark - just a placeholder I guess. What's even weirder is that I deleted the rule right after verifying that fixed it, and the routes stayed active ... so I don't know why the 'jiggling' fixed it.

Eugene · Tue Aug 01, 2006 8:46 am

The calculation process had probably not finished by the time you were looking at these routes. Though without support-output it's difficult to tell.

Eugene

changeip · Tue Aug 01, 2006 9:11 am

I will see after rebooting if I have to jiggle things again to make them active. The routes were there for probably 2 weeks; they only became active once I referenced the routing table using the rules tab. If I can reproduce it I will send a supout before and after.

How is progress on bgp? Are there fixes in .28 ? We still cannot take routing tables from 2 peers and sync them between 2 mikrotiks.

Thx,
Sam

Eugene · Tue Aug 01, 2006 9:20 am

There are numerous routing fixes in .28. Could you switch on those bgp peers?

advantz · Fri Aug 04, 2006 7:28 pm

.28??

2.9.28??
Where is it?

changeip · Fri Aug 04, 2006 9:17 pm

There are numerous routing fixes in .28. Could you switch on those bgp peers?

Eugene,

10.20.1.203 is AS65533, 10.20.1.204 is AS65534. I just got the second one up. It has the same routes as the first (which is okay) - I plan on making a script to insert/remove routes on both so they are acting like real peers.

Thx,
Sam

BelWave · Sat Aug 05, 2006 6:29 am

There are numerous routing fixes in .28. Could you switch on those bgp peers?

Hello Eugene,

Is there a target release date on v2.9.28?

Are there fewer routing problems with v2.9.26 - 25 - 24 - 23, etc, etc? I'd rather not stay on v2.9.27 if .26 or earlier has fewer routing problems.

Thanks,

Brad

Mon Aug 07, 2006 2:30 pm

today or tomorrow

karyal · Tue Aug 08, 2006 10:02 pm

2Sam: If these 2 routes are from different instances, they are not compared by BGP code (AS_PATH length does not matter).

mmm this explains a problem i have here... i've a full bgp setup, with confederations.
iBGP is on the whole network (around 500 nodes, of which around 50 run ibgp and form the network backbone).
I cannot "close" the network, or looping will always occur (which shouldn't with confederations).
Also, sometimes a couple of routes are lost, or are not correctly chosen..

I used different session for each peer, and this explains why ASPATH is ignored and thus loops occur.
The point is i had to use such setup, else i couldn't redistribute bgp routes learned between routers...
I understand now that there must be some error on my setup, but to be honest i cannot find an example on how this should be set on MT.
I can post anything of my current setup, i really need to solve the "looping" or losing routes problem.
MT is 2.9.23 to 2.9.27, routing test

Thanks,
Ricky

changeip · Tue Aug 08, 2006 10:31 pm

Multiple peer entries are okay, but you cannot use more than 1 instance. If you use more than 1 instance you end up with multiple views of the routing table that do not 'see' other instances ... it's a problem and should not be that way. BGP instances are there to allow different router IDs, redistribute settings, AS numbers, but not to separate the routing table.

There are some other bugs that will hopefully be fixed in .28 - I wouldn't change anything on your config until you try .28 and see what happens.

Sam

karyal · Wed Aug 09, 2006 12:13 am

Multiple peer entries are okay, but you cannot use more than 1 instance. If you use more than 1 instance you end up with multiple views of the routing table that do not 'see' other instances ... it's a problem and should not be that way. BGP instances are there to allow different router IDs, redistribute settings, AS numbers, but not to separate the routing table.

Ok, this is fine.. the point is : i use multiple instances because i seem to be unable to propagate bgp routes between routers without sessions.
i.e. if i have three routers A B and C
A routes propagate to B but not to C
B routes propagate to A and C
C routes propagate to B but not to A

I understand this is because the routers do not see the routes as reachable, and this is fine... with "usual" bgp stuff, like cisco or quagga i do solve this by using nexthop... with mt it seems to be ignored..
Is there any working example around? i did try to find one, but didn't get any.
Thanks,
Ricky

changeip · Wed Aug 09, 2006 12:43 am

with "usual" bgp stuff, like cisco o quagga i do solve this by using nexthop... with mt seems to be ignored..

You can 'filter' the routes to set-nexthop= just like in the cisco. If they are getting ignored it might be a config issue with the peer and its filter chain. We have quite a few chains that perform set-nexthop and they seem to work fine. If you need help getting it to work post some configs.

karyal · Wed Aug 09, 2006 12:53 am

You can 'filter' the routes to set-nexthop= just like in the cisco. If they are getting ignored it might be a config issue with the peer and its filter chain. We have quite a few chains that perform set-nexthop and they seem to work fine. If you need help getting it to work post some configs.
I know i can (i do that on a 2.9.27 mt that receives some routes from a quagga box).
But when i tried to do it with two MT it simply got ignored.
I'm setting up a test so that i can post the configuration, thanks..
Ricky

karyal · Wed Aug 09, 2006 1:35 am

Ok, this is what happens.. this is router A config


/ routing filter 
add chain=next invert-match=no action=passthrough set-nexthop=80.79.50.206 \
    comment="" disabled=no 
/ routing bgp instance 
set default name="default" as=65048 router-id=80.79.49.217 \
    redistribute-static=yes redistribute-connected=yes redistribute-rip=no \
    redistribute-ospf=no redistribute-other-bgp=yes out-filter="" \
    confederation=34695 confederation-peers=65000 \
    client-to-client-reflection=no comment="" disabled=no 
/ routing bgp peer 
add name="peer1" instance=default remote-address=80.79.50.206 remote-as=65000 \
    tcp-md5-key="" multihop=no route-reflect=no hold-time=3m ttl=1 \
    in-filter=next out-filter="" comment="" disabled=no

and this is router B

#
/ routing filter 
add chain=next invert-match=no action=passthrough set-nexthop=80.79.50.206 \
    comment="" disabled=no 
/ routing bgp instance 
set default name="default" as=65000 router-id=80.79.49.121 \
    redistribute-static=yes redistribute-connected=yes redistribute-rip=no \
    redistribute-ospf=no redistribute-other-bgp=yes out-filter="" \
    confederation=34695 confederation-peers=65001,65048,65000 \
    client-to-client-reflection=no comment="" disabled=no 
/ routing bgp peer 
add name="peer1" instance=default remote-address=80.79.50.121 remote-as=65001 \
    tcp-md5-key="" multihop=no route-reflect=no hold-time=3m ttl=1 \
    in-filter="" out-filter="" comment="" disabled=no 
add name="peer2" instance=default remote-address=80.79.50.205 remote-as=65048 \
    tcp-md5-key="" multihop=no route-reflect=no hold-time=3m ttl=1 \
    in-filter="" out-filter=next comment="" disabled=no

What i get is that router B has all the routes, but to router A are passed only the routes that are connected or static of B.
routing bgp advertisements print 80.79.50.205 on router B shows all the routes, with 80.79.50.121 (which is the p2p between router B and C)
BGP debugging on router A reports:

00:29:12 route,bgp,debug,packet
00:29:12 route,bgp,debug,packet PathAttributes
00:29:12 route,bgp,debug,packet nexthop= weight=0 address=80.79.50.121
00:29:12 route,bgp,debug,packet bgp-origin=2
00:29:12 route,bgp,debug,packet bgp-aspath=(65000,65001,65014,65015,65016)
00:29:12 route,bgp,debug,packet bgp-aspath-len=5
00:29:12 route,bgp,debug,packet bgp-nexthop=80.79.50.121
00:29:12 route,bgp,debug,packet bgp-localpref=100
00:29:12 route,bgp,debug,packet
00:29:12 route,bgp,debug,packet NLRI=88.213.150.136/29
00:29:12 route,bgp,debug,packet Invalid NEXTHOP, ignoring NLRI
00:29:15 route,bgp,debug,packet UPDATE Message
00:29:15 route,bgp,debug,packet RemoteAddr=80.79.50.206
00:29:15 route,bgp,debug,packet MessageLength=59

From router A connected routes are correctly passed to router B...
Bye,
Ricky

changeip · Wed Aug 09, 2006 2:16 am

Is "set-nexthop=80.79.50.206" supposed to be on both routers filters, or was that a copy/paste issue?

You want to set-nexthop as they come in, not as they go out. Maybe this is why they are getting the following:

packet nexthop=<missing> weight=0 address=80.79.50.121

karyal · Wed Aug 09, 2006 9:35 am

Is "set-nexthop=80.79.50.206" supposed to be on both routers filters, or was that a copy/paste issue?

You want to set-nexthop as they come in, not as they go out. Maybe this is why they are getting the following:

packet nexthop=<missing> weight=0 address=80.79.50.121

I set it on the incoming filter only usually.
I tried to set it on both when setting on the in filter only didn't work.
Enabling or disabling the out filter makes no difference =(
Bye,
Ricky

Eugene · Wed Aug 09, 2006 11:25 am

Multiple peer entries are okay, but you cannot use more than 1 instance. If you use more than 1 instance you end up with multiple views of the routing table that do not 'see' other instances ... it's a problem and should not be that way.
...
Sam

This behavior is normal; it is not a bug. AFAIK, zebra and cisco work the same way.

Routes from multiple BGP processes are compared by kernel code.

Eugene

changeip · Wed Aug 09, 2006 6:28 pm

Does the kernel code not look at AS path length? Just trying to figure out why two routes with the same properties, but one with a longer AS path, are not using AS path to determine which is better. Maybe I'm just thinking routing tables are further along than they are : )

Sam

eflanery · Wed Aug 09, 2006 6:47 pm

IIRC, the kernel tables are quite simple, and would not take into account BGP (or other routing protocol) specific information. MT's "/ip route" tables seem to show more than I would expect the kernel to understand though.

Perhaps the best way to get AS-path-length to matter from different BGP instances, would be to translate them into something the kernel does understand, such as distance.

Perhaps a "/routing filter" chain created like this:

:for x from 0 to 50 do={
     /routing filter add chain=length-filter bgp-as-path-length=$x \
      set-distance=($x + 200) action=passthrough
}

Then, have each of the various input chains "match-chain" against length-filter.

Just a guess, I haven't actually tried it.
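To see what that loop would expand to, here is a short Python sketch that generates the same rule lines as text, so they can be reviewed before pasting anything into a router. It only builds strings; the function name `length_filter_rules` is made up, the rule syntax is copied from the post above (which its author says is untested), and the loop bounds are assumed to be inclusive.

```python
def length_filter_rules(max_len=50, base_distance=200, chain="length-filter"):
    """Build one filter rule per AS path length, mapping length to distance.

    A path of length x gets distance x + base_distance, so a shorter AS path
    ends up with a lower (better) distance.
    """
    rules = []
    for x in range(max_len + 1):  # assumed inclusive: 0..max_len
        rules.append(
            f"/routing filter add chain={chain} bgp-as-path-length={x} "
            f"set-distance={x + base_distance} action=passthrough"
        )
    return rules


rules = length_filter_rules()
first_rule = rules[0]
last_rule = rules[-1]
```

With the values from the post, the generated distances run from 200 to 250, so a route with AS path length 1 would get distance 201 and beat one with length 2 at distance 202, regardless of which instance learned it.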

--Eric

karyal · Wed Aug 09, 2006 7:44 pm

Multiple peer entries are okay, but you cannot use more than 1 instance. If you use more than 1 instance you end up with multiple views of the routing table that do not 'see' other instances ... it's a problem and should not be that way.
...
Sam
This behavior is normal; it is not a bug. AFAIK, zebra and cisco work the same way.

Routes from multiple BGP processes are compared by kernel code.

Eugene

Personally i do not mind if it's a bug or not, i'd just like to have an example of a working iBGP configuration with more than three peers, using confederation instead of a full mesh (i can't scale on that, and of course if there's a full mesh all routes propagate).
Thanks,
Ricky

karyal · Thu Aug 10, 2006 9:06 pm

Seems 2.9.28 has the answer, with the force-nexthop option...
Bye,
Ricky

BelWave · Fri Aug 11, 2006 5:53 am

Seems 2.9.28 has the answer, with the force-nexthop option...
Bye,
Ricky

How is v2.9.28 working for BGP & OSPF? Any surprises? Are routes being added and removed properly via OSPF now?

Is the problem where a AS route is clearly displayed in the route table, but not working until disabled/enabled fixed?

Best,

Brad

karyal · Fri Aug 11, 2006 8:13 pm

Seems 2.9.28 has the answer, with the force-nexthop option...
Bye,
Ricky
How is v2.9.28 working for BGP & OSPF? Any surprises? Are routes being added and removed properly via OSPF now?

Is the problem where a AS route is clearly displayed in the route table, but not working until disabled/enabled fixed?

Best,

Brad

Can't tell yet... i'm still testing 2.9.28 before putting it in production..
No problems so far on three devices (i can't tell you anything on OSPF though)
Bye,
Ricky

karyal · Mon Aug 14, 2006 1:33 am

Can't tell yet... i'm still testing 2.9.28 before putting it in production..
No problems so far on three devices (i can't tell you anything on OSPF though)
Bye,
Ricky

After some testing i have upgraded some routers to 2.9.28
So far no problems, i'm finally able to have multiple paths to the two sides of my network.
The only problem i had was on one of the routers acting this way:

A --------B------- C
|                  |
----------D--------

When i enabled A and C peering to D the peering sessions started to disappear.. i.e. /routing bgp print started to show nothing, and things started to mess up..
If i removed either the A or the C peering all was fine.. i re-enabled both peerings, but filtered the prefixes (now i get around 30 prefixes from each side, instead of around 800).
It seems to be stable... i use a 32 meg RB532 ..
If it keeps running stable i'll try to increase the received prefixes...
Bye,
Ricky

changeip · Mon Aug 14, 2006 1:59 am

Maybe this happens because you are receiving a route that conflicts with a route to the other peer? If you learn a new route to your peer and accept it, possibly it can't stay connected (async routing ?) or something. I am thinking of creating a routing filter chain that includes all routes you do not want (your own) and then filtering them on the incoming side, like default routes... To test this, just set up a filter that marks all incoming routes with a routing-mark (or rejects the route?), then look in that table to see what it received and whether there are any routes that might cause problems.

Sam
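
As a rough illustration of the check suggested above, here is a minimal Python sketch (not RouterOS filter syntax) that flags received prefixes which are either a default route or overlap one of your own prefixes. The function name and all the prefixes are made up for the example; on a real router the list of received routes would have to come from the routing table itself.

```python
import ipaddress

DEFAULT = ipaddress.ip_network("0.0.0.0/0")

def suspicious_routes(received, own):
    """Return received prefixes that are a default route or overlap our own prefixes."""
    own_nets = [ipaddress.ip_network(p) for p in own]
    flagged = []
    for prefix in received:
        net = ipaddress.ip_network(prefix)
        if net == DEFAULT or any(net.overlaps(o) for o in own_nets):
            flagged.append(prefix)
    return flagged

# Hypothetical example data, not taken from any router in this thread.
received = ["0.0.0.0/0", "10.1.0.0/24", "10.2.0.0/24", "10.9.0.0/16"]
own = ["10.1.0.0/24"]
print(suspicious_routes(received, own))
```

Anything this returns would be a candidate for rejecting (or at least marking) on the incoming side while testing.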

karyal · Tue Aug 15, 2006 4:32 pm

Maybe this happens because you are receiving a route that is conflicting with a route to the other peer? If you learn a new route to your peer and accept it possibly it can't stay connected (async routing ?) or something.

I don't really get what you mean by "conflicting routes".
The routes received are the ones of the internal network (i have no external routes injected, except for the default route).
There are obviously duplicate routes received by the router, since you must think of this router as the "closure" of a ring.
What i expect is for the router to propagate these routes on the network, after it has added its AS to the AS path.
The problem anyway is not that it does not work this way, or that it "loses" routes.. the problem is that after a while, if you do a /routing bgp peer print, the configured peers disappear (and the subsequent problem is that the bgp sessions are lost).
After a while the peers reappear, then disappear again...

I am thinking of creating a routing filter chain that includes all the routes you do not want (your own, default routes and the like) and then filtering them on the incoming side. To test this, just set up a filter that marks all incoming routes with a routing-mark (or rejects the route?), then look in that table to see what it received and whether there are any routes that might cause problems.

Sam

Well actually i DO want my own routes.. the whole point is to have multiple paths within my network to avoid failures if a point of the network goes down..
And, btw, can anyone who has used it in 2.9.28 confirm a bug in route aggregation? I can aggregate a prefix, but summarization seems not to work.. the subnet keeps being announced even if summarization is turned on, instead of being suppressed..
Bye,
Ricky
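
To make the expected behaviour concrete: with summarization on, the more specific subnets covered by the aggregate should be suppressed and only the aggregate (plus unrelated prefixes) announced. A minimal Python sketch of that expectation follows; it is not how RouterOS implements it, and the function name and prefixes are hypothetical.

```python
import ipaddress

def announced_with_summary(aggregate, prefixes):
    """Announce the aggregate and suppress every prefix it covers."""
    agg = ipaddress.ip_network(aggregate)
    # Keep only prefixes that are NOT inside the aggregate.
    kept = [p for p in prefixes if not ipaddress.ip_network(p).subnet_of(agg)]
    return [aggregate] + kept

# Hypothetical prefixes for illustration only.
local = ["10.0.1.0/24", "10.0.2.0/24", "192.168.5.0/24"]
print(announced_with_summary("10.0.0.0/22", local))
```

The behaviour described in the post would correspond to the two 10.0.x.0/24 subnets still showing up in the announcements next to the /22.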

karyal · Tue Aug 15, 2006 8:03 pm

Ok.. it's getting a little bit more annoying..
I have at least one other rb532 around that is having problems with 2.9.28 and two peers (in this case the network is NOT ring-closed)

BGP session went down, i mac-telnetted to the device, and /routing bgp peer print...
it showed nothing... another /routing bgp peer print and the peers reappeared, and the bgp sessions came up again...

Bye,
Ricky

changeip · Tue Aug 15, 2006 8:06 pm

what does the memory situation on that 532 look like? Maybe its running low on mem and losing its brain? I've seen that happen even with 5-10mb free ...

Sam

Mapik · Tue Aug 15, 2006 8:42 pm

We upgraded our networks to 2.9.28 (about 110 routerboards) without problems. But now we see that the ospf-out chain is not working. All routers have filters for ospf-in and ospf-out. I installed a new one today and forgot to set the filter. I was shocked when I saw the routing table: the ospf-out chain is not working at all. It worked well for us before the upgrade... Can anyone help please ???

karyal · Tue Aug 15, 2006 11:55 pm

what does the memory situation on that 532 look like? Maybe its running low on mem and losing its brain? I've seen that happen even with 5-10mb free ...

Sam

I hardly think it's a memory matter.. really, it hangs this way even with just 70 or so prefixes (which is MUCH less than it usually handled).
I thought it was a problem of handling the "closure" of the ring, but now that it's randomly happening even on routerboards where ROS from 2.9.23 to 2.9.26 has not shown any problem (more or less), it is really making me think of a bug.
I'm doing some tests and it MAY be related to a couple of things:
The problem seems NOT to appear if the peers use an instance that is NOT the default one
The problem seems to appear if the router ID is set and is an ip assigned to a point to point wireless link
Bye,
Ricky

karyal · Wed Aug 16, 2006 11:43 am

I'm doing some tests and it MAY be related to a couple of things:
The problem seems NOT to appear if the peers use an instance that is NOT the default one
The problem seems to appear if the router ID is set and is an ip assigned to a point to point wireless link
Bye,
Ricky

It seems not =( now, of the two boxes that had the problem, one no longer has it, the second one still does..
i've also noticed that instances disappear too (and come back after the second /ip bgp instance print)

bye,
Ricky

karyal · Wed Aug 16, 2006 3:23 pm

After some testing and turning on bgp debug logging, i may have found the reason for the crashes: route aggregation..
so far the "crashing" session has been up for two hours, while before it used to keep working for around 30 minutes.
I noticed by debugging that the last message before disconnecting was related to the announcement of a route aggregated by the device.
I turned off the aggregation (which btw does not seem to work correctly, since it does not summarize) and so far (cross my fingers =) it's up without problems.
This is also consistent with the fact that the problem does not show up on other devices on the network (the others do not aggregate).
I hope it's the one.. now, what should i do with the MT guys? send a supout file to support?
Thanks,
Ricky

BelWave · Wed Aug 23, 2006 7:23 pm

We upgraded our networks to 2.9.28 (about 110 routerboards) without problems. But now we see that the ospf-out chain is not working. All routers have filters for ospf-in and ospf-out. I installed a new one today and forgot to set the filter. I was shocked when I saw the routing table: the ospf-out chain is not working at all. It worked well for us before the upgrade... Can anyone help please ???

So, has anyone upgraded to v2.9.29 yet? Who is willing to test the waters first? <grin>

I see the change log is detailed as ever...<sigh> Who writes these entries for MT and why aren't they dated?

What does "fixed BGP attributes for static routes in routing-test" mean? Is this related to the static route disappearing act we are seeing with v2.9.27?

Thanks,

Brad

karyal · Wed Aug 23, 2006 7:34 pm

So, has anyone upgraded to v2.9.29 yet? Who is willing to test the waters first? <grin>

done... i have it on a couple of MT since yesterday.. seems no problems so far, i can tell you for sure the memory leak is not there anymore

Bye,
Ricky

karyal · Tue Sep 05, 2006 3:19 pm

So, has anyone upgraded to v2.9.29 yet? Who is willing to test the waters first? <grin>

done... i have it on a couple of MT since yesterday.. seems no problems so far, i can tell you for sure the memory leak is not there anymore

Bye,
Ricky

After some testing, 2.9.29 still gave me a bad week.
Peers and configurations still disappear quite randomly.. yesterday half of the network kept losing its configuration and rebooting for about one hour.
No memory leak anyway, but i guess that something in the software + configuration causes some problem.. finding out what seems quite complicated.. so far the only clue i've found is that things seem to have calmed down since we "broke" the network ring.
I have no automatic failover, but at least the network keeps running.

spirosco · Wed Sep 06, 2006 12:49 am

Unfortunately rt is still very unstable.
I have installed it on about 10 routers and have been watching it for more than a week.
Almost every time a router loses a wireless link with one of its neighbors, when the link comes back and the bgp session is re-established, i notice that one of the routers does not advertise its own prefix to the other one.

For example my router owns prefix 10.17.119.0/24.
One of my neighbors (5ghz link) owns prefix 10.87.183.0/24.

After a bgp disable/enable on my router, i stopped receiving the prefix 10.87.183.0/24 from this neighbor
and started receiving it from another neighbor via an alternative path of 3 or 4 hops.
Here it is just after disabling/enabling bgp on my router:

root@ns:~# tracepath 10.87.183.129
 1:  ns.spirosco.awmn (10.17.119.130)                       0.400ms pmtu 1500
 1:  ns2.spirosco.awmn (10.17.119.129)                      0.612ms
 2:  gw-spirosco.sw1hfq.awmn (10.17.119.198)                1.182ms
 3:  gw-sw1hfq.viper7gr.awmn (10.17.127.98)                 4.142ms
 4:  gw-tenorism.vlsi.awmn (10.17.122.173)                asymm  3   5.100ms
 5:  ns2.tenorism.awmn (10.87.183.129)                    asymm  2   4.621ms reached
     Resume: pmtu 1500 hops 5 back 2

And this is the correct trace, just after i disabled/enabled the problematic bgp session:

root@ns:~# tracepath 10.87.183.129
 1:  ns.spirosco.awmn (10.17.119.130)                       0.276ms pmtu 1500
 1:  ns2.spirosco.awmn (10.17.119.129)                      0.657ms
 2:  ns2.tenorism.awmn (10.87.183.129)                      1.094ms reached
     Resume: pmtu 1500 hops 2 back 2

It's really frustrating because it seems that all other prefixes are exchanged normally
except for those advertised by the router itself.
The main result of this is a really crazy asymmetric network as you can see in the trace
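
The asymmetry is visible in the "Resume" line of each trace: the first reports "hops 5 back 2", the second "hops 2 back 2". A small Python sketch for spotting this automatically is below. It assumes the output format shown in the traces above; the function names are made up for the example, and keep in mind that the "back" value is only tracepath's estimate of the return path length.

```python
import re

def parse_resume(line):
    """Extract (forward hops, estimated return hops) from a tracepath 'Resume' line."""
    m = re.search(r"hops\s+(\d+)\s+back\s+(\d+)", line)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))

def is_asymmetric(line):
    """True when the forward and estimated return hop counts differ."""
    hops = parse_resume(line)
    return hops is not None and hops[0] != hops[1]

# The two Resume lines from the traces above.
bad = "     Resume: pmtu 1500 hops 5 back 2"
good = "     Resume: pmtu 1500 hops 2 back 2"
print(is_asymmetric(bad), is_asymmetric(good))
```

Run against periodic tracepath output, something like this could flag the moment a directly connected neighbor starts being reached via the long path.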

Is it a problem with bgp network command maybe? (I'm not using redistribute static from 2.9.28 and above).

Hey guys, i guess you have a lot of work to do

Next time it happens, i will send a supout.

Beccara · Wed Sep 06, 2006 4:01 am

What version are you using?

.28 and .29 have had me up all night: did an upgrade run and it broke our iBGP setup, moved to OSPF which was stable for about 20 hours until it stopped updating routes, downgraded to .27 and iBGP is fine again

spirosco · Wed Sep 06, 2006 12:16 pm

I'm using .29 on all routers.
I haven't managed to check .28 as thoroughly, so i'm not sure if the same problem was there too.
I also tested redistribute static vs the bgp network command, but it was the same.

spirosco · Wed Sep 06, 2006 8:18 pm

Well, i caught a bgp crash on my router just a few moments ago.
After the bgp crash guess what... my router stopped sending my prefix to almost all of the bgp neighbors.
Only by disabling/enabling each of my bgp peer sessions was the problem solved, temporarily i think.
All other bgp prefixes were exchanged normally.
I have already sent the precious supout. Please fix it guys

blabla · Thu Sep 07, 2006 10:11 am

Ospf in routing test in .28 and .29 is very broken.
Also in the, let's say, stable .27, ospf uses much more cpu than with the regular routing package, so much more that an RB112 can't handle ospf (30-40 routers, 1000 /32 routes) with the test package.

Thu Sep 07, 2006 10:14 am

Ospf in routing test in .28 and .29 is very broken.
Also in the, let's say, stable .27, ospf uses much more cpu than with the regular routing package, so much more that an RB112 can't handle ospf (30-40 routers, 1000 /32 routes) with the test package.

please send the support output file from the router to support@mikrotik.com

blabla · Thu Sep 07, 2006 11:28 am

Maybe you just want to have access to an rb112 with the regular package and without?

I think the .28 and .29 ones were sent to support, but I'll check.

karyal · Thu Sep 07, 2006 11:42 am

Well, i caught a bgp crash on my router just a few moments ago.
After the bgp crash guess what... my router stopped sending my prefix to almost all of the bgp neighbors.
Only by disabling/enabling each of my bgp peer sessions was the problem solved, temporarily i think.
All other bgp prefixes were exchanged normally.
I have already sent the precious supout. Please fix it guys

Hi, next time you have the crash (i hope not =) can you please check if the problem is similar to mine?
With the bgp crashed, do a /routing bgp peer print and see what happens.
In my case i see no peers, as if they were not configured at all.
If you do a /routing bgp peer print again the peers come back in place and all comes back up.
I've patched the situation at this time by using a scheduled script, it's very simple but seems to be effective.
If you want i can send it to you.

Bye,
Ricky

spirosco · Thu Sep 07, 2006 1:55 pm

karyal, no, visually everything looks fine: bgp states are established, and routes are exchanged normally (as far as i can tell).
There's no visible warning of what is happening, unless you look in Files for an autosupout.rif.

That's my case, where all the routers are classic PCs with a minimum of 256MB ram and about 400 bgp prefixes.

karyal · Thu Sep 07, 2006 2:40 pm

karyal, no, visually everything looks fine: bgp states are established, and routes are exchanged normally (as far as i can tell).
There's no visible warning of what is happening, unless you look in Files for an autosupout.rif.

That's my case, where all the routers are classic PCs with a minimum of 256MB ram and about 400 bgp prefixes.

I see.. my network is around 750/800 prefixes on RB532
As i understand you're redistributing routes internally via ibgp.
are you using full mesh or confederations?
Thanks,
Ricky

spirosco · Thu Sep 07, 2006 3:35 pm

You can see our network physical topology here: http://nagios.awmn.net/cgi-bin/statusma ... factor=1.0
Username/Password: awmn/awmn

We are using mainly PCs. Every network node has its own AS and they speak eBGP only with each other.
There are some nodes with 2 or 3 routers that run iBGP with OSPF, or that do all the routing in a linux/quagga pc
and use Mikrotik only for the wireless part (bridge).

nabuk · Fri Sep 08, 2006 2:49 pm

Upgraded to 2.9.30, routing filter for ospf-out still doesn't work !!!

karyal · Sun Sep 10, 2006 8:42 pm

You can see our network physical topology here: http://nagios.awmn.net/cgi-bin/statusma ... factor=1.0
Username/Password: awmn/awmn

We are using mainly PCs. Every network node has its own AS and they speak eBGP only with each other.
There are some nodes with 2 or 3 routers that run iBGP with OSPF, or that do all the routing in a linux/quagga pc
and use Mikrotik only for the wireless part (bridge).

Thanks.. i would like to ask you a couple of things: there are some situations in your network that i'm unable to get working as they should on mine, and i would like to understand where i'm going wrong (or whether i can solve it with a slight change to my setup, like adding a quagga box on the internal network too)...
Thanks,
Ricky
