This was happening to us every 1-12 hours, extremely disruptive to the network.
We have this problem too, but for us it happens every 30-90 days or so. It last happened 57 days ago. We have a ping watchdog to reboot the router when this happens. Disabling and re-enabling the interface might fix it too. Same CCR1036-8G-2S+, first generation. We have two CCRs connected to each other; one is a PPPoE concentrator, the other is not. The one that is not a PPPoE concentrator has no issues. Both run MPLS and OSPF.
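For anyone wanting to try the "ping watchdog" idea, here is a rough sketch using Netwatch. The host 192.0.2.1, the interface name sfp-sfpplus1, and the timing values are placeholders I made up, not values from the post above. It bounces the interface rather than rebooting, and whether a bounce really clears the hang is unconfirmed.

```
# Sketch only: probe a host that is reached through the affected port.
# 192.0.2.1 and sfp-sfpplus1 are placeholder values, adjust to your setup.
/tool netwatch add host=192.0.2.1 interval=30s timeout=2s \
    comment="tx-hang watchdog (sketch)" \
    down-script="/interface ethernet disable sfp-sfpplus1; :delay 2s; /interface ethernet enable sfp-sfpplus1"
# If a bounce is not enough, the down-script could call /system reboot instead.
```

Keep in mind the script fires whenever the probe host stops answering for any reason, so pick a host that is only reachable through the port in question.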
We were soon going to be replacing the device with a CCR1072.
After changing Ports 1-4 to 5-8 on CCR1072 we never saw this problem again. Everything running smooth.
That's not good to hear it still occurs......
I haven't touched the network topology and have been considering changing it all back to how it logically should be, but if this is still happening today then no chance..... this is hugely service impacting.
Think I lost 5 years of my life last time, not game to try again easily.
Is it Hardware Rev 1 or 2? What SFP+ modules are you using and what's the signal? Maybe a temperature issue?
Does RouterOS have an update for the problem? I'm on 6.48.4 with a CCR1036 and the problem happens approximately every 30 days. The router works with OSPF and EoIP tunnels.
That's what Mikrotik told .... 3 years and still counting
I got an answer from Mikrotik about the issue with traffic stopping on interfaces.
"It does not look like a hardware-related issue. Seems some similar issues have been reproduced in our labs, when suddenly Tx traffic stopped on a physical interface and it is related to L2MTU handling on the device, we will try to improve this in further RouterOS versions, but at moment we cannot say any ETA.
In our tests, it seems like work around helps if you simply increase the maximum L2MTU on some interface (it can be even an unused interface) and then restore it to a default value. For example, try to enter these commands:
/interface ethernet set sfp-sfpplus8 l2mtu=10222
/interface ethernet set sfp-sfpplus8 l2mtu=1580
It will create a short link down on all interfaces and after this procedure this issue should not appear more.
If you reboot or upgrade your router, then you should follow the same procedure again until we include a improvements in further RouterOS versions.
Please share your feedback if this stops the interface hang.
"
Maybe this could solve the issue temporarily.
I'll try that if it happens again.
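Since the workaround reportedly has to be repeated after every reboot or upgrade, it could probably be automated with a startup scheduler entry. This is only a sketch: the entry name and the delay values are my own assumptions, and sfp-sfpplus8 with the values 10222 and 1580 are taken from the support reply quoted above.

```
# Sketch only: reapply the L2MTU toggle from the support reply after each boot.
# The name "l2mtu-workaround" and both delays are assumptions, not vendor advice.
/system scheduler add name=l2mtu-workaround start-time=startup \
    on-event=":delay 30s; /interface ethernet set sfp-sfpplus8 l2mtu=10222; :delay 5s; /interface ethernet set sfp-sfpplus8 l2mtu=1580"
```

According to the support reply, the toggle causes a short link down on all interfaces, so expect a brief interruption shortly after each boot.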
If I am reading your post correctly, the advice from MikroTik was to change the L2MTU to 10222, and then immediately change it back again to 1580? This is effectively the same thing as disabling/enabling the interface and would clear the problem.
What I did was set the L2MTU to 10222 and then leave it that way. So far, after nearly 3 weeks, the problem has not happened again, whereas it used to occur within 2 minutes of putting traffic on the link.
I think the point is that just changing the value of the L2MTU makes the software "do things", and it will work until the next reboot. The actual value doesn't matter.
Do you mind telling what features you are using on your network and if it is 100% MikroTik? Mine is mostly MikroTik but with some Juniper acting as a core/P router participating in OSPF/BFD/LDP but not BGP or VPLS.
There isn't really any valid reason I can think of not to max out the L2/bridging MTU on your network. You are only causing yourself potential problems later (see above).
I was always wondering what the benefit is of setting L2MTU to some "tight" value instead of leaving it at the maximum value supported by the hardware.
Riddle me this, then. If the value is not important, why does mine always trigger again within 30-120 seconds at 1580, but when setting the max value I am now at 3 weeks and counting of it not happening? I think there is more to it than just freshening up the memory or whatever you suggest might be happening when flipping the MTU back and forth.