CCR1072 watchdog reboot

cinatus · Mon Jun 12, 2017 7:08 pm

We have several CCR1072s in our core, and in the last 2 days we have had 2 watchdog reboots: one on a router running 6.38.5 and the other on a router running 6.39.2. What could be causing this, and what should I do to prevent it in the future?

berlo · Mon Jun 12, 2017 10:08 pm

We fought the same reboots too. We're talking with support and they suspect a hardware failure, but I can always reproduce the reboots under the same conditions:

- If you use traffic-flow with selected interfaces (anything other than interfaces: all), we get a reboot every 3-4 hours.

- If you overclock the CPU to 1200 MHz, it reboots under some conditions.

This is what I did:
- Forced the firmware upgrade (it was already on version 3.33, but I forced the reinstall). Reverted the CPU from 1200 to 1000 MHz. Set interfaces to all in the traffic-flow configuration, with these values:

> /ip traffic-flow print
enabled: yes
interfaces: all
cache-entries: 2M
active-flow-timeout: 2m
inactive-flow-timeout: 1m

With these changes I have not experienced any more reboots (I am still monitoring, as only two days have passed, but before that the reboots were more frequent).
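
For reference, a minimal sketch of how the settings above might be applied from the RouterOS v6 command line. The traffic-flow values are the ones from the print output; the cpu-frequency property under /system routerboard settings is an assumption for this board, and a frequency change may only take effect after a reboot.

```
# Sketch only, RouterOS v6 syntax assumed; check each property on your version first
# Return the CPU to its stock clock (assumed property name for this board)
/system routerboard settings set cpu-frequency=1000MHz
# Export flows for all interfaces instead of a selected list
/ip traffic-flow set enabled=yes interfaces=all cache-entries=2M active-flow-timeout=2m inactive-flow-timeout=1m
# Confirm the result
/ip traffic-flow print
```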

If you can, attach a serial console and keep it open. This is what I saw after the reboots:

MikroTik Login: (0,0) hv_warning: L2$ correctable data ECC error at PA 0xf8a8ff30
(0,0) hv_panic: got processor error: PC 0xffff_fff7_0051_e7c0, ICS/PL 0x6
(0,0) SBOX_ERROR: 0x0000_0000_0000_0000
(0,0) MEM_ERROR_CBOX_ADDR: 0x0000_0000_f8a8_fd78
(0,0) MEM_ERROR_CBOX_STATUS: 0x0000_0000_001c_0405
(0,0) L2 data ram 2-bit error detected.
(0,0) MEM_ERROR_MBOX_ADDR: 0x0000_0000_0000_0000
(0,0) MEM_ERROR_MBOX_STATUS: 0x0000_0000_0000_0000
(0,0) XDN_DEMUX_ERROR: 0x0000_0000_0000_0000

cinatus · Wed Jun 14, 2017 6:38 pm

Thank you very much for the input. I will see what that does in the next maintenance window.

Murmaider · Thu Jun 29, 2017 11:32 pm

I too am experiencing this.

We do however have our units overclocked to 1200Mhz... I wonder if that may be the issue.

berlo · Thu Jun 29, 2017 11:49 pm

Yes, we have gone back down to 1000 MHz and have had no more unexpected reboots.

Murmaider · Fri Jun 30, 2017 9:17 am

I'm going to give this a try, thanks a lot.

LynxChaus · Tue Jul 04, 2017 2:47 pm

Same symptom as I reported here. Check PSU.

berlo · Tue Jul 04, 2017 2:58 pm

The 1036 has a single PSU; the 1072 has redundant ones. Also, we experienced the same issue on 3 different CCRs.

This is a different case.

rvzweb · Thu Aug 17, 2017 9:42 pm

I have the same issue on CCR 1072 - cpu is 1000 MHz - fw is 6.40.1
Any ideas?

berlo · Thu Aug 17, 2017 11:19 pm

Me too, on a lab router. With the watchdog disabled we see the CPU go to 100% due to the networking process, and the router becomes unusable.

Downgrading to 6.40 fixes the issue.

pamafer · Wed Nov 01, 2017 3:09 pm

I have the same problem: a CCR1072 with 6.40.3. I only use it as a BGP router and do not use IP Traffic; the peak load never exceeds 6% across all CPUs and the bandwidth never exceeds 1 Gbps. Even so, I suffer spontaneous reboots at any time: days can pass without a problem, and then suddenly the watchdog reboots the system. Any ideas?!

nocmonkee · Thu Nov 02, 2017 9:17 pm

We started experiencing these watchdog reboots on our CCR1072s running 6.40.1. Is this a reported bug? Does downgrading to 6.40.0 really stabilize it? Is it fixed in newer versions?

nov/02/2017 05:13:15 system,error,critical router was rebooted without proper shutdown, probably kernel failure
nov/02/2017 05:13:15 system,error,critical kernel failure in previous boot
nov/02/2017 05:13:15 system,error,critical router was rebooted without proper shutdown, probably kernel failure

berlo · Thu Nov 02, 2017 9:22 pm

Hi,
at the moment we have 21 CCR1072s on 6.41rc44, all up for 17 days without issue. We do BGP + filtering + OSPF, nothing else. Try upgrading to this release; if you still experience reboots, you can exclude these services as the cause.

We experienced reboots with the CPU raised to 1200 MHz; at 1000 MHz we never experienced them.

nocmonkee · Thu Nov 02, 2017 9:57 pm

We currently are not running any dynamic routing protocols. The purpose of our CCRs is NAT. They also manage dhcpd and upnpd with lacp, multiple rfc1918 vlans 20G in and one vlan 20G out.

pamafer · Fri Nov 03, 2017 12:34 pm

Hi berlo, likewise, when we had the CPU at 1200 MHz we suffered much more frequent reboots. But now we have two 1072s at 1000 MHz and only one of them reboots, and it is the router with the lower workload! So it is a contradiction!

berlo · Fri Nov 03, 2017 1:17 pm

Have you tried disabling the watchdog and keeping the serial console open?

You should see the error.

pamafer · Fri Nov 03, 2017 1:48 pm

I have only tried disabling the watchdog, not the console. I'll try it and report back! Thanks

berlo · Fri Nov 03, 2017 1:54 pm

You need to keep the console open, because if it is a kernel panic, a memory error or similar, you can't see the error message otherwise.

sakirozkan · Sat Nov 04, 2017 7:39 am

Is it still working without any issue? What is the uptime of the CCR1072s now?

berlo · Sat Nov 04, 2017 1:29 pm

Yes, and the number of CCRs has now risen to 28 across Europe. All are working fine and we have had no more random reboots. We also saw better performance on routers with > 1 million routes installed once we disabled the route cache. You lose some CPU, about 10% more, but you will not experience packet loss or stopped forwarding when the router forwards > 2 million pps.

sakirozkan · Mon Nov 20, 2017 6:10 pm

Is there any development on the watchdog reboot issue? Which version are you using as a solution? Our 1072 reboots every 2 weeks.

mac86 · Wed Nov 29, 2017 11:55 pm

Is there any update on this thread?

I have a CCR1072 and get both watchdog reboots and kernel panic reboots.

This is the watchdog reboot:

nov/29/2017 10:35:40 system,error,critical router was rebooted without proper shutdown by watchdog timer

And this is the kernel failure reboot:

nov/28/2017 10:33:45 system,error,critical router was rebooted without proper shutdown, probably kernel failure
nov/28/2017 10:33:45 system,error,critical kernel failure in previous boot

Does downgrading to 6.40 really fix it?
Is it a hardware problem?

In the meantime, I've checked that the RouterBOARD clock is at 1000 MHz, disabled the watchdog, and disabled traffic flow too.

thank you.

msbr · Thu Nov 30, 2017 1:26 am

Hi
I have the same problem with a CCR1019,
so I disabled the watchdog.

Murmaider · Mon Dec 04, 2017 1:25 pm

berlo wrote:
Yes and now ccr was raised to 28 in all Europe. All are working fine and we never experienced more random reboots. Also we experienced better performance on routes with > 1kk routes installed disabled route cache. You loose some % CPU, about 10% more, but you will not experiencing packetloss/stop forwarding when router will forward > 2mil pps

If I am reading this right, is fastpath only reducing the CPU by 10%?
Is the benefit of the disabled route cache better than using fastpath?

berlo · Mon Dec 04, 2017 1:32 pm

CPU usage with fastpath is always lower, so under high normal traffic fastpath is still the only solution.

But if the CCR can handle the traffic without fastpath, you can keep the route cache disabled; that helps under DDoS, where you would otherwise see packet forwarding stop.

Murmaider · Wed Jan 24, 2018 6:11 pm

@berlo - did you need to reboot after you disabled route cache?

berlo · Wed Jan 24, 2018 9:37 pm

No, it is on the fly. All changes can be made without a reboot. The only issue is that the dummy rules are not removed automatically; you need to reboot to reactivate fast path.

Thu Jan 25, 2018 7:04 am

If you have a router which reboots itself, then conversation in forum will only be a guessing game. If you want to find out for sure what is the problem, then send supout file (generated after reboot) to support@mikrotik.com. We are the only ones who can tell what the problem was. Tracing the cause of the issue might take a while, but still it is the best solution how to get rid of the problem.

In case of Watchdog reboots - they are caused by software. Basically the router says to itself "/system reboot" at the point when it becomes inaccessible. In order to trace the issue you have to disable Watchdog; the router will then either reboot or get stuck. In both cases, after the reboot (whether the router rebooted itself or you had to power cycle it), generate a new supout file and again send it to support@mikrotik.com

buset1974 · Sat Jan 27, 2018 7:53 pm

I also have the same problem.

MikroTik made the same request to us: disable the watchdog and try to send a supout file when the problem happens. It makes sense, but we have to accept the risk that the device won't reboot and will stay frozen until we cycle the power, which would cause a very long downtime.

I just realized how many people are having the same problem as me.

We have already replaced the device with the same model twice and both showed the symptom, so I guess it is a software problem.
We operate many CCR1072s, but only the one device with this particular configuration has problems.
The one with the problem runs full internet BGP routes for both IPv4 and IPv6, with the CPU at 1000 MHz.

We have other CCR1072s running even more complicated configurations, like MPLS with OSPF + BGP, and they all run fine.

thx

buset1974 · Sun Jan 28, 2018 4:20 am

hi @berlo,

sorry, I have only just read your thread closely. I am using version 6.41 (stable) but still get random reboots.
So the DDoS packets caused the problem? I also have a suspicion about DDoS, because we had a similar problem around 1 year ago when one of our sites was under DDoS. It was not big, just around 50-100 Mbps, but it caused the router to keep rebooting. At that time MikroTik gave us the solution of changing all the interface queue types from "only-hardware-queue" to default or default-small, and that solved it; a few months later I changed the queues to "only-hardware-queue" again.

So if I'm not wrong, your suggestion is to disable the route cache, right?
What is the route cache anyway? If I disable it, will there be any other impact on performance? The router carries about 300,000 (BGP) routes.

thx

buset1974 · Sun Jan 28, 2018 4:29 am

Hi berlo,

if I try this (serial console attached and watchdog disabled), can I still reboot the router when it freezes, after seeing the error, and capture the supout file?
The site is not staffed 24 hours a day,
but I can arrange a PC with a separate link, so that when it happens I can still access the PC remotely.

thx

berlo · Mon Jan 29, 2018 5:07 pm

Hi,
the route cache should not cause reboots, only a stop in packet forwarding. If your device has enough performance to forward traffic in the slow path, you can try disabling the cache. You will see CPU usage increase.

If you get a kernel panic you need to hard reboot the router, so you need a managed PDU or someone to power cycle it.
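
A minimal sketch of the route cache change discussed here, assuming RouterOS v6, where the setting is exposed under /ip settings. It is only an illustration of the steps, not a recommendation for every router.

```
# Sketch only (RouterOS v6 assumed): disable the route cache on the fly
/ip settings set route-cache=no
# Confirm the setting
/ip settings print
# Watch CPU load afterwards, since forwarding moves to the slow path
/system resource print
```

Setting route-cache=yes again reverts the change.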

buset1974 · Mon Jan 29, 2018 6:12 pm

hi berlo,

yes, I've disabled the cache and it's still rebooting.
Tonight I will enable the cache again and try changing all interface queues to ethernet-default.

thx
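
A rough sketch of the queue type change mentioned in this post, assuming RouterOS v6. The built-in queue type is named ethernet-default, and the interface name sfp-sfpplus1 is only a placeholder for whichever port is in use.

```
# Sketch only: show the queue type currently assigned to each interface
/queue interface print
# Switch one interface from only-hardware-queue to a software queue
# (sfp-sfpplus1 is a placeholder interface name)
/queue interface set sfp-sfpplus1 queue=ethernet-default
```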

BMena · Tue Jan 30, 2018 1:47 pm

I was also having my CCR 1072 rebooting, but mine was "without proper shutdown" instead of "by watchdog".
I sent an e-mail to support@mikrotik and they said they think the hardware is faulty..

But then another CCR 1072 I have, about 300km away, showed the same problem.

My solution for both was disconnecting one power supply.

Before unplugging, I noticed under system health that while it was reading a voltage of 12.1 V on both PSU1 and PSU2, only one was outputting current.

I still can't test the supposedly bad PSU I removed because I don't have a spare 1072 atm.

I have a third 1072 working fine where both PSUs share the output current.
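
The readings described above come from the system health menu. A minimal way to check them from the command line is sketched below; the exact field names depend on the board and are not assumed here.

```
# Sketch only: print the health sensors and compare the two PSU readings
/system health print
```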

buset1974 · Wed Jan 31, 2018 5:36 pm

I enabled the cache again and changed all interface queues to ethernet-default for around 2 days, and the device is still rebooting randomly.
So many people are having a similar problem and MikroTik still does not have any clue.
The CCR1072 is MikroTik's most expensive RB and their premium router, so they should take some action on this issue.

thx

buset1974 · Thu Mar 08, 2018 1:17 pm

Well, is it solved now, by disconnecting one PSU?

I got the same answer: they said it is a hardware problem. I told them we have tried 3 routers and all have the same issue.
So if it's a hardware issue, that means all of my routers have to be RMA'd, along with hundreds of thousands of other routers all over the world. That would make it a factory defect, and they have to fix it.

thx

mrz · Thu Mar 08, 2018 1:31 pm

Upgrade to v6.41.2

Then upgrade bootloader to v6.41.2
/system routerboard upgrade

And reboot twice.

This should fix kernel crashes that were previously thought to be hardware failures. It could also fix other abnormal router behavior.
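
The sequence above could look roughly like this on the command line. This is a sketch assuming RouterOS v6 and an internet-connected router; the package update commands are one possible way to reach the target version and are not part of the original advice.

```
# Sketch only: check for and install the newer RouterOS release (the router reboots)
/system package update check-for-updates
/system package update install
# After the reboot, compare current-firmware with upgrade-firmware
/system routerboard print
# Upgrade the bootloader (RouterBOOT) to the version shipped with RouterOS
/system routerboard upgrade
# Reboot to load the new bootloader; the advice above is to reboot twice
/system reboot
```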

buset1974 · Thu Mar 08, 2018 1:43 pm

Right now I'm using 6.41.2. I've tried all the firmware from 6.41.2, and MikroTik also gave me 3.42.5 and 6.99,
but the problem still exists.
I've sent all the supout files from all of the firmware versions.

Btw, are you Maris from MikroTik?

thx

BMena · Sat Mar 24, 2018 2:48 pm

Could you find a fix? After 20 days my CCR rebooted out of the blue again...
So it becoming stable after removing only 1 PSU was just coincidence. Put the PSU back in place, better to have it rebooting than risking it stopping.

mac86 · Sat Mar 24, 2018 8:27 pm

Have you upgraded the RouterBOARD firmware as well, with the latest RouterOS version?
(not just RouterOS)

/system routerboard upgrade

doush · Sat Mar 24, 2018 11:27 pm

We will deploy a CCR1072 soon on our NOC.
Does all CCR1072 s have the same problem ?

Should we postpone the deployment ?

BMena · Tue Apr 10, 2018 3:04 pm

mac86 wrote:
have you upgraded your hardware routerboard, with last RouterOs version?
(not just router OS)
/system routerboard upgrade

Yup, I did.

doush wrote:
We will deploy a CCR1072 soon on our NOC.
Does all CCR1072 s have the same problem ?
Should we postpone the deployment ?

Unless you're in real need. I have three, one each city, the reboots aren't awfully frequent but they happen.
The one I have that doesn't reboot has a bug with the ping tool; sometimes it stops working entirely for some hours.

Wibernet · Tue Apr 10, 2018 4:40 pm

We are having this exact same issue, our CCR1072 randomly reboots.

Did the firmware upgrade fix this issue?

I must say I cannot tell people strongly enough to avoid this device. Since installing it we have had major issues with latency spikes until we removed all NAT (which worked perfectly on the CCR1036), and we have had these random reboots disconnecting all of our clients for 10 minutes at a time. Nightmare.

doush · Tue Apr 10, 2018 10:10 pm

OMG !
We were thinking about replacing our core 1036 with 1072 but it seems that there are major issues with it.

sakirozkan · Mon Apr 23, 2018 6:08 pm

Is there any development with the new versions? I use 6.38.7 for this reason.
We want to upgrade the 1072 because of the "Vulnerability exploiting the Winbox port". Does anyone use version 6.42 on a 1072 without the watchdog reboot error?

heddita · Tue May 08, 2018 12:24 am

We have this problem happening as well. It got worse recently so I came across this thread. It was previously on version 6.38.5, then up to 40.1 and now it's at 42.1. I just changed the CPU speed to 1000 MHz. I don't know why it comes at 1200 MHz if that causes issues. So far it's been up for 1:40, so not a lot of time has gone by... We've had about 5 unexpected reboots today. Two of them caused the router to just hang completely. 1036's only have 2 10g ports and I need 3 ;_; I'm going to move the NAT to another box if it reboots itself again. This router literally only has OSPF running with a few vlans + NAT. At the time it rebooted, it was early morning so the traffic was low, around 300 Mbps... It's not like it's using its full force, is what I mean. Well, I lit a candle, we'll see how it goes.

berlo · Tue May 08, 2018 12:46 am

BMena wrote:
The one I have that doesn't reboot have a bug with the ping tool, sometimes it stops working at all for some hours.

Disable the route cache; that fixed it for me.

ElviN · Tue May 15, 2018 1:34 pm

Hi!
I've read the topic... I'm shocked...
Frankly, I laughed when I read this: "We are the only ones who can tell what the problem was".
We have been corresponding with support for almost a year now about arbitrary reboots of the device... We have sent many supout.rif files. And what does support come back with? "You have a problem with power, DDoS, a changed CPU".
We disabled the watchdog and changed the power supply; we have 2 CCR1072s. The last thing we did was get a new device from our reseller, and (I think you can guess) it rebooted arbitrarily again and again...
Guys, are you serious? A lot of your clients have the same problem; you know about it and don't acknowledge it.

We bought your top device and you tell us to reduce its performance by almost 20% (CPU downgrade). That's not normal...

I do not recommend buying this device.

VagnerBecker · Tue May 22, 2018 5:21 pm

Hey, guys

After upgrading CCR1072 and CRS317-1G-16+ to version 6.42.2, the kernel crashes have stopped. I hope they have really solved this great problem.

Best Regards,

Vagner Felipe Becker

kos · Tue Jul 10, 2018 10:45 am

Unfortunately they have not! Neither in ROS 6.42.2 nor in any other ROS version!!!

The CCR1072 is not capable of handling 1 Gbps of bidirectional IMIX traffic and 25k sessions with connection tracking enabled, due to watchdog timer reboots!!!

Devices five to ten times cheaper, like the CCR1009 and RB1100AHx4, work fine under the same test conditions!

The worst thing is that they continue to sell this faulty device!

Contacting MikroTik support is a waste of time!

ElviN · Mon Jul 30, 2018 11:34 am

No, the device rebooted again with a kernel failure.
Problem not solved.

Jeanluck · Thu Aug 09, 2018 9:50 pm

I have a CCR1072 with RouterOS 6.38.3 and it works perfectly. I wanted to update it to 6.42.6 but reading this thread I'm afraid...
Has anyone ever tested if the problems go away with 6.42.6?

Can anyone tell me if these problems exist only after version 6.38? (I need to know if the problem is determined by the RouterOS version or by the configuration, in which case I understand that I should already have problems)

tduchch · Tue Oct 09, 2018 5:16 am

These random reboots are becoming very annoying. Looking for a replacement solution, but have not found one with at least 3 SFP+ ports. We do a lot of MPLS and VPLS on this box, have not been able to use CHR on either VMWare or Hyper-V.

Recently upgraded to 6.42.9 Long Term, and still get random watchdog reboots. We have a serial logging device connected to the console port, and there is no output before the reboot. Of course, nothing in the logs.

Has anybody received any support from Mikrotik on this issue?
Any other hardware that someone has used in place of the 1072?

Terry

Jeanluck · Tue Oct 09, 2018 1:33 pm

Works fine for me with 6.38.3, can you use this version?

Tue Oct 09, 2018 6:40 pm

There are multiple possible reasons why router is rebooting:

1) Hardware problem;
2) Software issue;
3) Overloaded device.

All of these three possible issues can be masked under log messages "rebooted due to Kernel Failure" , "rebooted due to Watchdog timeout", etc.

There is no reason to believe that issue, that you have on router, is the same issue as other user has just because log messages are equal. Only reason why you could believe that issue is the same is if problem appears for the first time at the same RouterOS version for multiple devices or you know how to trigger it.

In any other case contact MikroTik support. Only support has ability to see crashes and help you in order to diagnose issue and resolve the problem in its roots.

Multiple times we have seen reports "when you will fix Kernel Failure bug". This is equivalent with "when RouterOS will be a bug free version". There is no bug free software, but you can resolve your problem only by cooperating with MikroTik support. It is also the fastest way how to reach solution, after an upgrade to the latest version (if problem is already fixed by MikroTik).

Tue Oct 09, 2018 6:46 pm

If you have CCR device that is crashing, then I recommend that you make sure that latest RouterBOOT version is installed on your router. v6.42 includes latest firmware for TILE devices which could help due to this fix:

!) tile - improved system performance and stability ("/system routerboard upgrade" required);

tduchch · Tue Oct 09, 2018 7:06 pm

Yes we are running latest long term version 6.42.9
We have had this problem in several versions.
We have swapped out different hardware
We have opened support ticket, were told the only way to trouble shoot this was to turn off watch dog and see what appears on the console port.
The router is a two hour drive away, and is a core router for all our network, letting it lock up is not an option.
As stated, we have a logging device on the console port and nothing appeared before the reboot.
The device can reboot during high traffic, or very low. No pattern exists.
CPU load at high traffic is around 25%, profile has 10% Firewall and 10% network.
CPU does not seem to spike before reboot.
High traffic is just approaching 2Gbps, should be no problem for this device.
Sometimes it runs for days, sometimes only for hours.

I will open another ticket with support, hopefully this can be figured out.

kos · Mon Oct 15, 2018 1:50 pm

Hi strods,

as I mentioned earlier, the CCR1072 is not capable of handling 1 Gbps of bidirectional IMIX traffic and 25k sessions with connection tracking enabled.
To trigger the reboot there is no need for any other configuration, except two IP addresses, two routes and connection tracking enabled.

kos · Mon Nov 19, 2018 10:35 am

Your traffic is distributed over two interfaces. That is why you are not experiencing more frequent reboots.

I am speaking about the situation where two interfaces are loaded to 1G in each direction (1 Gbps bidirectional forwarding). So 1 Gbps enters sfp+1 and goes out through sfp+2, and at the same time 1 Gbps enters sfp+2 and goes out through sfp+1.

doush · Fri Nov 23, 2018 10:39 am

Frequent reboots EVERY DAY !
"Router was rebooted without proper shutdown by watchdog timer"

This issue is still not resolved, and we expect an answer or a possible reason from MikroTik.

MikroTik support is pretty much useless as of now.

As I say, it happens EVERY DAY and MikroTik is silent about the issue. This behavior is not right.

The CCR1072 is a useless device, as it has this major fundamental flaw, and anyone considering it for any deployment should avoid it.

A reply from MikroTik would be nice.

whoknew · Mon Nov 26, 2018 3:53 pm

We are having the same issue with the watchdog timer reboot.

disabling the watchdog causes the router to lock up until power is taken away. Full reboot with pulling power cables from both Power supplies.

We are looking for a solution to this. Our CCR1036 that this replaced did not have this issue. We are @ less than 10% load always, less than 2Gbps/300Mbps and only have static routes on this router as it is our core.

bradnz · Wed Nov 28, 2018 9:35 am

Hi all.

I have a CCR-1072 on 6.43.2 and this error was in the log "critical router was rebooted without proper shutdown by watchdog timer". I have sent the SUPOUT log to mikrotik and they are saying the following:

"For debugging you should turn off watchdog and test again.
"/system watchdog set watchdog-timer=no"

Connect now to your router over serial console, make sure that you have accessed RouterOS command line interface and leave console running.

Now router either:

1) might be stuck (freeze). If that happens, then you have to generate supout file on the router through serial console. Now you can reboot device.

2) might be stuck (freeze) and become unavailable over serial console. If that happens, then reboot router and then generate supout file.

3) will reboot. If that happens, then after reboot generate supout file.

Send supout file and full serial console output (within text file) to us for investigation."
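
The debugging steps quoted above can be sketched as the following commands, assuming RouterOS v6. The file name after-crash.rif is only an example.

```
# Sketch only: check the current watchdog settings, then disable the timer
/system watchdog print
/system watchdog set watchdog-timer=no
# After a freeze or reboot, generate a supout file (example file name)
/system sup-output name=after-crash.rif
# Once the investigation is finished, re-enable the watchdog
/system watchdog set watchdog-timer=yes
```

With the watchdog disabled the router may stay frozen until it is power cycled, which is the trade-off several posters in this thread describe.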

This is a production router that I have already swapped for our cold spare in the hope that it was just a power supply / hardware issue, but it would appear, based on this, that it's not. I can't just let it freeze, because it's in a data centre 20 mins from the office, and 20 mins of downtime would be unacceptable to our customers. What are your thoughts? Should I just replace it with a 1036, or a 1009? Are these any better? It runs BGP with only about 300 BGP routes, runs at about 9% CPU most of the time, and has almost all its memory available. Surely this can't be that hard for this router.

Any suggestions would be appreciated.

sch · Wed Nov 28, 2018 9:39 am

At first upgrade your CCR to latest RouterOS and firmware version. It may help.

kos · Wed Nov 28, 2018 11:28 am

Upgrading will not help! The 1009 is a better choice than the 1036.

bradnz · Thu Nov 29, 2018 5:55 am

I have upgraded to 6.43.4. I doubt it will help.

I actually wondered if this had something to do with the number of connections and the router not being able to handle it.

Either way, Im looking for an alternative hardware device now. I tried to change to the 1009 last night, but Im concerned about the performance, and also I use SFP+ that arent compatible with it. The interfaces never came up.

Incidentally, if I wanted to trial my theory of the number of connections does anyone know of tool I can use to do this?

kos · Thu Nov 29, 2018 11:16 am

RouterOS traffic-generator.

Don't be an optimist, because Mikrotik support claims "There are no hidden fixes"

whoknew · Fri Nov 30, 2018 8:24 pm

We are on 6.43.4 and the problem still exists. If you disable the watchdog timer, the router will hard lockup, all lights on and you will have to pull both power supplies.

Jeanluck · Fri Nov 30, 2018 10:11 pm

I had to change the CCR1072 because I just couldn't find a solution. New hardware and everything solved, with the same .backup of the unit that failed.

Do you have the option to change the hardware and check if it fixes the problem?
(remember to do a reset-macs if you load the .backup)

JULIOLIMA · Wed Dec 05, 2018 2:20 pm

For a few weeks now we have been going through the same problem: several restarts a day, causing disruptions, dissatisfaction and cancellations. We replaced the 1072 with another one and redid the settings from scratch by hand, but no procedure solved the problem. MikroTik support has so far offered only speculation and no definitive solution...

whoknew · Thu Dec 06, 2018 3:32 pm

There is a clear problem with the CCR 1072. Mikrotik we need refunds for these units or for the problem to be addressed. Please issue a statement here on the forum.

Jeanluck · Thu Dec 06, 2018 5:46 pm

I use the 6.38.3 without problems, except high cpu for pppoe disconnections (no nat, no masquerade, no reason...)
Precisely I don't update for fear of what I read is happening...

guipoletto · Thu Dec 06, 2018 9:13 pm

That is probably Conntrack clearing the connections table after the PPPOE disconnections. it can be very disruptive indeed.
Are ALL your PPPOE connections NAT'ed?

> if you don't use NAT at all, you can disable Conntrack
> if you only need NAT for part of the connections, you can try to create "action=notrack" rules in the RAW table of the firewall, to bypass Conntrack.
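
A minimal sketch of both options, assuming RouterOS v6. The 192.0.2.0/24 prefix is a documentation placeholder for traffic that does not need NAT. Note that with connection tracking disabled, NAT and stateful firewall rules stop working.

```
# Option 1 (no NAT anywhere): turn connection tracking off completely
/ip firewall connection tracking set enabled=no

# Option 2 (NAT only for some traffic): bypass Conntrack for the rest
# 192.0.2.0/24 is a placeholder prefix
/ip firewall raw add chain=prerouting action=notrack src-address=192.0.2.0/24
/ip firewall raw add chain=prerouting action=notrack dst-address=192.0.2.0/24
```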

As it's a bit off-topic, you should open a new topic if you want to dig further in the conntrack hole.

whoknew · Mon Dec 10, 2018 6:08 pm

Our CCR1072 has around 2Gbps going through it currently (about to go up to 4Gbps). Constant watchdog reboots, if I disable watchdog, I can only recover the CCR1072 by pulling power from both power supplies. Our setup is as follows:

6.42.6 is the firmware we are on.
We have 38 static routes (nothing dynamic)
No NAT rules.
No Queue rules.
No PPPoE
3 Firewall rules, 1 of which blocks winbox, ssh and telnet. 2 allow our internal subnets.
No Mangle rules.
2 SFP+'s only.
Less than 10% CPU usage.

We have a CCR1036-8G-2S+ and it does not reboot with the same configuration as we had it in place and running with an uptime of 372 days prior. I will be going back to our CCR1036 again in the middle of this month, a weekly lockup and reboot is uncalled for.

doush · Tue Dec 11, 2018 1:22 am

It is time for Mikrotik to seriously consider refunds for these CCR1072 units.
$3000+ for a router which cannot hold even 3 straight days of uptime is ridiculous.

Murmaider · Tue Dec 11, 2018 3:06 pm

Ours, which had been running on version 6.38 for months and months without any issues, was upgraded to 6.42.9 a month ago, and today it randomly rebooted with no clear reason why.

Jeanluck · Tue Dec 11, 2018 3:39 pm

That confirms what I thought... I will not move from 6.38.3, which works perfectly! In 6.42.1 the firmware was changed to improve performance for the CCR1072; I don't know if that has anything to do with it...

doush · Tue Dec 11, 2018 3:55 pm

v6.38 is vulnerable.
We can't use that.

Jeanluck · Tue Dec 11, 2018 4:35 pm

If you lock the 1072 down properly, with IP/Services restricted to your subnets plus the IP firewall, it is protected. If I have to choose between having it restarted every day by the watchdog or protecting it well manually...
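
A minimal sketch of that kind of lockdown in RouterOS v6 CLI syntax. The 192.0.2.0/24 management subnet and the choice of which services to keep are assumptions for illustration only.

```
# Allow management services only from a trusted subnet (placeholder prefix)
/ip service set winbox address=192.0.2.0/24
/ip service set ssh address=192.0.2.0/24
# Disable services that are not needed at all
/ip service disable telnet,ftp,www,api,api-ssl
# Drop management connections to the router itself from anywhere else
/ip firewall filter add chain=input protocol=tcp dst-port=22,8291 src-address=!192.0.2.0/24 action=drop comment="block management from outside"
```

This only limits who can reach the management services; it does not patch the underlying vulnerabilities in an old release.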

doush · Wed Dec 12, 2018 1:57 pm

Would be good to have a special 6.38.x version from Mikrotik with security patches applied.
So at least we can try it out.

whoknew · Wed Dec 12, 2018 4:56 pm

Can we get a Mikrotik response, please? The 1072 is ideal with the redundant PSUs, but I cannot be rebooting it once or more a week.

At the very least, like Doush said, give us a patched v6.38.xx specifically for the CCR1072.

Murmaider · Thu Dec 13, 2018 12:54 pm

whoknew wrote:
> Can we get a Mikrotik response please. The 1072 is ideal with the redundant PSU's but I cannot be rebooting it once or more a week.
>
> At the very least like Doush said, give us a v6.38.xx that is patched. specifically for the CCR1072.

I second this; otherwise I'm just going to downgrade to 6.38.7 and firewall off the security issues.

ElviN · Mon Dec 17, 2018 11:45 pm

Hello!
People, don't upgrade to 6.43.7!
After upgrading, the device has frozen three times in the last two days...

rime · Tue Dec 18, 2018 1:49 am

Hi,

uptime: 13w2d23h16m46s
version: 6.43 (stable)
build-time: Sep/06/2018 12:44:56
free-memory: 14.8GiB
total-memory: 15.8GiB
cpu: tilegx
cpu-count: 72
cpu-frequency: 1000MHz
cpu-load: 2%
free-hdd-space: 76.5MiB
total-hdd-space: 128.0MiB
architecture-name: tile
board-name: CCR1072-1G-8S+
platform: MikroTik

No problem here.

Deywid · Fri Dec 21, 2018 12:51 am

Unfortunately, the CCR1072 has a chronic problem that has not been resolved so far.
The Mikrotik team has not taken a position or made any statement about the situation.
We have 3 CCR1072 units in production with the same symptoms: when doing BGP with IPv6, they are restarted sporadically, every day, by the watchdog.
If the watchdog is not active, the unit freezes and only works again after both power supplies are switched off.
It does not matter whether the version is updated or not.

We might as well give it a nickname: the "frozen 1072".

Frozen kill

matheusazevedo · Fri Dec 21, 2018 5:41 pm

Hello guys, I have the same problem here with a CCR1036-8G-2S+. Anybody else?

bradnz · Wed Jan 02, 2019 10:37 pm

Murmaider wrote:
> I 2nd this, otherwise I'm just going to downgrade to 6.38.7 and firewall off the security issues.

Did you end up doing this, and has it restarted since? I'm about to do this. I had upgraded to 6.43.4 and it was fine for about 28 days, then it restarted twice in the space of 2 days. I simply can't turn off Watchdog to get the required log details: the data centre this is in is a 30-minute drive away, and our customers wouldn't be happy at all with that amount of downtime.

If there is hope that downgrading the FW will sort this out, I would rather just do that.

Deywid · Sun Jan 06, 2019 7:48 pm

We are still waiting for Mikrotik's official position on the CCR1072 freezes. So far there is no solution and no information.

Your best product has completely turned into your worst product. Fix it.

bradnz · Mon Jan 14, 2019 11:08 pm

I tried to downgrade to 6.38.3 and 6.38.7, and in both instances I had to recover the device from a boot loop using netinstall: the kernel couldn't be found because it couldn't mount the drive. I have now upgraded to 6.43.8. Such a pain in the ass, and I have now got really pissed off customers.

If this upgrade doesn't do anything to help, I'm probably going to start the process of getting a refund for the devices from our distributor and look at an alternative vendor. It's just ridiculous.

Deywid · Tue Jan 15, 2019 7:19 pm

SOLVED

After a lot of work and trouble, we have managed to solve the problem: our CCR1072 no longer restarts or freezes.

Solving it was relatively easy, after more than 3 months of waiting for the Mikrotik team to take a position on the problem.

We traded the CCR1072 for a Juniper MX-80.

whoknew · Tue Jan 15, 2019 11:13 pm

Deywid wrote:
> We traded the CCR1072 for an MX-80 Juniper.

So... you went from a $2,000 USD router to a $35,000 USD router?

Wed Jan 16, 2019 12:51 am

> > We traded the CCR1072 for an MX-80 Juniper.
> So....you went from a $2,000 USD router to a $35,000 USD router?

An MX80 is not a $35,000 router! Where the hell do you buy your gear from?

kos · Wed Jan 16, 2019 11:16 am

CCR1072 price is 3050$!

Deywid · Fri Jan 18, 2019 8:07 pm

That's not all.
Mx-80 = $ 6,700
CCR1072 = $ 3,000

When the plane is falling, paying double to survive is acceptable.

Kevo · Fri Jan 18, 2019 11:38 pm

For those whose 1072 is rebooting: are you running at 1200MHz or 1000MHz? We are running the long-term/bugfix version and haven't had any reboots or lockups. It has always been set to 1000MHz. It came that way as far as I know.
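
A quick way to check the clock from the CLI, as a sketch in RouterOS v6 syntax. The last line assumes the unit exposes cpu-frequency under the RouterBOARD settings, as the overclocking reports in this thread suggest, and that the change takes effect after a reboot.

```
# Show the clock the CPU is currently running at
:put [/system resource get cpu-frequency]
# Show the configured RouterBOARD settings, including cpu-frequency
/system routerboard settings print
# Return to the stock clock if the unit was overclocked (assumed to apply after a reboot)
/system routerboard settings set cpu-frequency=1000MHz
```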

I'm wondering what feature or setting makes the difference. We have 3 interfaces in use and run about 3Gbps through it during peak times.

We are only doing routing OSPF, BGP, and a bit of basic firewalling.

Deywid · Sat Jan 19, 2019 1:26 pm

1000MHz
BGP with IPv6 and IPv4
OSPF
Basic firewall
No full routes
iBGP

kos · Mon Jan 21, 2019 1:44 pm

It does not depend on features, but on traffic pattern and interface distribution.

Reboot requirements:
- connection-tracking activated
- clear routing between two interfaces
- 1G bidirectional traffic, 600B packets (each interface handles 1G in each direction)

If the packets are bigger:
- ~2.5G bidirectional traffic, 1500B packets

If the traffic is mostly unidirectional:
- ~2G unidirectional traffic, 600B packets
- ~5G unidirectional traffic, 1500B packets
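
One way to read these thresholds is as packet rates. Dividing each bit rate by the packet size in bits gives roughly 208,000 packets per second per direction for both bidirectional cases (about 417,000 in total) and roughly 417,000 packets per second for both unidirectional cases. This is only arithmetic on the figures above, ignoring Ethernet framing overhead, and it is an observation rather than a confirmed root cause. A sketch of the calculation in RouterOS script syntax:

```
# packets per second = bits per second / (packet size in bytes * 8)
# bidirectional, per direction: 1G at 600B and 2.5G at 1500B
:put (1000000000 / (600 * 8))
:put (2500000000 / (1500 * 8))
# unidirectional: 2G at 600B and 5G at 1500B
:put (2000000000 / (600 * 8))
:put (5000000000 / (1500 * 8))
```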

Kevo · Tue Jan 22, 2019 8:57 pm

I've just completed removing any connection tracking from ours. We had some DDoS issues and from what I've read connection tracking can be a big problem in that case. So I've rearranged my firewall rules and adapted things to run in the raw table as much as possible and turned off connection tracking. So far things seem good, but no real test to speak of yet. We really didn't need connection tracking on this router it was just not something I really considered much until the DDoS stuff started.

Hopefully this will keep us out of trouble until MT can resolve the issue completely with a future update or something.

doush · Tue Jan 22, 2019 10:50 pm

We have 6Gbit/s on it.
CT is on.
We have moved most of the rules to RAW, so only around 450Mbit of traffic is currently NATed and processed by the connection tracking table.
It rebooted again during a very small DDoS attack yesterday.

MT doesn't accept that there is a problem, so nothing will be fixed.
Check the beta thread.
viewtopic.php?f=21&t=139057&start=300#p707452

fr0zonza · Tue Jan 29, 2019 1:59 pm

We have a similar issue, except it affects ALL of my CCR routers (1009-1036) at different locations at the same time.
Interfaces flap at the exact same time and sometimes stay down until I disable and re-enable them. In extreme cases some routers lock up completely.
On my LHG radios I get:
"12:55:41 interface,warning wlan60-1: bridge port received packet with own address as source address (xxxxxxxxxx), probably loop"

This has been going on for months and there is still no help from support.
We run a fully routed network, so there is no chance of any loops.

cabijuan · Tue Feb 19, 2019 10:31 am

I have 3 CCR1072s and the same thing happens on all of them. I'm pissed with Mikrotik: it cannot be that this happens with their top-of-the-line router and they do not say anything about it.

bradnz · Mon Feb 25, 2019 3:03 am

Kevo wrote:
> I've just completed removing any connection tracking from ours. We had some DDoS issues and from what I've read connection tracking can be a big problem in that case. So I've rearranged my firewall rules and adapted things to run in the raw table as much as possible and turned off connection tracking. So far things seem good, but no real test to speak of yet. We really didn't need connection tracking on this router it was just not something I really considered much until the DDoS stuff started.

I have just removed all NAT and Mangle rules, which means that CT is actually not operating on the device at all. It's behind a transparent Fortigate firewall, so that shouldn't be an issue anyway. I have moved NAT and PAT services to a Cisco 3925 router now, and haven't yet seen a reboot. I wonder whether just turning CT off would have been the answer, rather than having to install another router for NAT/PAT. A bit annoying. Do you think it would have been OK if I had just turned off CT? How have things been since you did this?

kos · Mon Feb 25, 2019 10:34 am

Yes, when connection-tracking is disabled the device performs much better.

When I contacted Mikrotik support about device reboots they requested access to device. I provided them a test setup with one CCR1072 to act as traffic-generator and another one acting as DUT. No configuration on DUT, just routing and 4 FW rules to prevent unauthorized access.

After a few weeks of meaningless tests, the answer was: the DUT is overloaded. You can't see it, but it is.

With 1G? That is not even close to the test results published on your web site!

The reply: "Our tests are performed with connection-tracking disabled!"

That's all! You are ******!

I will say it again: CCR1009, RB1100x4 and RB4011 work as expected in exactly the same test conditions in which the CCR1072 reboots itself!

cabijuan · Mon Feb 25, 2019 10:38 am

Another restart this Saturday. I am desperate: the router is 400km away. Please, Mikrotik, say something about it.

cdemers · Mon Feb 25, 2019 4:28 pm

Email Mikrotik support; they don't monitor the user forums for people having problems. Normus and a few others frequent here, but it is best just to email them a support ticket with a supout file.

cabijuan · Mon Feb 25, 2019 5:32 pm

I always send it, and the only thing they tell me is to update; I have already tried 5 different versions. Sending it does not help. They have no idea what causes the restarts.

djdrastic · Wed Mar 27, 2019 11:54 am

Reading this thread as I have to upgrade a 1016 due to lack of SFP+ ports . Has worked 1y without skipping a beat barring the security updates.
Am I better off getting an EdgeRouter Infinity or building a cheap x86 box?

doush · Sat Mar 30, 2019 4:34 pm

Don't upgrade to the CCR1072. It is not a stable product to work with.
Frequent reboots etc.

whoknew · Thu Apr 04, 2019 4:53 pm

> > > We traded the CCR1072 for an MX-80 Juniper.
> > So....you went from a $2,000 USD router to a $35,000 USD router?
> A MX80 is not a $35000 router! Where the hell do you buy your gear from ?

Where are you buying them from? Distributor list prices all show $19,000+ USD. I have a friend who works for a larger ISP and they can get them for around $6,800 directly from Juniper.

wildbill442 · Tue Sep 17, 2019 12:56 am

> > Hey, guys
> >
> > After upgrading CCR1072 and CRS317-1G-16+ to version 6.42.2, the kernel crashes have stopped. I hope they have really solved this great problem.
> >
> > Best Regards,
> >
> > Vagner Felipe Becker
> Unfortunately they are not! Neither in ROS 6.42.2 nor in any other ROS version!!!
>
> CCR1072 is not capable to handle 1Gbps bidirectional IMIX traffic and 25k sessions with conection-tracking enabled, due to watchdog timer reboots!!!
>
> Five to ten times cheaper devices like CCR1009 and RB1100AHx4 are working fine in same test conditions!
>
> The worst thing is that they are continuing selling this faulty device!
>
> Contacting Mikrotik support is a lost of time!

Looks like I'm having a similar issue...

viewtopic.php?f=2&t=152192

Fluke · Tue Sep 17, 2019 6:45 am

Out of curiosity - what is the device temperature? (system health print)

I have a device that runs at ~53C and reboots occasionally, and another one that runs at ~39C and works OK.

Jeanluck · Tue Sep 17, 2019 11:17 am

My CCR1072 works at 37-39

cabijuan · Tue Sep 17, 2019 12:00 pm

Forget about the temperature. It is a kernel problem that Mikrotik does not know how to solve, and they do not pay attention to us either. Hopefully this is solved with v7; it is the only hope I have left. It is nonsense that this happens to the top-of-the-range router.

mac86 · Tue Sep 17, 2019 4:51 pm

We've downgraded CCR1072 to 6.44.5 and no more reboots.

Jeanluck · Tue Sep 17, 2019 6:08 pm

can anyone confirm that version 6.44.5 is stable with the CCR1072?

wildbill442 · Tue Sep 17, 2019 6:21 pm

Jeanluck wrote:
> can anyone confirm that version 6.44.5 is stable with the CCR1072?

We’ve just downgraded to the latest long-term release; I’ll update in 48hrs.

Jeanluck · Tue Sep 17, 2019 6:36 pm

Please let us know how it works in a few days

cabijuan · Tue Sep 17, 2019 7:31 pm

I have 6.43.8, and I have reboots.

Jeanluck · Tue Sep 17, 2019 7:57 pm

Thanks, 6.43 discarded then... let's see if there's luck with 6.44.
I have the 6.38.3 and only had one reboot in 3 years

mac86 · Tue Sep 17, 2019 7:58 pm

Jeanluck wrote:
> can anyone confirm that version 6.44.5 is stable with the CCR1072?

Yes, I can confirm it.
4 weeks without reboots, and counting...

Jeanluck · Tue Sep 17, 2019 8:16 pm

What reboot frequency did you have previously?

miltont · Tue Sep 17, 2019 8:34 pm

Up 60+ Days

wildbill442 · Tue Sep 17, 2019 10:49 pm

Jeanluck wrote:
> What reboot frequency did you have previously?

One to multiple times in a 24HR window.

Mikrotik support wanted me to turn off watchdog and log error / generate supout.rif once locked. The generated supout.rif files I provided after the watchdog reboot didn't yield any useful information. I opted to netinstall the latest long term release track (6.44.5). We also replaced all the SFP modules as a precaution during the maintenance window.
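
For reference, the procedure support asked for looks roughly like this in RouterOS v6 CLI syntax. This is a sketch of the general steps, not the exact instructions support sent.

```
# Stop the watchdog from rebooting the router so a hang can be inspected on the console
/system watchdog set watchdog-timer=no
# Once the fault has been captured, generate a support file by hand
/system sup-output name=supout.rif
# Re-enable the watchdog afterwards and let it save a supout automatically on failure
/system watchdog set watchdog-timer=yes automatic-supout=yes
```

With the watchdog off, a locked-up router stays down until someone power-cycles it, so this is only practical with console or on-site access.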

Mikrotik Support's Recommendation:

Before that, I suggest that you upgrade to the latest "stable" version (if there is an actual bug, then it might be already fixed). For example, v6.45.6 fixes a Watchdog reboot caused by h323 firewall helper. If your router did process voice call traffic, then the issue might be already resolved.

What's new in 6.45.6 (2019-Sep-10 09:06):
*) conntrack - improved system stability when using h323 helper (introduced in v6.45);

These are our edge routers and no NAT is being performed and we don't use h323 internally, but if some malformed h323 packet traversing these routers was causing the reboot then this may have been the cause behind the sporadic reboots. Again I'll post results of the downgrade to the Latest Long Term Release track after ample time has passed. So far ~15hrs and no reboot, this won't be conclusive until we make it past the 24-48hr mark. From what other users above have posted I'm feeling optimistic.

wildbill442 · Tue Sep 17, 2019 11:34 pm

Here's a sequence of events on how this issue started for us:

It began with a kernel failure on one of our edge routers; I'll call it EDGE A. To fix this we decided to do a netinstall on EDGE A and bring the router up to the latest stable release at the time (6.45.5) due to CVE fixes etc. Prior to the netinstall we were running 6.42.6 on both edge routers, and other than the kernel failure on EDGE A we were not experiencing reboots. We also upgraded our other edge router, EDGE B, to 6.45.5 in the same maintenance window.

The reboots continued after the netinstall on EDGE A; only the error changed, from "reboot due to kernel failure" to "reboot by watchdog timer". At this time we were under the impression there was a hardware issue with the EDGE A router, so we moved the BGP peer and other connections to EDGE B and overnighted new hardware. The reboots then started happening on EDGE B.

We opened a ticket with Mikrotik support, but after going through the weekend and into Monday with no response and the routers rebooting sporadically, we decided to netinstall the latest long-term release track (6.44.5). Mikrotik support responded after we were done with our downgrade, reaffirmed our suspicion that there was something wrong in software, and told us to upgrade to the latest stable release (6.45.6) due to a stability improvement relating to H323 and watchdog reboots. As we did not need any of the "new" features in the Stable Release tree, and received this information after downgrading to the Long-Term Release tree, I think we're going to stay here unless the issue persists.

Again, I won't know definitively if this was the root cause until ample time has passed, so I'll update after we get past that 48hr mark.

hytanium · Tue Oct 08, 2019 5:05 am

We are moving over 8Gbps on our 1072 at peak and have started seeing this random reboot watchdog error. We replaced with another 1072 thinking it was hardware related...once again, same random issue. Sometimes we go days without a reboot, sometimes a couple reboots in a day. It has been very disruptive.

I have disabled the h323 service as someone pointed out it may be part of the issue. It does seem to be load or random packet related. Any insight would be helpful. We are on 6.45.6.
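
For anyone wanting to try the same thing, a sketch of how the h323 helper can be disabled in RouterOS v6 CLI syntax:

```
# List the connection tracking helpers and their state
/ip firewall service-port print
# Disable only the h323 helper
/ip firewall service-port set [find name=h323] disabled=yes
```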

Jeanluck · Tue Oct 08, 2019 11:41 am

Could it be some random attack? Try logging the CPU usage every second and look for spikes.
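
A sketch of one way to do that with the scheduler, in RouterOS v6 CLI syntax. The scheduler name cpu-log and the 80% threshold are arbitrary choices for illustration.

```
# Check cpu-load once per second and write a log line whenever it is above 80%
/system scheduler add name=cpu-log interval=1s on-event=":local load [/system resource get cpu-load]; :if (\$load > 80) do={ :log warning (\"high cpu-load: \" . \$load) }"

# For an interactive per-process view while the load is high
/tool profile
```

Log entries kept only in memory are lost on reboot, so sending the log to disk or to a remote syslog server is worth considering if the goal is to see what happened just before a watchdog reset.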

kos · Tue Oct 22, 2019 4:59 pm

I wrote it before. It is clear:

Reboot min requirements:

- connection-tracking activated
- clear routing between two interfaces
- 1G bidirectional traffic, 600B packets (each interface handles 1G in each direction)

If the packets are bigger (1500B):
- ~2.5G bidirectional traffic

If the traffic is mostly unidirectional:
- ~2G unidirectional traffic, 600B packets
- ~5G unidirectional traffic, 1500B packets

Everyone who has a spare CCR device could do the test by using traffic generator.

Maggiore81 · Sat Dec 14, 2019 11:31 am

Hello.
You said that removing CT frees CPU. Correct.
But in a situation where you don't have any queues, just 3-4 NAT rules, you need CT on.
I have enabled FastTrack and the CPU load is very low even with 3Gbit of traffic.
I do just BGP (not a full routing table) and some RAW rules, and it works flawlessly.

sakirozkan · Thu Jan 16, 2020 2:29 pm

kos wrote:
> Everyone who has a spare CCR device could do the test by using traffic generator.

Can you record a video of this? I think if you do, Mikrotik will engage with the topic.

kos · Mon Jan 20, 2020 1:21 pm

sakirozkan wrote:
> Can u record a video for this. I think if u do this Mikrotik will relate the topic.

Please refer to #103

I could make the conversation with Mikrotik support public, but it is very long and disappointing.

cabijuan · Mon Jan 20, 2020 2:10 pm

The only solution is NOT TO BUY this model. It is a scam.

StubArea51 · Mon Jan 20, 2020 8:57 pm

cabijuan wrote:
> The only solution is to NOT BUY this model is a scam.

I wouldn't say that. We have a lot of clients that use it successfully, and when it first came out we were able to sustain 80 Gbps of iperf traffic without issue.

I am curious about the config and conditions that are causing the reboots. From reading the history, it seems to be a different fix for each of the different people that are posting.

kos · Thu Jan 23, 2020 4:00 pm

So, could someone who has a working CCR1072 tell me what is wrong with the configuration in this video:

https://www.youtube.com/watch?v=J5arAJnI62I

sakirozkan · Thu Jan 23, 2020 10:37 pm

kos wrote:
> So, could someone who has a working CCR1072 tell me what is wrong with the configuration on the video:
>
> https://youtu.be/TAWQRaplnsM

+++

glueck05 · Fri Jan 24, 2020 10:45 am

disable connection tracking

cabijuan · Fri Jan 24, 2020 11:08 am

glueck05 wrote:
> disable connection tracking

It is not normal that a €3000 router cannot have connection tracking enabled, right?

cabijuan · Fri Jan 24, 2020 11:09 am

kos wrote:
> So, could someone who has a working CCR1072 tell me what is wrong with the configuration on the video:
>
> https://youtu.be/TAWQRaplnsM

Don't go crazy. It's not you; it's the router that is a scam.

kos · Fri Jan 24, 2020 3:44 pm

glueck05 wrote:
> disable connection tracking

So, connection-tracking ON is a wrong configuration?

Jeanluck · Fri Jan 24, 2020 6:19 pm

Go back to 6.38.3 and all will work fine... (close the service ports to avoid vulnerabilities)

mada3k · Fri Jan 24, 2020 9:44 pm

Connection tracking is a NAT/firewall feature. I'm not sure that Cisco or Juniper routers even do connection tracking in that manner.

But of course no device should reboot by itself.

kos · Mon Jan 27, 2020 3:06 pm

Jeanluck wrote:
> Go back to 6.38.3 and all will works fine... (close service ports for avoid vulnerabilities)

Absolutely the same with 6.38.3. I tested again. There is no working ROS version!

Maggiore81 · Mon Jan 27, 2020 3:08 pm

Please show the firewall section.
Did you enable FastTrack?

kos · Mon Jan 27, 2020 4:23 pm

Maggiore81 wrote:
> please show the firewall section.
> did you enable FAST TRACK ?

FastTrack just bypasses conntrack. It is clear that without conntrack the device performs much better.

Connection tracking is mandatory in many setups!

Mikrotik does not warn users that the CCR1072 is not able to perform normally with connection tracking enabled! This is awful!

So, the features missing without connection tracking are:
- NAT
- firewall:
  - connection-bytes
  - connection-mark
  - connection-type
  - connection-state
  - connection-limit
  - connection-rate
  - layer7-protocol
  - new-connection-mark
  - tarpit

Maggiore81 · Mon Jan 27, 2020 4:28 pm

I agree, but with FastTrack you can have conntrack enabled with very low CPU load.
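
For readers who have not used it, a typical FastTrack setup looks like this in RouterOS v6 CLI syntax. This is a generic sketch, not the poster's actual rule set.

```
# FastTrack established/related connections so later packets skip most of the firewall path
/ip firewall filter add chain=forward connection-state=established,related action=fasttrack-connection
# Packets that cannot be fasttracked still need a normal accept rule
/ip firewall filter add chain=forward connection-state=established,related action=accept
```

Fasttracked connections also bypass queues and mangle rules, so this only suits setups that do not rely on those for forwarded traffic.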

sakirozkan · Mon Feb 03, 2020 11:38 am

I changed the connection tracking timeouts, and now have 15 days of uptime with 6.46.1:

enabled: auto
tcp-syn-sent-timeout: 2s
tcp-syn-received-timeout: 2s
tcp-established-timeout: 20s
tcp-fin-wait-timeout: 5s
tcp-close-wait-timeout: 5s
tcp-last-ack-timeout: 5s
tcp-time-wait-timeout: 5s
tcp-close-timeout: 5s
tcp-max-retrans-timeout: 1m
tcp-unacked-timeout: 1m
loose-tcp-tracking: yes
udp-timeout: 5s
udp-stream-timeout: 1m
icmp-timeout: 3s
generic-timeout: 1m
max-entries: 1048576
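
The printed values above can be applied in one command. A sketch in RouterOS v6 CLI syntax, using exactly the values from the listing; max-entries is left out on the assumption that it is a value reported by the router rather than a setting.

```
/ip firewall connection tracking set enabled=auto \
    tcp-syn-sent-timeout=2s tcp-syn-received-timeout=2s \
    tcp-established-timeout=20s tcp-fin-wait-timeout=5s \
    tcp-close-wait-timeout=5s tcp-last-ack-timeout=5s \
    tcp-time-wait-timeout=5s tcp-close-timeout=5s \
    tcp-max-retrans-timeout=1m tcp-unacked-timeout=1m \
    loose-tcp-tracking=yes udp-timeout=5s udp-stream-timeout=1m \
    icmp-timeout=3s generic-timeout=1m
```

These timeouts are far shorter than the defaults. In particular, a 20s established timeout means idle TCP connections leave the table quickly, which may break long-lived idle sessions through NAT.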

Maggiore81 · Mon Feb 03, 2020 12:49 pm

Set established to 1 hour.

sakirozkan · Thu Feb 27, 2020 10:09 am

I am still using the connection tracking timeouts from my Feb 03 post (enabled: auto, tcp-established-timeout: 20s, udp-timeout: 5s, and the rest as listed there), and uptime is now 39 days.

Jeanluck · Sat Apr 11, 2020 2:44 pm

I upgraded from 6.38.3 (working fine for years) to 6.45.8 and the CCR1072 died completely within 48 hours. Even the configuration was lost.
I'm waiting for Mikrotik to tell me if it's a failure of the unit or of RouterOS 6.45.8.

mrtrca · Sat May 02, 2020 12:07 pm

According to our experience;
The only solution is to turn off connection-tracking.
There is no other solution.
I don't think the support team has an idea about this either.

Jeanluck · Sat May 02, 2020 1:05 pm

I bought a new 1072 and it works perfectly.
Conclusion: I think Mikrotik has modified something in the hardware of the CCR1072 since some date, and the first units (I don't know for how long) don't work with new RouterOS versions, and the newly bought ones do. This is just a hypothesis.
If all CCR1072 units had that problem with connection tracking, there would be major complaints and allegations.

cabijuan · Sun May 03, 2020 4:48 am

I sincerely believe that disabling connection tracking is not an option. I have not spent almost €3000 to not have all the features. Mikrotik should solve this problem and not turn a deaf ear to this situation. This is not a €100 router; it is worth a lot of money.

dmayan · Wed May 13, 2020 1:39 am

This is so hilarious.

New 1072 @1000mhz , 6.45.8 with routerboard upgraded (CRASHED RIGHT AS I'm WRITING THIS), dual PSUs from different Eaton Online UPS.

Things I did:
- Changed both PSUs.
- Changed all SFPs modules (As Mikrotik support blamed them after sending the autosupout)
- Disabled watchdog and captured console output. Zero output, router just freezes.
- No way to disable route cache. This board moves around 25gbps with CG-NAT.
- CHANGED THE WHOLE ROUTER... STILL REBOOTING... still the same behaviour. This is obviously a software issue, as the other 1072 went to a filtering role (with the same traffic level) and DOESN'T REBOOT.

We can get almost a day of uptime on very good days... normally it reboots every 4 or 5 hours.

Seriously Mikrotik? What's your answer? We have 20 1072, and thousands of minor models. We are going the HUAWEI route now. I prefer chinese spying than your lies and promises.

cabijuan · Wed May 13, 2020 9:55 am

It is shameful that Mikrotik does not even answer this topic. It is your highest end router and it has problems, I hope people read this and don't buy it. DO NOT BUY THE MIKROTIK CCR1072. I only have 2, but they handle all my communications core. A router embarrassment.

Lonecrow · Wed May 13, 2020 4:22 pm

Oh snap. Same issue with 3 of them. Supout shows nothing and I've looked into this heavily.

Every 4-5 hours? WOW.

To me it seems almost as if some sort of counter or threshold is being hit. It isn't traffic, because it reboots when traffic is low too. It's as if a counter needs to be reset and gets locked.

Jeanluck · Wed May 13, 2020 6:14 pm

In my experience, of 4 units I bought, 2 units were defective at the hardware level.
The ones I have now work fine with 6.44.6. I am afraid that, in addition to a problem with RouterOS, there is a hardware quality problem with this model.

Lonecrow · Wed May 13, 2020 6:48 pm

For those of you who have disabled connection tracking or tweaked the connection tracking settings

Was it successful?

sakirozkan · Thu May 21, 2020 2:49 pm

Lonecrow wrote:
> For those of you who have disabled connection tracking or tweaked the connection tracking settings
>
> Was it successful?

We use it and it works very well. Before these settings, the CCR1072 would restart every one or two days.

[Attached screenshots: 1072.JPG, 1072 connection.JPG]

meshnet · Tue Jun 09, 2020 10:01 pm

Just FYI,
Brand new one we just turned up.. Same issues as all 4 before it.. Watchdog reboots..
CG-NAT/firewall/10+gb traffic = reboot..
Sent a supout.. probably get the same answer as everyone else..

R

kos · Wed Jun 10, 2020 11:48 am

meshnet wrote:
> Just FYI,
> Brand new one we just turned up.. Same issues as all 4 before it.. Watchdog reboots..
> CG-NAT/firewall/10+gb traffic = reboot..
> Sent a supout.. probably get the same answer as everyone else..
>
> R

Welcome to the club!

Better not to waste your time contacting support. I think your only chance is to try ROS 7, if it works for you. If ROS 7 doesn't work for you, you have to change the router.

Lonecrow · Thu Jun 18, 2020 7:55 pm

My boss is starting to force me toward Juniper. Please Mikrotik - do something about this. I have like 30 CCR's out there. Wouldn't want to be forced into changing things.

Today it rebooted on its own in the middle of the day, the boss was stock trading and lost a lot of money. Needless to say this issue is problematic for us.

I did make the changes you suggest up there in the connection tracking. I need to keep it enabled because of the NAT we do.

dmayan · Thu Jun 18, 2020 8:02 pm

> > Just FYI,
> > Brand new one we just turned up.. Same issues as all 4 before it.. Watchdog reboots..
> > CG-NAT/firewall/10+gb traffic = reboot..
> > Sent a supout.. probably get the same answer as everyone else..
> Welcome to the club!
>
> Better don't waste your time contacting the support. I think that your only chance is to try ROS 7, if it works for you. If ROS 7 doesn't work for you, you have to change the router.

Changing the hardware didn't work for me. The same configuration yielded the same results on two brand-new (zero-hour) CCRs. The first one, which rebooted constantly, now has 130 days of uptime on another configuration. It's something on the software side, at least for me.

Maggiore81 · Fri Jun 19, 2020 8:03 am

The CCR2004 has now come out, with a lot of 10G ports. It may work...

hytanium · Fri Jun 19, 2020 2:14 pm

Confirmed that turning off connection tracking eliminates issue...moving over 9Gbps traffic through.

Maggiore81 · Sat Jun 20, 2020 9:05 am

Was it tried with FastTrack and default timings, except for established, which can be safely set to 5 min?

Lonecrow · Tue Jul 21, 2020 8:10 pm

What about those of us who NEED to leave conntrack on? I need to do some nat at this router so some of the devices behind it can get updates and they can't be natted further up or down.

These constant weekly reboots are getting out of hand.

I went from scheduled reboots to, one firmware version later, getting a reboot weekly and sometimes every few days. The supout tells us nothing.

Mikrotik where are you? There have been others that use this product that have the same exact issue.

degree · Sat Sep 26, 2020 4:48 pm

We have the same problem as described on two different 1072s. We’ve run traffic generators, and the routers handle packets up to 100% CPU without any problems, but there are not a lot of connections in the various «traffic generators». Does anyone know a generator that creates a lot of connections, so we can test whether disabling connection tracking solves it? I’d rather not put them back in production to test that.

CoMMyz · Sat Sep 26, 2020 11:14 pm

CCR1072 -> Doing a total of 5G throughput with a lot of connections and 1 DHCP Server + IPV6 + DNS Server + SNMP + 33 Vlans + NAT (106 rules) + Firewall (46 rules) + Raw firewall (66 rules) + Routing (2776 routes) + 7 BGP peers and 2 instances + OSPF + Watchdog enabled - no queuing, no discovery, no cloud. Only a few tunnels about 10 L2TP with PPPoE. Bridge filtering is active.

Never had an issue with stable version 6.45.9. Now on 6.46.6 with 122days uptime again no issues. CPU is at 1000mhz.

kos · Mon Sep 28, 2020 10:34 am

Check post #127

Probably your traffic consists mostly of big packets and it is distributed over more than two interfaces.

In my opinion, activating more features (some of them) in fact improves device condition, because the traffic is distributed over more cores. For some Mikrotik devices, I have seen better test results in NAT mode than in plain routing.

Good luck!

CoMMyz · Mon Oct 05, 2020 2:13 am

Traffic exists on a single 10G SFP port with a VLAN tag as well on top of this.
Traffic consists mostly of small packets - TX/RX 65-127 is actually the largest number out of all of them.

The cause of the watchdog reboots is probably some specific features/items indeed.

kos · Mon Oct 05, 2020 10:24 am

That is interesting. You may have a newer hardware revision. Could you check your revision (/system routerboard print)?

It is not about specific features, just connection-tracking. See the video below.

https://www.youtube.com/watch?v=J5arAJnI62I

antoxic · Tue Oct 06, 2020 5:24 pm

Hi guys!
To begin, I just want to give 5 stars to the MX80 solution! We are heading that way, slowly replacing Mikrotik with other vendors.

We have a CCR1072. For a very long time it ran 6.37.4 and rebooted itself once every few months, but recently we had 2 reboots within 2 months. So we decided to upgrade it. We installed 6.45.9 last night. To begin with, the router did not have all of its configuration after the upgrade. And now (less than 12h later), I just had a reboot by watchdog with no supout file generated! And we had less than 1G of traffic.

Now the device is running at 1000 MHz and I will try to disable connection tracking. I will let you know if something changes.

Btw, we used 6.37.4 because newer versions were really unstable and could not handle the same amount of traffic.

abdurrazaqa · Wed Oct 14, 2020 10:02 am

Guys, I experienced the WATCHDOG reboot on a CCR1072.

1. Used only for CGNAT (514 NAT rule entries for 65000 connections using netmap) and PBR
2. Not using any routing protocols.

It is running v6.46.7.

It ran successfully for 4 days carrying 5.2 Gbps at 48% CPU load.

All of a sudden it was rebooted by the watchdog, even though the connection tracking entries were at around 900000.

I would like to know the impact of setting tcp-established-timeout=15m.

MikroTik guys, kindly respond in this thread; it has been active for some years awaiting an answer.

antoxic · Wed Oct 14, 2020 10:46 am

You should open a ticket and send them the supout.inf file generated automatically after watchdog rebooted your router. I don't think you will get any help from this topic, only guesses.

kos · Wed Oct 14, 2020 11:22 am

Date of first post here is Jun 12, 2017. I think that a lot of tickets have been opened and until now nobody has been helped.

abdurrazaqa · Thu Oct 15, 2020 3:08 pm

Unfortunately, the supout file is not created after the reboot.

abdurrazaqa · Thu Oct 15, 2020 3:10 pm

ngw01.JPG

antoxic · Thu Oct 15, 2020 3:21 pm

Do you have "Automatic Supout" enabled in System > Watchdog menu?

watchdog.jpg

If you do have that enabled and it still does not generate a file, you can do it via System > Scheduler by adding a new script that is executed at boot.

supout-script.jpg
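
For anyone reading without the screenshots, the two settings described above might look roughly like this on the RouterOS v6 command line. This is only a sketch and not a copy of the screenshots: the scheduler name supout-at-boot and the file name after-reboot.rif are made up for illustration.

```routeros
# Ask the watchdog to generate a supout file automatically after it reboots the router
/system watchdog set automatic-supout=yes

# Fallback: generate a supout file on every boot via the scheduler
# (scheduler name and file name are hypothetical)
/system scheduler add name=supout-at-boot start-time=startup \
    on-event="/system sup-output name=after-reboot.rif"
```

Keep in mind that a file generated at boot describes the router after the restart, so it may not show what was happening at the moment it hung.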

antoxic · Thu Oct 15, 2020 3:24 pm

Did you have these reboots before? Is it possible that they started after a firmware upgrade and everything was OK with the previous firmware?

abdurrazaqa · Fri Oct 16, 2020 9:28 am

These are a recent purchase of 8 units, about 12 days ago. I did the software and RouterBOARD firmware upgrade prior to configuring them.

So far one unit has rebooted once.

A total of four units are configured in production:
1. Two units, IGW01 and IGW02 (doing BGP), no NAT, connection tracking disabled (no issues so far for 10 days).
2. Two units, NGW01 and NGW02 (NAT with netmap and PBR). NGW01 was rebooted once by the watchdog; thankfully, while NGW01 was down, NGW02 took over.

Thanks for sharing the script, I will configure it and see if it happens again.

abdurrazaqa · Mon Oct 19, 2020 9:30 am

We had a reboot yesterday, this time on NGW02 for the first time, and a supout.rif was created.

Let me create a ticket and see

abdurrazaqa · Tue Oct 20, 2020 8:07 am

No errors in the supout.rif, since it is considered a normal boot. I need to create the supout.rif while the router is frozen, and I need to disable the watchdog for that.

Guys, please advise me on protecting a CGNAT-enabled router from DDoS etc.

antoxic · Tue Oct 20, 2020 8:28 am

Wow, this is insane. You probably won't be able to do that because your router will be hanging. Awesome support. 5 stars.

The fastest and easiest solution that I can advise is to run RouterOS on an x86 machine, or maybe on a CHR, but it may be too much for a virtualized device. A bare-metal x86 machine running RouterOS with a big CPU (or two) should be OK.

The right solution is to go for another vendor. I know that Huawei has some CGNAT devices (I prefer them spying on me rather than those reboots). I like Juniper, but the MX240 needs a separate line card for that. The MX204 should fulfill all your needs. Any used Juniper device will give you better results. Mikrotik is OK just to begin; when you grow up, you need to find something serious.

abdurrazaqa · Wed Oct 21, 2020 12:47 pm

thank you for the feedback

doush · Wed Oct 21, 2020 8:49 pm

Isn't it weird that MT is completely silent about this issue? :)

antoxic · Thu Oct 22, 2020 5:45 pm

They just can't fix it and they don't want to admit that they sell 'buggy' equipment. I guess their sales are going OK, and it is not happening to everybody, so they can just ignore it. We replaced one of the CCR1072s with a Juniper MX, and now the CCR is serving a small office doing some DHCP, NAT, firewall and other things typical for a very small office, and believe me, it is working perfectly (showing 0% load). Well, you will think something like, WHAT!!?? A 3k router for 15 people? And the answer is yes, because the other option was using it to replace a broken leg of a storage shelf.

I almost stopped hating Mikrotik after I figured out their limits. They just can't compete with the hardware guys and ASICs. Just use them to start, earn some money and buy proper equipment. That's it! :)

P.S.: It is not their first router with reboots. I have an RB4011 at home and it was randomly rebooting once a month or something like that. At some point it stopped, but I'm not sure if that is because I disabled something or because of the new firmware. And yes, I have also bought 2x SRX300 to replace it with a cluster. I just can't live with the feeling that my router can reboot at any time, having 2 internet providers connected and a UPS; it's just not right.

Maggiore81 · Mon Oct 26, 2020 2:10 pm

Is there any news on this issue?
I need an upgrade for our 1036 and was looking at the 1072, but I have read about so many issues.
Maybe the issue is resolved in recent hardware?

abdurrazaqa · Tue Oct 27, 2020 4:19 pm

Hi,
It also depends on your config.
I have tweaked the connection tracking; so far no reboots for a week.

Maggiore81 · Tue Oct 27, 2020 5:02 pm

Perfect
Are you using FASTTRACK? Can you post your config (without sensitive information)?

abdurrazaqa · Wed Oct 28, 2020 8:24 am

Hi,

My config is very simple,

I am using these routers as NGW01 & NGW02 (NAT gateways). The major purpose is to do NAT for 65000 active connections. I am using netmap NAT (to achieve this I have around 514 NAT rules) so that users can be tracked.
Redundancy between the two routers is based on PBR.

Allow Fast Path is enabled.

ip firewall connection tracking print
enabled: auto
tcp-syn-sent-timeout: 2s
tcp-syn-received-timeout: 2s
tcp-established-timeout: 1h
tcp-fin-wait-timeout: 5s
tcp-close-wait-timeout: 5s
tcp-last-ack-timeout: 5s
tcp-time-wait-timeout: 5s
tcp-close-timeout: 5s
tcp-max-retrans-timeout: 1m
tcp-unacked-timeout: 1m
loose-tcp-tracking: yes
udp-timeout: 5s
udp-stream-timeout: 1m
icmp-timeout: 3s
generic-timeout: 1m
max-entries: 1048576
total-entries: 725206

I suspect the reboot I noticed earlier was possibly due to a DDoS attack that caused an overflow of connections.
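
To see whether the table is actually getting close to max-entries before a reboot, something like the following could be run periodically from the scheduler. This is a rough sketch for RouterOS v6: the variable names and the 90% threshold are arbitrary choices, and it assumes that "get" returns total-entries and max-entries as plain numbers.

```routeros
# Log a warning when the connection tracking table is more than 90% full.
# The 90% threshold is an arbitrary example value.
:local total [/ip firewall connection tracking get total-entries]
:local maxEntries [/ip firewall connection tracking get max-entries]
:local pct ($total * 100 / $maxEntries)
:if ($pct > 90) do={
    :log warning ("conntrack table at " . $pct . "% (" . $total . " of " . $maxEntries . ")")
}
```

With the values printed above (725206 of 1048576) the table is at about 69%, so this would stay quiet until the count grows well beyond the current level.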

CURRENT CPU LOAD:
uptime: 2w22h29m21s
version: 6.46.7 (long-term)
build-time: Sep/07/2020 07:38:56
factory-software: 6.28
free-memory: 14.2GiB
total-memory: 15.8GiB
cpu: tilegx
cpu-count: 72
cpu-frequency: 1000MHz
cpu-load: 26%
free-hdd-space: 83.2MiB
total-hdd-space: 128.0MiB
architecture-name: tile
board-name: CCR1072-1G-8S+
platform: MikroTik

As you can see, the load is 26%. My current traffic on each router is 2.8 Gbps, so I expect the CPU to run out once I reach around 8 Gbps.

My advice: if you are buying a CCR1072, make sure your setup has redundancy, so that if it reboots you have time to react and find the root cause.

Maggiore81 · Wed Oct 28, 2020 10:29 am

Well
I have 1036 with about 5Gig connections, and we do conntrack + fasttrack and we are about at 20% at peak time.
Are you sure that fasttrack is enabled correctly ?

Can you print your config with hide-sensitive? Or you can send privately in a private message?
I am about to buy a 1072 and I want to be sure that everything is fine.

joarc · Wed Oct 28, 2020 3:16 pm

We haven't had any watchdog reboots on our multiple 1072s, and we only use them for routing (OSPF and BGP) and MPLS/VPLS, no conntrack or firewall or VPNs or stuff like that. We have had them crash due to DDoS attacks, but other than that, they work perfectly.

abdurrazaqa · Wed Oct 28, 2020 3:30 pm

My skype id: abdulrazaq.a@hotmail.com

Maggiore81 · Tue Nov 03, 2020 11:57 am

I wrote to you via Skype but you didn't answer me.
Try these settings and tell me if it reboots.

abdurrazaqa · Tue Nov 03, 2020 2:01 pm

Sorry, I haven't got any message on my Skype.
Maybe you can share your ID and I will send the request.

faraya · Tue Nov 10, 2020 7:50 pm

Hello, good afternoon. I have a CCR1072 router that has been working for a couple of months; in the last few days it has started to restart every 3 or 4 days. The router is used as a BGP edge router and CPU usage is between 7 and 10%. Does anyone know how to fix it, or is the equipment bad?

"router was rebooted without proper shutdown by watchdog timer"

Kind regards from Chile.

antoxic · Sat Nov 14, 2020 8:14 pm

Try disabling connection tracking.

Gerlach76 · Sun Mar 07, 2021 8:42 pm

does it help to disable connection tracking?

antoxic · Sun Mar 07, 2021 8:57 pm

It did help us.

I've completely disabled connection tracking and the stateful firewall. I've been running 6.45.9 for 152d on the CCR1072.

idst · Tue May 04, 2021 4:57 pm

For me, the solution was to disable:

IP > Cloud > Update Time

and enable:

System > SNTP Client

Since then, I went from 1 or 2 watchdog reboots a day to 6+ days and counting.

Before that, I also tried disabling conntrack for some IPs and lowering TCP Established Timeout from 1 day to 30 min, but neither made any difference to the reboots.

UPDATE after 1 month:

The problem is back: random reboots. It seems to be related to the "TCP Established Timeout" of conntrack.
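
For reference, the changes mentioned in this post would look roughly like this on a RouterOS v6 command line. This is a sketch only: 192.0.2.1 is a placeholder NTP server address, not a real one, and the NTP client syntax is the v6 form.

```routeros
# Stop IP Cloud from updating the system clock
/ip cloud set update-time=no

# Use the built-in SNTP client instead (192.0.2.1 is a placeholder server address)
/system ntp client set enabled=yes primary-ntp=192.0.2.1

# Shorten the established-TCP timeout from 1 day to 30 minutes
/ip firewall connection tracking set tcp-established-timeout=30m
```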

Mon May 10, 2021 10:33 pm

Re unwanted WatchDog reboots.

I do not use the ROS WatchDog settings/function.
Instead, I use the NetWatch feature to trigger a script if/when a NetWatch ping fails.
The script then enters a count-down loop which is something like this:

#1 - Log date/time & message & Loop-Count variable to logs
#2 - If ping successful - then exit ( quit the script )
#3 - Add a 1 to a Loop-Count variable
#4 - Sleep 15 seconds
#5 - If Loop-Count variable = 30 ( 7.5 minutes ) then reboot this Mikrotik
#6 - If Loop-Count variable = 20 , then do a site-survey and save scan results to file
#7 - loop ( go to ) #1

I use a NetWatch WatchDog Reboot script similar to this on over 1-thousand Mikrotiks connected to my networks.
I test-ping to a special IP address in my server room ( 192.0.2.254 ).

Note - Sometimes a remote client site-survey ( scan ) at #6 is all that is necessary to get a remote client to connect/re-connect if the client did not receive a DHCP IP address.
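
The loop above could be sketched in RouterOS v6 scripting roughly as follows. This is not the poster's actual script: the script name netwatch-reboot and the variable names are made up, the site survey at step #6 is left as a comment because it only applies to wireless clients, and the interval and timeout on the NetWatch entry are assumed values.

```routeros
# Body of a script stored as "netwatch-reboot" under /system script (name is hypothetical)
:local target 192.0.2.254
:local loopCount 0
:local alive false

:while ((!$alive) && ($loopCount < 30)) do={
    # 1: log progress
    :log warning ("netwatch-reboot: checking " . $target . ", loop " . $loopCount)
    # 2: stop looping as soon as a ping succeeds
    :if ([/ping $target count=1] > 0) do={
        :set alive true
    } else={
        # 3 and 4: count the failure and wait 15 seconds
        :set loopCount ($loopCount + 1)
        :delay 15s
        # 6: at loop 20 a wireless client would run its site survey here (omitted)
    }
}

# 5: after 30 failed loops (30 x 15 s = 7.5 minutes) reboot the router
:if (!$alive) do={
    :log error "netwatch-reboot: target unreachable for about 7.5 minutes, rebooting"
    /system reboot
}

# NetWatch entry that starts the script when the test address stops answering
# (interval and timeout are example values)
/tool netwatch add host=192.0.2.254 interval=1m timeout=1s \
    down-script="/system script run netwatch-reboot"
```

A flag variable is used instead of an early exit so the reboot decision sits in one place after the loop.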

Kurlec · Mon May 31, 2021 12:12 pm

We had the same problem on our BGP 1072. Just turn off connection tracking and you won't experience any more reboots. It's due to hardware limitations: not one MikroTik router can smoothly handle 2.5 Gbps of bidirectional traffic at once; connection tracking causes a spike in CPU load and then the router crashes.

If you don't use NAT or similar, just turn off connection tracking: ip > firewall > connection tracking > set enabled=no
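
As a command-line sketch (RouterOS v6 syntax), it may be worth listing what depends on connection tracking first, since NAT rules and filter rules that match on connection-state stop working once it is off:

```routeros
# Review rules that rely on connection tracking before turning it off
/ip firewall nat print
/ip firewall filter print

# Disable connection tracking
/ip firewall connection tracking set enabled=no

# Confirm the setting
/ip firewall connection tracking print
```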