False. 95% of my routers don't have the NTP package installed, and they all crashed badly.
In reply to: I can confirm that some CCR units experienced a crash due to the introduction of the leap second.
Only those CCR units that use the client inside the NTP npk package were affected. It currently seems the issue was in the Linux kernel; the bug was fixed, but RouterOS did not have this kernel fix yet.
If the CCR uses the default SNTP client (i.e. NTP.npk is not installed), then nothing happened.
Little too late, don't you think? When is the next leap second? I won't have any MikroTik devices on my networks when it happens.
In reply to: We have found how to fix the issue in the kernel, fix is coming soon.
For this one, yes, but the next leap second will be added in around 2 years.
In reply to: Little too late, don't you think?
NTP, no SNTP. 6.29, build time May/27/2015 11:19:36.
In reply to: Could you please tell me if you had the NTP package on all the servers, or you used SNTP?
Unless a bug in the hardware driver of some NTP server triggers an unexpected leap second (like it happened to me on 1st April, http://forum.mikrotik.com/viewtopic.php?f=3&t=95455 ), or unless a malicious user wants to bring down an entire ISP network by hacking one public NTP server.
In reply to: next leap second will be added in around 2 years.
We have found how to fix the issue in the kernel, fix is coming soon.
That is very interesting, maybe those units used a different NTP server? I ask because neither the NTP package nor the kernel was changed in 6.18, or in any other v6 version.
In reply to: Hi, we have Mikrotiks everywhere + around 20 CCRs. NTP is configured on all devices in our network.
Conclusion is below:
1. RBs with the NTP client or SNTP client were not affected (versions from 6.13 to 6.28).
2. Only CCRs with a version after 6.20 and the NTP client running were affected.
For example, a CCR with 6.18 was not affected, even though it has NTP running.
That post is from April?
In reply to: Yes, they all crashed: http://forum.mikrotik.com/viewtopic.php?f=3&t=95455
When you scroll down, you will find more recent posts from today. The reason is the same. On 1st of April there was a leap second insertion on some Italian nameservers and today it was worldwide: http://forum.mikrotik.com/viewtopic.php ... 99#p488599
In reply to: That post is from April?
Can you please explain what you mean by "server that have proper Leap Second implementation"? That is, how does the "proper" implementation differ from "time adjustment on next synchronization"?
In reply to: The problem happens only if the following criteria are met:
1) 64bit RouterOS (only tile)
2) any RouterOS v6.x
3) installed and synchronized NTP client from NTP package (NOT the default SNTP client)
4) synchronization to server that have proper Leap Second implementation, not just time adjustment on next synchronization
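As a rough illustration, the four criteria above can be read as a single AND condition. The sketch below is hypothetical: the `Router` record, its field names, and the `"tile"` label are made up for the example, and criterion 4 is modelled as a plain boolean that you would have to determine yourself.

```python
from dataclasses import dataclass


@dataclass
class Router:
    # All fields are illustrative assumptions, not RouterOS API names.
    arch: str                       # "tile" stands for the 64bit CCR platform
    ros_version: str                # e.g. "6.29"
    ntp_package_synced: bool        # NTP package client installed and synchronized
    server_handles_leap_properly: bool  # criterion 4, determined by the operator


def is_affected(r: Router) -> bool:
    """Return True only when all four listed criteria hold at once."""
    return (
        r.arch == "tile"
        and r.ros_version.startswith("6.")
        and r.ntp_package_synced
        and r.server_handles_leap_properly
    )


ccr_ntp = Router("tile", "6.29", True, True)
ccr_sntp = Router("tile", "6.29", False, True)   # default SNTP client only
rb_ntp = Router("mipsbe", "6.28", True, True)    # non-tile board

print(is_affected(ccr_ntp), is_affected(ccr_sntp), is_affected(rb_ntp))
```

Dropping any one criterion makes the function return False, which matches the reports that CCRs on SNTP and non-CCR boards on NTP kept running.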
Status update!
4) synchronization to server that have proper Leap Second implementation, not just time adjustment on next synchronization
This: Can you please explain what you mean by "server that have proper Leap Second implementation"? That is, how does the "proper" implementation differ from "time adjustment on next synchronization"?
The NTP packet includes a leap second flag, which informs the user that a leap second is imminent. This, among other things, allows the user to distinguish between a bad measurement that should be ignored and a genuine leap second that should be followed.
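For anyone who wants to see that flag, here is a minimal sketch of reading it from a raw NTP packet. It assumes the standard header layout from RFC 5905, where the first byte holds the 2-bit Leap Indicator, the 3-bit version and the 3-bit mode. The function name and the sample bytes are invented for the example; it does not talk to a real server.

```python
LEAP_MEANINGS = {
    0: "no warning",
    1: "last minute of the day has 61 seconds (leap second will be inserted)",
    2: "last minute of the day has 59 seconds (leap second will be deleted)",
    3: "unknown (clock not synchronized)",
}


def parse_first_byte(packet: bytes) -> tuple[int, int, int]:
    """Return (leap_indicator, version, mode) from an NTP packet header."""
    if len(packet) < 48:
        raise ValueError("an NTP packet header is at least 48 bytes")
    first = packet[0]
    leap_indicator = first >> 6      # top 2 bits
    version = (first >> 3) & 0x7     # next 3 bits
    mode = first & 0x7               # low 3 bits
    return leap_indicator, version, mode


# Hand-built sample: 0x64 = 01 100 100 -> LI=1, version 4, mode 4 (server).
sample = bytes([0x64]) + bytes(47)
li, version, mode = parse_first_byte(sample)
print(li, version, mode, LEAP_MEANINGS[li])
```

A client that only steps its clock at the next synchronization never looks at these two bits, which is the difference being asked about.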
Only CCR was affected by this. RB1100 and all other devices worked fine.
In reply to: Does the MT1100AHx2 on release 5.26 have this proper Leap Second implementation?
Normis, what I meant was: does the NTP server in release 5.26 on the MT1100AHx2 have this proper SERVER implementation?
In reply to: Only CCR was affected by this. RB1100 and all other devices worked fine.
This won't protect you from unexpected (wrong) leap seconds. A few public NTP servers here in Italy were affected during March and applied the leap second on 1st of April. A deeper investigation traced the cause to a bug in a hardware clock driver...
In reply to: It would have been helpful to have a patch from MikroTik, but for most of the customer networks we manage, we began leap second planning a while ago and removed any equipment from an NTP server that was suspect until the leap second passed and then re-enabled it. That proved to be a very simple, yet effective mitigation technique to script even on some of the larger networks we work on (50,000+ network devices).
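The scheduling half of that mitigation can be sketched as a simple time-window check. This is only an illustration under assumptions: the function name and the 48 hour / 1 hour margins are arbitrary choices for the example, and the actual step of disabling or re-enabling NTP on a device is left out because it depends on the platform. The boundary used is the first instant after the 30 June 2015 leap second.

```python
from datetime import datetime, timedelta, timezone

# First UTC instant after the leap second at the end of 30 June 2015.
LEAP_BOUNDARY = datetime(2015, 7, 1, tzinfo=timezone.utc)


def ntp_should_be_disabled(
    now: datetime,
    boundary: datetime = LEAP_BOUNDARY,
    before: timedelta = timedelta(hours=48),  # assumed safety margin
    after: timedelta = timedelta(hours=1),    # assumed safety margin
) -> bool:
    """True while 'now' falls inside the quarantine window around the boundary."""
    return boundary - before <= now <= boundary + after


day_before = datetime(2015, 6, 30, 12, 0, tzinfo=timezone.utc)
well_after = datetime(2015, 7, 1, 2, 0, tzinfo=timezone.utc)
print(ntp_should_be_disabled(day_before), ntp_should_be_disabled(well_after))
```

As the reply points out, a window like this only helps with a leap second that is known in advance. It does nothing against a server that announces one unexpectedly.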
One CCR crashed just 10 minutes ago, so it might not be a one-time event.
In reply to: Leap Second was a one-time-only event. It has passed. You can use any release now.
We will make a fix today that will make sure you don't see this issue again in 2-3 years, when the next leap second happens.
While unrelated bugs can't be ruled out, this specific issue is tied to the processing of a leap second event from the NTP subsystem to the Linux kernel.
In reply to: So, using NTP on CCR is sure to be the largest contributing factor, but it's not 100% limited to that scope.
I can confirm CCRs with SNTP were OK, and CCRs with NTP crashed and became unresponsive.
In reply to: Could you please tell me if you had NTP package on all the servers, or you used SNTP?
Certainly getting the code patched is the ideal, but planning for a known network issue that will happen at a specific date and time and defending against daily attacks are two different animals.
In reply to: This won't protect you from unexpected (wrong) leap seconds.
This also won't protect you from a malicious hacker who could break into a public NTP server and crash the whole network.