Hi all,
I'm hoping someone else has experienced this and can help. We have a rather large network and regularly use RoMON to access it. Our basic network architecture is about 4 levels deep: core network, broad wireless network, client CPEs, and client switches.
On Friday 24/01 at around 11:20 SAST we had a simultaneous network drop lasting about 20-40 seconds. Our monitoring system showed that about 80% of our core network, 60% of our wireless network, and about half of our clients disconnected. Some of our devices reconnected in a "hung" state and needed to be power cycled to restore functionality. The issue was experienced on a few CCR1036s, a bunch of LHG radios, some RB2011s and a whole bunch of CRS326s. Firmware on these devices varies from 6.42.x to 6.44.x (either long-term or stable versions, no beta or development versions).
The only thing we can see that all these devices had in common is that RoMON was enabled on ALL affected devices. The devices on our network that did not show the drop did not have RoMON enabled.
Has anyone else experienced something like this, or can anyone shed light on how or where to start looking for the cause of this issue?
Logs only report BGP failures or port disconnects. Routers that were in the hung state simply report a power failure. Services like watchdog also did not run to automatically reboot the routers.
Please help?