RB4011 drops and snmp

Posted: Sat Apr 04, 2020 11:07 am
by meetriks2
Hi,

I have an RB4011 with only one 10 Gbit link routing/firewalling traffic (RouterOS 6.44.6, firmware 6.43.7).
It has worked fine for about 18 months. Today I noticed some counters that do not show up in the monitoring, or should not be there at all.
 [admin@x] /interface ethernet> print stats        
                      name:  ether1 ether2 ether3 ether4 ether5 ether6 ether7 ether8 ether9 ether10    sfp-sfpplus1
            driver-rx-byte:       0      0      0      0      0      0      0      0      0       0 453 418 729 475
          driver-rx-packet:       0      0      0      0      0      0      0      0      0       0     465 357 696
            driver-tx-byte:       0      0      0      0      0      0      0      0      0       0 454 750 372 102
          driver-tx-packet:       0      0      0      0      0      0      0      0      0       0     466 309 144
                  rx-bytes:       0      0      0      0      0      0      0      0      0       0 445 042 439 047
                 rx-packet:                                                                             465 357 696
              rx-too-short:                                                                                       0
                     rx-64:                                                                                 528 421
                 rx-65-127:                                                                             145 182 164
                rx-128-255:                                                                              14 143 095
                rx-256-511:                                                                               7 054 258
               rx-512-1023:                                                                               9 783 903
              rx-1024-1518:                                                                             107 376 831
               rx-1519-max:                                                                             181 294 219
               rx-too-long:                                                                                       0
                rx-unicast:       0      0      0      0      0      0      0      0      0       0
              rx-broadcast:       0      0      0      0      0      0      0      0      0       0         286 518
                  rx-pause:       0      0      0      0      0      0      0      0      0       0               0
              rx-multicast:       0      0      0      0      0      0      0      0      0       0         633 426
              rx-fcs-error:       0      0      0      0      0      0      0      0      0       0               0
            rx-align-error:                                                                                       0
               rx-fragment:       0      0      0      0      0      0      0      0      0       0               0
             rx-unknown-op:       0      0      0      0      0      0      0      0      0       0
           rx-length-error:                                                                                       0
             rx-code-error:       0      0      0      0      0      0      0      0      0       0
                 rx-jabber:       0      0      0      0      0      0      0      0      0       0               0
                   rx-drop:       0      0      0      0      0      0      0      0      0       0           5 195
                  tx-bytes:       0      0      0      0      0      0      0      0      0       0 446 488 611 935
                 tx-packet:                                                                             466 309 502
                tx-unicast:       0      0      0      0      0      0      0      0      0       0
              tx-broadcast:       0      0      0      0      0      0      0      0      0       0       1 445 279
                  tx-pause:       0      0      0      0      0      0      0      0      0       0             358
              tx-multicast:       0      0      0      0      0      0      0      0      0       0          12 411
              tx-collision:       0      0      0      0      0      0      0      0      0       0
    tx-excessive-collision:       0      0      0      0      0      0      0      0      0       0
     tx-multiple-collision:       0      0      0      0      0      0      0      0      0       0
       tx-single-collision:       0      0      0      0      0      0      0      0      0       0
               tx-deferred:       0      0      0      0      0      0      0      0      0       0
         tx-late-collision:       0      0      0      0      0      0      0      0      0       0
                   tx-drop:       0      0      0      0      0      0      0      0      0       0
                  tx-rx-64:       0      0      0      0      0      0      0      0      0       0
              tx-rx-65-127:       0      0      0      0      0      0      0      0      0       0
             tx-rx-128-255:       0      0      0      0      0      0      0      0      0       0
             tx-rx-256-511:       0      0      0      0      0      0      0      0      0       0
            tx-rx-512-1023:       0      0      0      0      0      0      0      0      0       0
           tx-rx-1024-1518:       0      0      0      0      0      0      0      0      0       0
I cleared the counters a couple of times. As you can see, there are some RX drops. But compare that with the output of this command:
[admin@x] /interface>> print stats-detail where name="sfp-sfpplus1" or type=vlan
Flags: D - dynamic, X - disabled, R - running, S - slave 
 0  R  name="sfp-sfpplus1" last-link-up-time=dec/18/2019 11:45:43 link-downs=0 rx-byte=57 095 702 786 201 tx-byte=57 036 216 809 471 rx-packet=55 890 915 225 
       tx-packet=55 887 109 204 rx-drop=0 tx-drop=0 tx-queue-drop=0 rx-error=0 tx-error=0 fp-rx-byte=57 095 702 786 201 fp-tx-byte=57 036 216 809 471 
       fp-rx-packet=55 890 915 225 fp-tx-packet=55 887 109 204 

 1  R  
       name="VL_200" last-link-up-time=dec/18/2019 11:45:43 link-downs=0 rx-byte=47 728 898 175 960 tx-byte=9 135 144 323 972 rx-packet=37 947 970 137 
       tx-packet=17 990 816 234 rx-drop=0 tx-drop=0 tx-queue-drop=0 rx-error=0 tx-error=0 fp-rx-byte=47 728 726 273 533 fp-tx-byte=0 fp-rx-packet=37 947 820 793 
       fp-tx-packet=0 

 2  R  
       name="VL_1021" last-link-up-time=dec/18/2019 11:45:43 link-downs=0 rx-byte=9 142 539 258 145 tx-byte=47 677 524 048 683 rx-packet=17 937 490 060 
       tx-packet=37 896 292 970 rx-drop=0 tx-drop=0 tx-queue-drop=0 rx-error=0 tx-error=0 fp-rx-byte=9 142 452 149 432 fp-tx-byte=0 fp-rx-packet=17 937 385 856 
       fp-tx-packet=0

My problem is: how do I monitor the drops? “print oid” only leads to values from the main interface.
There seems to be no SNMP OID for the Ethernet statistics. I did an snmpwalk and grepped for the number, but could not find any result.
Shouldn't the main/root interface show the same number of drops? That would solve the issue.

Second, the system sends “tx-pause” frames, which makes sense given the rx-drops, but flow control is disabled…
For clarity, I enabled it on the disabled link ether1, so that the setting shows up in the config.
[admin@x] /interface ethernet> export
/interface ethernet
set [ find default-name=ether1 ] disabled=yes rx-flow-control=on tx-flow-control=on
set [ find default-name=ether2 ] disabled=yes
set [ find default-name=ether3 ] disabled=yes
set [ find default-name=ether4 ] disabled=yes
set [ find default-name=ether5 ] disabled=yes
set [ find default-name=ether6 ] disabled=yes
set [ find default-name=ether7 ] disabled=yes
set [ find default-name=ether8 ] disabled=yes
set [ find default-name=ether9 ] disabled=yes
set [ find default-name=ether10 ] comment=MGT poe-out=off
set [ find default-name=sfp-sfpplus1 ] advertise=10000M-full auto-negotiation=no
Third, which is rather strange:
[admin@x] /interface ethernet switch> print stats
                name:        switch1 switch2
      driver-rx-byte:              0       0
    driver-rx-packet:              0       0
      driver-tx-byte:              0       0
    driver-tx-packet:              0       0
            rx-bytes:              0       0
           rx-packet:              5       0
        rx-too-short:              0       0
               rx-64:  2 130 907 700       0
           rx-65-127:              0       0
          rx-128-255:              0       0
          rx-256-511:              0       0
         rx-512-1023:              0       0
        rx-1024-1518:              0       0
         rx-1519-max:              0       0
         rx-too-long:            936       0
        rx-broadcast:            944       0
            rx-pause:     41 740 288       0
        rx-multicast:              0       0
        rx-fcs-error:              0       0
      rx-align-error:     41 740 288       0
         rx-fragment:            904       0
     rx-length-error:              0       0
           rx-jabber:  2 130 907 560       0
             rx-drop:              0       0
            tx-bytes:              0       0
           tx-packet:            920       0
        tx-broadcast:              0       0
            tx-pause:              0       0
        tx-multicast:              0       0
As I understand it, the RB4011 has the SFP+ port connected directly to the CPU and not to the switch chip.
I enabled ether1, and suddenly I have a lot more errors there.
[admin@x] /interface ethernet switch> print stats
                name:        switch1 switch2
      driver-rx-byte:              0       0
    driver-rx-packet:              0       0
      driver-tx-byte:              0       0
    driver-tx-packet:              0       0
            rx-bytes:              0       0
           rx-packet:              5       0
        rx-too-short:  1 107 053 440       0
               rx-64:  2 130 907 700       0
           rx-65-127:              0       0
          rx-128-255:              0       0
          rx-256-511:              0       0
         rx-512-1023:              0       0
        rx-1024-1518:              0       0
         rx-1519-max:              0       0
         rx-too-long:            936       0
        rx-broadcast:            944       0
            rx-pause:  1 167 253 504       0
        rx-multicast:  1 107 054 592       0
        rx-fcs-error:  1 103 788 244       0
      rx-align-error:  1 167 253 504       0
         rx-fragment:            904       0
     rx-length-error:              0       0
           rx-jabber:  2 130 907 560       0
             rx-drop:  1 103 787 704       0
            tx-bytes:              0       0
           tx-packet:            920       0
        tx-broadcast:  1 096 696 456       0
            tx-pause:  1 096 696 200       0
        tx-multicast:              0       0
I assume the counters are the 2.5 Gbit links to the switch chips…
But there is no traffic on the switches. Switch1 was never active, and switch2 is active but ether10 is never used (during the uptime)

I’m aware it’s running version 6.44.6; I will plan an upgrade of both the ROS version and the firmware.
Does anyone have any thoughts on how to monitor the drops?

Cheers,
Harry

Re: RB4011 drops and snmp

Posted: Tue Apr 14, 2020 1:18 pm
by meetriks2
Hi,

Does nobody have any clues? Am I the only one seeing this?

I have upgraded the ROS version to 6.45.8 and the firmware to 6.44.6.

Some more info on the port:
[admin@xxx] /interface ethernet> monitor 10       
                    name: sfp-sfpplus1
                  status: link-ok
        auto-negotiation: disabled
                    rate: 10Gbps
             full-duplex: yes
         tx-flow-control: no
         rx-flow-control: no
      sfp-module-present: yes
             sfp-rx-loss: no
            sfp-tx-fault: no
                sfp-type: SFP-or-SFP+
      sfp-connector-type: LC
    sfp-link-length-50um: 80m
    sfp-link-length-62um: 30m
         sfp-vendor-name: MikroTik
  sfp-vendor-part-number: S+A00005
     sfp-vendor-revision: A
       sfp-vendor-serial: MISSRJ22011
  sfp-manufacturing-date: 18-10-19
          sfp-wavelength: 850nm
         eeprom-checksum: good
                  eeprom: 0000: 03 04 07 10 00 00 00 00  00 00 00 06 67 00 00 00  ........ ....g...
                          0010: 08 03 00 0f 4d 69 6b 72  6f 54 69 6b 20 20 20 20  ....Mikr oTik    
                          0020: 20 20 20 20 00 00 90 65  53 2b 41 30 30 30 30 35      ...e S+A00005
                          0030: 20 20 20 20 20 20 20 20  41 20 20 20 03 52 00 6e           A   .R.n
                          0040: 00 1a 00 00 4d 49 53 53  52 4a 32 32 30 31 31 20  ....MISS RJ22011 
                          0050: 20 20 20 20 31 38 31 30  31 39 20 20 00 f0 03 ef      1810 19  ....
                          0060: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ........ ........
                          0070: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ........ ........
                          0080: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ........ ........
                          0090: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ........ ........
                          00a0: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ........ ........
                          00b0: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ........ ........
                          00c0: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ........ ........
                          00d0: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ........ ........
                          00e0: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ........ ........
                          00f0: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ........ ........
After the reboot I see errors:
[admin@xxxx] /interface ethernet switch> print stats
                name:        switch1 switch2
      driver-rx-byte:              0       0
    driver-rx-packet:              0       0
      driver-tx-byte:              0       0
    driver-tx-packet:              0       0
            rx-bytes:              0       0
           rx-packet:              0       0
        rx-too-short:              4       0
               rx-64:        608 256       0
           rx-65-127:          6 976       0
          rx-128-255:  2 130 911 744       0
          rx-256-511:              0       0
         rx-512-1023:              0       0
        rx-1024-1518:              0       0
         rx-1519-max:              0       0
         rx-too-long:            904       0
        rx-broadcast:  2 130 911 004       0
            rx-pause:              0       0
        rx-multicast:            310       0
        rx-fcs-error:  2 130 911 600       0
      rx-align-error:            936       0
         rx-fragment:            920       0
     rx-length-error:              0       0
           rx-jabber:              0       0
             rx-drop:              0       0
            tx-bytes:              0       0
           tx-packet:              0       0
        tx-broadcast:              0       0
            tx-pause:        608 256       0
        tx-multicast:              0       0
Still strange because there is nothing using switch1...

I will check for RX DROPS and TX PAUSE over the coming days on the SFP+ port.

Cheers,
Harry

Re: RB4011 drops and snmp

Posted: Fri Apr 17, 2020 7:49 am
by meetriks2
Hi All,

After some days:
No more TX pause frames, which makes sense because flow control is disabled, so this is solved.
Still RX drops. I did not expect the new version to handle the traffic any better, but the load is still not high.

The switch stats still look bad. I upgraded the FW to 6.48.8, so it is fully up to date.
Straight after a reboot:
ROUTER.png
I have no ports active on switch1, so I tried activating one port and rebooting again.
After the reboot all the numbers for switch1 stay at 0, as expected. So I guess I hit some kind of bug.

Cheers,
Harry

Re: RB4011 drops and snmp

Posted: Fri Apr 17, 2020 8:08 am
by meetriks2
Hi All,

I'm still stuck with my first question:
[admin@xxx] /interface ethernet> print stats
                      name:  ether1 ether2 ether3 ether4 ether5 ether6 ether7 ether8 ether9 ether10  sfp-sfpplus1
            driver-rx-byte:       0      0      0      0      0      0      0      0      0       0 5 400 604 780
          driver-rx-packet:       0      0      0      0      0      0      0      0      0       0     4 838 185
            driver-tx-byte:       0      0      0      0      0      0      0      0      0       0 5 415 970 941
          driver-tx-packet:       0      0      0      0      0      0      0      0      0       0     4 838 696
                  rx-bytes:       0      0      0      0      0      0      0      0      0       0 5 313 459 846
                 rx-packet:                                                                             4 838 139
              rx-too-short:                                                                                     0
                     rx-64:                                                                                 7 952
                 rx-65-127:                                                                             1 143 759
                rx-128-255:                                                                                88 896
                rx-256-511:                                                                                71 738
               rx-512-1023:                                                                                48 297
              rx-1024-1518:                                                                               332 732
               rx-1519-max:                                                                             3 146 736
               rx-too-long:                                                                                     0
                rx-unicast:       0      0      0      0      0      0      0      0      0       0
              rx-broadcast:       0      0      0      0      0      0      0      0      0       0         5 490
                  rx-pause:       0      0      0      0      0      0      0      0      0       0             0
              rx-multicast:       0      0      0      0      0      0      0      0      0       0        14 144
              rx-fcs-error:       0      0      0      0      0      0      0      0      0       0             0
            rx-align-error:                                                                                     0
               rx-fragment:       0      0      0      0      0      0      0      0      0       0             0
             rx-unknown-op:       0      0      0      0      0      0      0      0      0       0
           rx-length-error:                                                                                     0
             rx-code-error:       0      0      0      0      0      0      0      0      0       0
                 rx-jabber:       0      0      0      0      0      0      0      0      0       0             0
                   rx-drop:       0      0      0      0      0      0      0      0      0       0         1 971
                  tx-bytes:       0      0      0      0      0      0      0      0      0       0 5 329 613 492
                 tx-packet:                                                                             4 838 696
                tx-unicast:       0      0      0      0      0      0      0      0      0       0
              tx-broadcast:       0      0      0      0      0      0      0      0      0       0        23 007
                  tx-pause:       0      0      0      0      0      0      0      0      0       0             0
              tx-multicast:       0      0      0      0      0      0      0      0      0       0           296
              tx-collision:       0      0      0      0      0      0      0      0      0       0
    tx-excessive-collision:       0      0      0      0      0      0      0      0      0       0
     tx-multiple-collision:       0      0      0      0      0      0      0      0      0       0
       tx-single-collision:       0      0      0      0      0      0      0      0      0       0
               tx-deferred:       0      0      0      0      0      0      0      0      0       0
         tx-late-collision:       0      0      0      0      0      0      0      0      0       0
                   tx-drop:       0      0      0      0      0      0      0      0      0       0
                  tx-rx-64:       0      0      0      0      0      0      0      0      0       0
              tx-rx-65-127:       0      0      0      0      0      0      0      0      0       0
             tx-rx-128-255:       0      0      0      0      0      0      0      0      0       0
             tx-rx-256-511:       0      0      0      0      0      0      0      0      0       0
            tx-rx-512-1023:       0      0      0      0      0      0      0      0      0       0
           tx-rx-1024-1518:       0      0      0      0      0      0      0      0      0       0

My problem is: how do I monitor the drops? “print oid” only leads to values from the main interface.
There seems to be no SNMP OID for the Ethernet statistics. I did an snmpwalk and grepped for the number, but could not find any result.
Shouldn't the main/root interface show the same number of drops?

Does anybody have the same problem? Did you solve it? :-)

Cheers,
Harry

Re: RB4011 drops and snmp

Posted: Fri Apr 17, 2020 12:56 pm
by EdPa
You can monitor Ethernet statistics with MIKROTIK-MIB. For example, try running snmpwalk against the OID 1.3.6.1.4.1.14988; the "rx-drop" counters should be on the 1.3.6.1.4.1.14988.1.1.14.1.1.55.x OIDs. The rx-drop counter might increase when the interface is receiving more data than the resources of the device can process, e.g. when the total CPU or even a single CPU core is running near 100% at the given time.

The bogus switch stats seem to appear when all interfaces under this switch are disabled, and every time you enter the "print stats" command it returns some more faulty counters. We look forward to improving this in future RouterOS releases. Thanks for sharing the details.

Re: RB4011 drops and snmp

Posted: Sat Apr 18, 2020 9:24 am
by meetriks2
Hi EdPa,

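Indeed, your SNMP OID works. I didn't know that a plain snmpwalk would not reveal these counters; as the two commands below show, they only appear when the walk is started at the 1.3.6.1.4.1.14988 OID. Thanks, this solves my monitoring issue.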
[root@monitor2 ~]# snmpwalk -v2c -c public xxx | grep 273
[root@monitor2 ~]# snmpwalk -v2c -c public xxx 1.3.6.1.4.1.14988 | grep 273
SNMPv2-SMI::enterprises.14988.1.1.14.1.1.55.11 = Counter64: 27370
/interface print oid exists, but /interface ethernet print oid doesn't. Perhaps something for the future.
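
For anyone who wants to feed this into monitoring, here is a minimal sketch of how the walk output above could be turned into a drop counter per interface index. It is plain Python meant to run on the monitoring host, not on the router. The function names are made up for illustration, the later poll value is hypothetical, and it assumes snmpwalk prints lines in the same format as shown above.

```python
import re

# Matches rx-drop rows (column 55 of the Ethernet stats table, per the OID
# quoted in this thread). Works for both "enterprises.14988..." and fully
# numeric output, as long as the value is printed as "Counter64: N".
RX_DROP_RE = re.compile(r"14988\.1\.1\.14\.1\.1\.55\.(\d+) = Counter64: (\d+)")


def parse_rx_drops(walk_output):
    """Return {interface_index: rx_drop_counter} from snmpwalk text output."""
    drops = {}
    for line in walk_output.splitlines():
        match = RX_DROP_RE.search(line)
        if match:
            drops[int(match.group(1))] = int(match.group(2))
    return drops


def drops_since(previous, current):
    """Increase per index between two polls.

    If a counter went down (cleared counters or a reboot), the current
    value is used as the increase instead of a negative number.
    """
    deltas = {}
    for index, value in current.items():
        old = previous.get(index, value)
        deltas[index] = value - old if value >= old else value
    return deltas


sample = "SNMPv2-SMI::enterprises.14988.1.1.14.1.1.55.11 = Counter64: 27370"
first_poll = parse_rx_drops(sample)
second_poll = {11: 27400}  # hypothetical later poll, for illustration only

print(first_poll)
print(drops_since(first_poll, second_poll))
```

Alerting on the increase between polls rather than on the absolute value avoids false alarms after the counters are cleared or the router reboots.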

I will now focus on the drop part. I understand that running with connection tracking disabled is slower than using fast path.
I'm unclear about FastTrack.

Cheers,
Harry

Re: RB4011 drops and snmp

Posted: Fri Apr 24, 2020 4:07 am
by kical
Hi,
I'm having this issue too on many RB4011s I have deployed:
lots of RX drops on the interfaces, causing packet loss on the network.

At first I thought it was my Ethernet cables or the SFP links to the switches, but I replaced one of our tower routers with a CCR1009 and the packet loss is gone.
I only see the problem on the RB4011, on both the WiFi and the non-WiFi model. I think it is something in the software for the switch chip; the router's CPU load is about 15%.

If anyone on the MikroTik support team needs the supout.rif files of all the RB4011 routers we have, I can share them for investigation.

Re: RB4011 drops and snmp

Posted: Wed Apr 29, 2020 6:59 pm
by meetriks2
Hi,

I changed the configuration to fast path instead of connection tracking disabled.
This clearly lowers the number of rx drops over a period of time, by about a factor of 10 in my case.

Based on the graphs
graph.png
The highest 5-minute average traffic peak is 800 Mbit/s: 400 in and 400 out.
The 30-minute average over 1 week is 114 Mbit/s: 57 in and 57 out.

So the drops are happening on a router that is not very busy. I do not know how big the RX buffer is in the RB4011.
Model RB4011iGS+ (non wifi)
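
To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python (not something that runs on the router). It assumes the 10 Gbit rate of the SFP+ link as capacity and uses the rx-drop and rx-packet values from the "print stats" output posted on Apr 17, which was taken before the switch to fast path. Whether rx-packet already includes the dropped frames is not known, so the ratio is only approximate.

```python
LINK_CAPACITY_MBIT = 10_000  # single 10 Gbit SFP+ link


def utilisation_percent(rate_mbit, capacity_mbit=LINK_CAPACITY_MBIT):
    """Share of the link capacity used in one direction, in percent."""
    return 100.0 * rate_mbit / capacity_mbit


def drop_ratio(rx_drop, rx_packet):
    """Dropped frames relative to received packets (approximate)."""
    return rx_drop / rx_packet


peak_in = utilisation_percent(400)   # 5-minute peak, inbound
avg_in = utilisation_percent(57)     # 1-week average, inbound
ratio = drop_ratio(1_971, 4_838_139)  # values from the Apr 17 print stats

print(f"peak inbound utilisation: {peak_in:.2f} %")
print(f"average inbound utilisation: {avg_in:.2f} %")
print(f"rx-drop ratio: {ratio * 100:.4f} % of received packets")
```

Even at the 5-minute peak the link is only at a few percent of its capacity in each direction, which suggests the drops come from short bursts or CPU limits rather than from sustained load.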

FastPath
How do you secure the router while running with fast path? I can't have any firewall rules on the router...
I have tons of log messages about denied WinBox connections.
I have set up the allowed-from restriction on the WinBox service.

The last few years have shown that the WinBox port should not be accessible from the outside world...
Using a different port doesn't really improve security.

So is it speed or security?

Cheers,
Harry