dcdorsey777
just joined
Topic Author
Posts: 3
Joined: Mon Jun 03, 2013 9:05 pm

High packet loss switching UDP traffic

Thu Sep 22, 2016 3:05 am

I’m seeing very high packet losses (>80%) on several different Mikrotik products (RB, CRS and CCR models, see list below) when sending UDP traffic from a 1GbE segment to a 100BaseT segment.

Background
I have a small WAN for a fire department, with a central administration building where my data center and public Internet connection are located. We have Metro Ethernet (MOE) circuits connecting each of 6 fire stations to admin. Five run at 50Mbps and one at 100Mbps. At admin I use a CCR1036 as my core router, with a CCR1009 to connect to my public Internet connection, and an RB3011 on my primary MOE connection to split out the two VLANs: one for my public Internet and the other for the subnet that my station routers are on. Each station also has a CCR1009 to connect it to the MOE.

Video training (both live and recorded) is getting to be very important here, so I’m working on tuning my network for better UDP-based video and audio service, primarily Webex.

Test setup
I started by setting up a test configuration in my lab using another RB3011. I configured it with a NAT and put a PC and an RB750G router behind it as my simulated private LAN. The public side is my production network. I set the RB3011’s downstream interface to 100mbps to more-or-less simulate my WAN throughput.

I started running iperf, sending UDP traffic from a PC on my “private” test network (100mbps) to a PC on the production network (GbE). All good. I could run bandwidths up to 80Mbps with no problem. Then I tried the other way, sending from the PC on my public (GbE) network to my “private” PC. In this direction, packet losses were 80% or more at 40Mbps. Cutting down to 20Mbps dropped the packet losses to about 25%. Changing the iperf buffer size (-w parameter) to 100 dropped packet losses down to 2-4%. To repeat, going the other way I had no problem. This was pretty weird.

So, basically, when going from a GbE network to a 100BaseT network, running UDP, packet losses were enormous. TCP traffic did not lose traffic.

I first suspected the NAT, of course. I also learned that some Windows PCs appear to drop UDP packets on the receiving end of an iperf test, somewhere in the networking stack. I used PCs directly connected to each other and tested until I found two that could run iperf both ways with no losses, at up to 80Mbps.

I went through a lot of trial and error to isolate what was causing the packet loss: iperf test parameters, NAT, routing, drivers on the PCs, etc.

Simplifying
One by one I took pieces out of the test scenario, until my configuration was the following:

PC 1 ------- (1Gbps)Switch(100Mbps) ------- PC 2

The “switch” was either a Routerboard with all ports slaved to one so that only the switch chip was engaged, or a GbE switch from another vendor (see below). The 1Gbps port was left to Autonegotiate with the Gb port on the PC while the 100mbps port (on the Mikrotiks) was tried both hard-coded to 100mbps, or autonegotiating to a 100mbps port on the PC.

New Procedure:

Step 1:
Ran iperf tests using UDP, at various speeds from 1Mbps to 80Mbps, from PC 2 → PC 1. Result: almost no packet loss at any bandwidth.

Step 2:
Ran iperf tests using UDP from PC 1 → PC 2. Result: packet losses of 1-2% for 1Mbps, 25% for 20Mbps, 60% for 40Mbps, up to 98% for 80Mbps. I could confirm that there were real bandwidth losses through the switch, because I looked at the interface traffic on the switch under test. The speed arriving at the 1Gb interface would be the selected iperf test speed, but the speed leaving the switch on the 100Mbps side would be 12-16Mbps, regardless of the iperf speed requested.
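
As a rough cross-check of the Step 2 numbers, the loss implied by a fixed egress rate can be computed directly. This is only a sketch: the 14 Mbps figure is an assumed midpoint of the 12-16 Mbps range reported above, not a measured value.

```python
def implied_loss(offered_mbps: float, delivered_mbps: float) -> float:
    """Fraction of offered traffic lost if only delivered_mbps leaves the switch."""
    if offered_mbps <= 0:
        raise ValueError("offered rate must be positive")
    # The switch cannot deliver more than was offered.
    delivered = min(delivered_mbps, offered_mbps)
    return 1.0 - delivered / offered_mbps


# Assumption for illustration: midpoint of the observed 12-16 Mbps egress rate.
EGRESS_MBPS = 14.0

for offered in (20, 40, 80):
    loss = implied_loss(offered, EGRESS_MBPS)
    print(f"{offered} Mbps offered -> {loss:.1%} implied loss")
```

With that assumption the implied loss is in the same range as the reported 25% and 60% at 20 and 40 Mbps. The reported 98% at 80 Mbps is higher than a 12-16 Mbps egress rate alone would imply, so the two observations may not have been taken in the same run.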

Hypothesis:
I suspected maybe this was just something that happens with UDP – that is, packets are hitting the switch at 1Gbps, then are being retransmitted at 100mbps. So it makes sense that many might get lost, even though the average throughput over time is less than 100mbps.

If this is something that a switch is not supposed to be able to handle, then *all* switches should have this behavior. Let’s see.
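
The hypothesis can be sketched as a toy queue model: frames arrive back-to-back at 1 Gbps in bursts, the egress port drains at 100 Mbps, and anything that does not fit in the port buffer is dropped. Everything here is an assumption for illustration (the burst size, the buffer depth in frames, the 1500-byte frame); it does not model any particular switch chip or the actual pacing of iperf.

```python
from collections import deque


def simulate_loss(avg_mbps, burst_pkts, buffer_pkts, frame_bytes=1500,
                  in_mbps=1000.0, out_mbps=100.0, total_pkts=10000):
    """Return the fraction of frames dropped by a finite egress buffer."""
    if avg_mbps > in_mbps:
        raise ValueError("average rate cannot exceed ingress line rate")
    bits = frame_bytes * 8
    t_in = bits / (in_mbps * 1e6)     # spacing of frames inside a burst
    t_out = bits / (out_mbps * 1e6)   # time to send one frame on egress
    # Bursts are spaced so that the long-term average equals avg_mbps.
    burst_period = burst_pkts * bits / (avg_mbps * 1e6)

    departures = deque()  # departure times of frames still held by the port
    dropped = sent = burst_index = 0
    while sent < total_pkts:
        start = burst_index * burst_period
        for i in range(burst_pkts):
            t = start + i * t_in
            # Frames that have finished transmitting free their buffer slot.
            while departures and departures[0] <= t:
                departures.popleft()
            if len(departures) >= buffer_pkts:
                dropped += 1
            else:
                last = departures[-1] if departures else t
                departures.append(max(last, t) + t_out)
            sent += 1
        burst_index += 1
    return dropped / sent


# Hypothetical buffer depths: a shallow and a deep egress buffer.
for buf in (16, 64):
    loss = simulate_loss(avg_mbps=40, burst_pkts=64, buffer_pkts=buf)
    print(f"buffer={buf} frames: loss={loss:.0%}")
```

In this model, evenly paced traffic at 80 Mbps passes without loss, while the same average rate delivered in line-rate bursts overflows a shallow buffer. If that is what is happening, it would be one possible reason a switch with deeper port buffers shows no loss in the same test.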

Step 3:
So, I tried some other GbE switches, including several Mikrotik Routerboard boxes, a Linksys and a Netgear. All the Routerboards had the problem, except for a simple SwOS switch, RB250GS. The Linksys and the Netgear switches (nothing high-end, just unmanaged Gb switches) had zero packet loss switching from 1GbE to 100mbps, either direction, even at 80Mbps speeds.

Step 4:
Ask me if you want more details, but to summarize, I tried adjusting all of the following on the Ethernet interfaces in use: MTU size, L2 MTU size, interface queue type, flow control, hard-coded vs autonegotiated connection speed and duplex. I tried applying simple queues to the interfaces.

Step 5:
I ran bandwidth tests using the tester built into RouterOS, setting up a separate router on each end of the connections above, in place of the PCs. Interestingly, the RouterOS bandwidth test didn’t have any problems.

Question:
So, is there a bug in the Mikrotik switching fabric that keeps them from being able to switch UDP traffic (iperf, at least) from a GbE segment to a 100BaseT segment? It seems weird that a couple of low-end devices from other manufacturers do it just fine. Also, could this be related to some of the problems that have been reported elsewhere, having to do with linking segments of different speeds?

http://forum.mikrotik.com/viewtopic.php?t=81936

Please help:
I’m sure I’ve made some mistakes in my technique here, but I’m pretty sure this is real.

If this question attracts interest, I’ll be happy to post actual configuration scripts and test output. But this is long enough already. And maybe there’s something obvious that I missed.

More detail:
Below are the Mikrotik models, RouterOS releases and Routerboard firmware I used, and the models of Linksys and Netgear switches. Sorry I wasn’t able to test with other models of switches – these were all I had on hand.

Switch / RouterOS / Firmware / Result (Fails = drops UDP packets)
Mikrotik CRS125-24G-1S-RM (switch chip: QCA8513L) / 6.36.3 / 3.24 / Fails
Netgear ProSafe 5-port Gigabit Switch GS105 / / / OK
Mikrotik RB3011-UiAS-RM (QCA-8337) / 6.36.3 / 3.27 / Fails
Mikrotik Routerboard 250GS (switch chip??) / (SwOS) 1.17 / / OK
Mikrotik RB750G (Atheros 8316) / 6.63.3 / 2.39 / Fails
Mikrotik RB951G-2HnD (Atheros 8327) / 6.26.2 / 3.24 / Fails
Linksys SR2024C / / / OK
 
skuykend
Member Candidate
Posts: 274
Joined: Tue Oct 06, 2015 7:28 am

Re: High packet loss switching UDP traffic

Thu Oct 13, 2016 12:30 am

UDP is a connectionless, unreliable protocol. It's supposed to work that way. There is no flow control mechanism, so other than buffering a very few packets for a few milliseconds, all the switch can do is drop them. Going from 1Gbps to 100Mbps, you're going to drop around 90%.
 
sup5
Member
Posts: 359
Joined: Sat Jul 10, 2010 12:37 am

Re: High packet loss switching UDP traffic

Thu Oct 13, 2016 12:40 am

dcdorsey777,

1) how did you interconnect the 1000M and the 100M port?
a) via bridge ports
b) via the switch-chip using the master-port setting

2) did you try to toy around with various interface queue types and buffer depths?
You might change from "hardware-only-queue" to pfifo or something else and increase the buffer.
 
skuykend
Member Candidate
Posts: 274
Joined: Tue Oct 06, 2015 7:28 am

Re: High packet loss switching UDP traffic

Thu Oct 13, 2016 3:43 am

Looks like some routers and switches may use a pause frame, which was created for another issue and can have some negative side effects as well. This could explain why some routers/switches don't seem to have as many dropped packets.

https://en.wikipedia.org/wiki/Ethernet_flow_control

Try turning on flow control in the Ethernet menu (off on mine by default for TX and RX) and see if that helps. It may cause other issues, since a pause frame holds back all traffic on the link, not just the traffic that needs to be slowed down.
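
For a sense of scale, IEEE 802.3x pause frames carry a 16-bit pause time measured in quanta of 512 bit times on the link being paused. A small sketch of that arithmetic follows; the function name is just for illustration, and whether any of the switches in this thread actually send pause frames is not established here.

```python
PAUSE_QUANTUM_BITS = 512   # one pause quantum = 512 bit times
MAX_QUANTA = 0xFFFF        # pause time is a 16-bit field


def pause_seconds(quanta: int, link_mbps: float) -> float:
    """Duration a sender is asked to stay silent for a given pause time."""
    if not 0 <= quanta <= MAX_QUANTA:
        raise ValueError("pause time must fit in 16 bits")
    return quanta * PAUSE_QUANTUM_BITS / (link_mbps * 1e6)


# Longest single pause on a 1 Gbps link versus a 100 Mbps link.
for mbps in (1000, 100):
    ms = pause_seconds(MAX_QUANTA, mbps) * 1e3
    print(f"{mbps} Mbps link: max pause {ms:.2f} ms")
```

On the 1 Gbps side a single pause frame can silence the sender for at most about 33.6 ms, so a congested port would have to keep sending pause frames for as long as the burst lasts.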