stigger
just joined
Topic Author
Posts: 17
Joined: Wed Mar 02, 2016 1:45 pm

RB5009: slow switching under certain conditions

Thu Feb 01, 2024 4:39 am

I have an RB5009UG+S+ with RouterOS 7.13.3, with a literally default configuration:
/system/reset-configuration caps-mode=no keep-users=yes no-defaults=no skip-backup=yes

A Linux server is plugged into the SFP+ port via DAC, and a TP-Link access point is plugged into ether2 (I tried EAP610 and EAP245 models; it makes no difference).

The problem is that the Wi-Fi transfer speed from the server through that access point is limited to ~200 Mbit/sec over TCP. Transfer speeds over UDP give 700-800 Mbit/sec, as expected.

Here is what I've tried so far:
  • Replacing the RB5009UG+S+ with a CRS305-1G-4S+ fixes the problem: I see 700-800 Mbit/sec over TCP
  • Plugging a laptop directly via ethernet instead of the access point gives the full 1G speed
  • Limiting the link speed to the server to 1G "fixes" the problem: 700-800 Mbit/sec
  • If I plug the CRS305 between the RB5009 and the server, then setting either of the two 10G links to 1G results in proper speeds
  • With the 10G link, using "iperf3 -c server -b 300M -R" gives 300 Mbit/sec, whereas "iperf3 -c server -b 400M -R" gives only 330 Mbit/sec.
  • "/interface/bridge/port/set hw=no" on either of the two ports gives ~400 Mbit/sec. The CPU is as good as idle, so it is not the limiting factor in that configuration either.

At this point, I'm out of ideas, but I tend to blame the RB5009UG+S+ rather than the access points, despite the fact that I get a full gigabit when plugging the laptop directly over ethernet. Is there anything else I could try?

Here is a messy pcap, if anyone's interested.

 
arm920t
Frequent Visitor
Posts: 68
Joined: Sat Aug 03, 2019 8:02 am

Re: RB5009: slow switching under certain conditions

Thu Feb 01, 2024 6:41 am

This is a known issue with multi-rate (10G/2.5G/1G) switched networks, especially when traffic flows from high-speed ports to low-speed ports. A 10G port delivers data faster than the slower port can drain it, so the buffer is easily overrun, resulting in packet loss. This causes a lot of TCP retransmissions, and congestion avoidance then slows the transfer down. If you run iperf3 with multiple flows, you'll see a high "Retr" count, which indicates packet loss and retransmission. Enable flow control on both sides and check whether that fixes the problem. If not, it may be a bug in the network driver.
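
To put rough numbers on the buffer argument above, here is a minimal Python sketch of how much queue a switch needs to absorb a single burst when the ingress port is faster than the egress port. The 64 KiB burst size and the helper name are illustrative assumptions, not measurements from this setup, and real switch chips share buffer memory between ports in ways this ignores.

```python
def buffer_needed_bits(burst_bits, ingress_bps, egress_bps):
    """Queue depth (in bits) needed to absorb one burst without dropping.

    While the burst arrives at ingress_bps, the egress port drains at
    egress_bps, so only the difference accumulates in the buffer.
    """
    if ingress_bps <= egress_bps:
        return 0.0  # egress keeps up, nothing queues
    return burst_bits * (ingress_bps - egress_bps) / ingress_bps


if __name__ == "__main__":
    burst = 64 * 1024 * 8  # assumed: one 64 KiB burst from the sender
    for flows in (1, 4, 8):
        # Worst case: bursts from all flows arrive back to back at 10G.
        need = buffer_needed_bits(flows * burst, 10e9, 1e9)
        print(f"{flows} burst(s): about {need / 1e6:.2f} Mbit of buffer")
```

The point is only that the required buffer scales with burst size, not with the average rate, so a sender that is slow on average can still overrun a small buffer.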
 
stigger
just joined
Topic Author
Posts: 17
Joined: Wed Mar 02, 2016 1:45 pm

Re: RB5009: slow switching under certain conditions

Thu Feb 01, 2024 2:55 pm

Forgot to mention: yes, I did try enabling flow control, but that did not help. Besides, if switch buffer overflow is the explanation, there are two contradictions here (well, at least one):
  • It works with the CRS305. Theoretically that could be explained by a larger buffer, but it is still suspicious.
  • The problem still reproduces if the speed is limited at the application level: "iperf3 -c server -b 400M -R". If it is indeed buffer overflow, then it should not matter how the speed is limited: via the link set to 1G, or by the application itself, right?
Last edited by stigger on Thu Feb 01, 2024 4:52 pm, edited 1 time in total.
 
ToTheCLI
Frequent Visitor
Posts: 98
Joined: Mon Jan 04, 2016 3:54 am

Re: RB5009: slow switching under certain conditions

Thu Feb 01, 2024 3:22 pm

Similar issue here: viewtopic.php?t=199051
Open a bug report with support.
 
soooc
newbie
Posts: 28
Joined: Thu Mar 10, 2011 1:51 pm

Re: RB5009: slow switching under certain conditions

Tue Apr 02, 2024 10:51 pm

The trouble is in the hardware: the Marvell 88E6393X is a bad chip with a small 2 Mbit buffer. If you do the math, you need at least 8 Mbit. The CRS305-1G-4S+IN uses the 98DX3236 with a 24 Mbit (!!!) buffer.

MikroTik could try to emulate a larger buffer in software, but that is a bad approach.
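
For a sense of scale, here is a rough Python sketch comparing the two buffer sizes quoted in this post. The 2 Mbit and 24 Mbit figures are taken from the post as stated and are not verified against datasheets; treating "Mbit" as 10^6 bits, using full 1500-byte frames, and the 10G-in / 1G-out rates are assumptions for illustration.

```python
FRAME_BITS = 1500 * 8  # assumed: full-size Ethernet payload, overhead ignored


def frames_that_fit(buffer_bits, frame_bits=FRAME_BITS):
    """Whole frames that fit into a buffer of the given size."""
    return buffer_bits // frame_bits


def time_to_overflow_us(buffer_bits, ingress_bps, egress_bps):
    """Microseconds until an empty buffer fills under sustained overload."""
    if ingress_bps <= egress_bps:
        raise ValueError("buffer never fills: egress keeps up with ingress")
    return buffer_bits / (ingress_bps - egress_bps) * 1e6


if __name__ == "__main__":
    for name, bits in (("2 Mbit", 2_000_000), ("24 Mbit", 24_000_000)):
        frames = frames_that_fit(bits)
        micros = time_to_overflow_us(bits, 10e9, 1e9)
        print(f"{name}: {frames} frames, full after about {micros:.0f} us")
```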
 
R1CH
Forum Guru
Posts: 1108
Joined: Sun Oct 01, 2006 11:44 pm

Re: RB5009: slow switching under certain conditions

Wed Apr 03, 2024 7:05 pm

You can also try setting up a qdisc on the Linux server to limit each host to 1 Gbps, essentially rate limiting how fast the data comes into the RB5009, to work around the broken switch chip.
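
One possible way to set this up is an HTB qdisc with one class and one u32 filter per destination host. The Python sketch below only builds and prints the `tc` commands so they can be reviewed before being run as root. The interface name `enp1s0`, the host addresses, and the helper name are hypothetical, and HTB is just one choice of qdisc; shaping at rates this high may need tuning.

```python
import shlex


def per_host_htb_commands(iface, hosts, rate="1gbit"):
    """Build tc commands that cap egress traffic to each host at `rate`.

    Traffic to hosts not listed matches no filter and is left unshaped.
    """
    cmds = [["tc", "qdisc", "replace", "dev", iface,
             "root", "handle", "1:", "htb"]]
    for minor, host in enumerate(hosts, start=0x10):
        classid = f"1:{minor:x}"  # tc class ids are hexadecimal
        cmds.append(["tc", "class", "add", "dev", iface, "parent", "1:",
                     "classid", classid, "htb", "rate", rate, "ceil", rate])
        cmds.append(["tc", "filter", "add", "dev", iface, "parent", "1:",
                     "protocol", "ip", "prio", "1", "u32",
                     "match", "ip", "dst", f"{host}/32", "flowid", classid])
    return cmds


if __name__ == "__main__":
    # Hypothetical interface and client addresses; adjust to your network.
    for cmd in per_host_htb_commands("enp1s0",
                                     ["192.168.88.10", "192.168.88.11"]):
        print(shlex.join(cmd))
```

To undo the change, `tc qdisc del dev <iface> root` removes the shaper again.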