This is one of my use cases where queuing is really, really important. Can you give a short example for, say, a link of 5M down and 800k up (or whatever you want to use)?
When shaping dsl especially, it's very important to get the link type "framing" right.
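As a minimal sketch in tc syntax for that 5M/800k case (the interface names and the pppoe-ptm framing keyword are assumptions here - match them to your own device and encapsulation, and leave a little headroom below the sync rate):

# egress: shape to ~95% of the 800k up-sync, accounting for per-packet DSL framing
tc qdisc replace dev eth0 root cake bandwidth 760kbit nat pppoe-ptm
# ingress (via an ifb redirect): ~95% of the 5M down-sync
tc qdisc replace dev ifb0 root cake bandwidth 4750kbit nat pppoe-ptm ingress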
Any tips for LTE connections? Especially ones that go from ~5Mbps to 70Mbps in a few hours? The auto ingress doesn't always act as I'd expect it to, and I'm not sure if it's RouterOS's implementation, or a bug, or me not understanding things.
Don't use them? We get the "how can an end user make LTE generally usable and consistently low latency" question a lot. It's often worse than wifi. We've (bufferbloat.net) been after that entire industry for years now to do better queue management everywhere - the handsets are horrifically overbuffered, the eNodeBs as well, the backhaul is both encrypted and underprovisioned...
Various moral equivalents to FQ have long been in play in the rest of the "fixed wireless" market (which consists of a lot of telco folk talking to themselves, rather than recognising that non-5G tech - like most of MikroTik's market - dominates in the field).
As for whether or not you can run an LTE interface at line rate wisely: most of the linux drivers for those have been terribly overbuffered, so the backpressure you get arrives very late. I hope that something like AQL or BQL lands for LTE interfaces, and there's some promising work towards actively sensing LTE bandwidth going on over in the openwrt universe: https://forum.openwrt.org/t/cake-w-adap ... 108848/482
No crashing, I have run the CCR1009 very heavy for several days without issue! Full transparency, tonight I am on the RB5009; it just showed up yesterday so I have been toying with it. So, I will be using it for my testing tonight. I can always swap around if you would like. Either way, they are both running 7.1 Stable.
Thx so much for testing. I have a low standard right now... "does it crash?" So far, so good.
Your first result, sans cake, was really quite good, and indicates your AT&T link has only about 20ms of buffering in it, or so. Believe it or not, that's actually "underbuffered" by prior standards, and makes it harder for a single flow to sustain full rate. But: a little underbuffering is totally fine by me, and I don't care all that much if a single flow is unable to achieve full rate, I'd rather have low latency.
It's easier to determine the buffer depth via a single upload test like this:
flent -x --step-size=.05 --socket-stats -t the_options_you_are_testing --te=upload_streams=1 -H the_closest_server tcp_nup
Use the gui to print the "tcp_rtt" stats. If you use the -t option to name your different runs, you can also do comparison plots via "add other data files" in flent-gui.
there are servers in atlanta and in fremont, california, if either of those would be closer for you.
HAH, I was hoping to pique your interest. Science incoming!
OK, ok, I gave in. In order to do science, could you also try a tcp_nup with upload_streams=4? and =16?
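Concretely, something along these lines (the -t labels are just suggestions, and dallas.starlink.taht.net is the server used later in this thread):

flent -x --step-size=.05 --socket-stats -t cake_4up --te=upload_streams=4 -H dallas.starlink.taht.net tcp_nup
flent -x --step-size=.05 --socket-stats -t cake_16up --te=upload_streams=16 -H dallas.starlink.taht.net tcp_nup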
Test 1 *appears* to show an old issue raising its head - tcp global synchronization - the amount of queue is so short that all the flows synchronize and drop simultaneously, as per panel 3 of your first plot, but in order to do "science" here, simplifying the test to just uploads would help.
Secondly it appears that something on the path is treating the CS1 codepoint as higher priority than the CS0 codepoint, when CS1 is supposed to be "background".
I am not sure if the VDSL device does or not, to be honest. It is an ATT-branded box, model BGW210. I have it in passthrough mode, but as stated above it is still some black magic NAT that 'passes' the public IP to my router.
Does that VDSL device do hardware flow control? Or are you shaping via cake via htb? (I'm happy to hear the bandwidth=0 parameter seems to be working otherwise!) But the only way I can think of you getting results this good is if the vdsl modem is exerting flow control....
Anyway, your last result is a clear win over what you had before, methinks. I'd like a tcp_nup test of that config too, when you find the time.
I will need to do much more studying to find the answer to question 1. =) I assume it will at least partially have to do with the RTT and bandwidth as part of the equation.
OK.
0) Still mostly very happy it doesn't crash.
1) Your dsl device's buffer is sized in packets, not bytes. The reason we only saw a 20ms RTT before on the rrul test, vis-à-vis the tcp_nup test showing a much larger RTT, is that the acks from the return flows on the path filled up the queue also. I leave it as an exercise for the reader to calculate the packet buffer length on this device...
2) I figured I was either looking at a shaper above cake, or at dsl flow control. (I like hw flow control, btw; I was perpetually showing off an ancient dsl modem with a 4-packet buffer and hw flow control + fq-codel in the early days, as FQ + the time-based AQM vs a fifo worked beautifully with that and cost 99% less cpu to do it that way. Sadly most dsl modems moved to a switch and don't provide that backpressure anymore.) Not quite sure whether you just tested that without a shaper.
3) I do want to verify you are not using BBR on your client? The 5ms simultaneous drops are still a mite puzzling.
OK, no bandwidth shaping, and the following cake config -- and tcp_nup tests..
I do dream of hardware flow control, so: no shaper, bandwidth=0 for cake, as a tcp_nup test. But I expect to be unlucky. Anyway, your fiddling with the frame parameters without a cake shaper active should have done nothing (I think), so that run was puzzling...
cake nat besteffort the_right_dsl_option bandwidth XMbit is easiest to reason about. Do you have visibility into the sync rate of the modem? Anyway, get that number right, then next try tcp_ndown... Note you cannot measure tcp rtt from this direction via flent directly, so we resort to inference or packet captures.
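In tc syntax that config would look roughly like this (eth0, the 21Mbit figure, and the pppoe-ptm keyword are placeholders for your actual interface, sync rate, and framing):

tc qdisc replace dev eth0 root cake nat besteffort pppoe-ptm bandwidth 21Mbit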
At some point I might ask you to stick your *.flent.gz files somewhere. Pleased to have so vastly improved tcp rtt.
Roger that, I figured no hw control was the case.
You don't have hw flow control.
Nice to know (I guess) that BBR2 still struggles with itself. Try resetting that to cubic on the up, please, and shape to 19Mbit.
add ack-filter to the up
I'm running cubic on that server for the down.
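On the Linux client, switching back is a single sysctl (assuming BBR was enabled the same way in the first place):

sudo sysctl -w net.ipv4.tcp_congestion_control=cubic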
Your baseline rtt might drop in half without bonding, OR if you can disable interleaving (yes, your bandwidth would drop as well).
I am not sure what you mean by scrape the rate? Do you mean change the bandwidth limit in real time during a test, or possibly using it as part of a script to help automate testing using the API?
Is it possible to scrape that rate? cake supports dynamically changing its config *without* reloading the qdisc, but I doubt mikrotik can do that with their api (?): tc qdisc change dev whatever cake bandwidth the_new_bandwidth. You should be able to get really close to the actual uplink rate (22xxx kbps) with the right framing. Those little ping spikes are a bit puzzling (something out of band like ppp-oe?). I note some dhcp and some ppp messages now exist in some implementations that actually do send the link and/or shaped rate and framing.
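A hypothetical sketch of what scraping could look like - how you pull the sync rate out of the modem is entirely modem-specific, so get_modem_sync_rate is just a placeholder for whatever works (status page, SNMP, TR-064, ...):

WAN=eth0                           # placeholder interface name
SYNC_KBPS=$(get_modem_sync_rate)   # hypothetical helper: scrape the modem's current sync rate
tc qdisc change dev $WAN root cake bandwidth $((SYNC_KBPS * 95 / 100))kbit   # retune cake on the fly, keeping ~5% headroom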
Your download was really pretty. But anyway, I'd like to solidify the upload using cubic at 19mbit first, ack-filter on (I worry about that option), then I'd love to see sfq (unshaped and shaped) at the same rate with both bbrv2 and cubic. We are kinda getting down to attempting rigorous science here, so perhaps scripting, and some packet captures, are in order. On the other hand, if you can keep the tested options straight in the -t option, we can easily compare things later. I have a long-standing hypothesis that since SFQ was so popular in the wisp markets (ubnt uses it), and since I long ago proved it was too short to sustain fat tcp flows, it was acting as an AQM in this market as well, which is why the observed bufferbloat was only in the 80-100ms range - and as people started shaping to faster and faster rates and using 8+ multiflow speedtests, they didn't notice they were killing single flow tcp performance. ( https://www.bufferbloat.net/projects/bl ... _Must_Die/ ). The poor results I got then, however, predate the advent of the linux stack's pacing, and single flows have since actually been scaling higher than 12mbit against sfq's default 128 packet limit.
The reason why the rrul upload looks spotty is actually more related to sampling error, not an actual problem per se, and you are also zoomed way in. You can scale plots relative to each other as you wish, or combine them, via flent. I *like* to zoom in but try to stay cognizant of the scale, and there's a version of the plot that won't zoom on you, also.
Somewhat puzzled about the QoS stuff, but I'd rather get the bandwidth param right first. I note I'm not a huge fan of QoS in the first place due to all the differing interpretations, and there was also a bug in some version or another that wasn't reading the dscp field properly with some encapsulations. cake has a "wash" option if you are actually seeing mismarks on ingress, or are doing something special on egress that you don't want upstream to see. I do keep hoping we can "export" a standards compliant diffserv set in the hope that the ISP might respect it, and vice versa...
The rrul test is a *stress* test using greedy traffic and not indicative of the intent of QoS. Were it to be more representative, it would send voip-like isochronous traffic through the VI queue, videoconferencing 16ms frame-like traffic through the video queue, and something torrent-like through the background queue. It semi-intentionally, and semi as a mistake, only exercises 3 of the 4 cake diffserv4 or wifi hw queues; rrulv2 does this more right, but I haven't finished the spec yet.
Demonstrating the sad results of sending greedy traffic through a qos system that *thinks* its traffic was going to obey the rules was also on my mind at the time. You still see a lot of strict priority queues out there where, if one user lucks into the right dscp marking, they can starve out everyone else. Cake's game theory here uses soft admission control so that that doesn't happen, and in general it shows the benefits of short queues and 5-tuple fair queuing over any form of qos, and furthermore does per-host fq, so the worst a user can do is do themselves in, not everybody else.
There are 110 other tests in the suite. I've got rather good at reading the rrul test over the years - it's the way to get a picture with the least amount of effort - then we do the tcp_nup and down tests. I might not have needed to suggest that had I not noticed that it looked like you were running BBR. The square wave tests are useful, as are the various _var versions which let you test different servers.
I have given up asking her how the internet is doing.. she is very binary. It either works or it doesn't. AHAHA!!!
I think she'll be happy with your efforts so far.
Agreed, I just ran it again.. and got similar results as before. Almost identical: the speed is wonky, however latency is still great.
Your 16up result seems kind of anomalous.
Ahh ok, gotcha! Well, that is the interesting thing - not sure if you noticed, but last go around I had set the rate to 22 to match the rate in the modem, and it appears to be good. I assume you say keep it at 19 to give myself some headroom in case that rate were to drop in the future?
By "scraping the rate" I meant rolling some sort of script to pull it off the modem's sync rate, but since your isp is shaping you instead, stick to the 19.
Good point, not to mention I thought about it afterwards.. even though the sync rate is 22, the ISP is obviously holding me at 20. So, as not to let them be the bottleneck, 19 makes sense in that case as well.
No, I didn't notice. 19 makes my head hurt less for now? In general dsl tends to fluctuate in rain, over the course of a day, etc, so leaving yourself headroom is a good idea.
Hmm, well after a fresh reboot.. the results are the same for the up16 and cake. This is odd. Per the Mikrotik 7.1 release changelog, it is running 5.6.3, and this router has a quad core 350-1400 (auto) MHz arm64 chip. Looking up the model of the CPU, it appears to be a Marvell ARMADA 7040 https://www.marvell.com/content/dam/mar ... 017-12.pdf
As for the up16 anomaly, try htb + fq_codel...
And at some point, when your gf is not looking, reboot and try cake again at up16? I return to my initial objective, not crashing. This is 5.6.x? cpu arch?
Just for the heck of it.. I added some more data here.. I added 8 and 32. It looks like even with 8 it starts to drop.. and gets worse with more, however 16 and 32 are roughly the same.
I'm kinda hoping this is a bug in flent!!!
I am pretty sure you have the overhead right at this point. I'm also happy to see it not crash.
In the interest of science, however, if at some point you could also repeat the 4up test with htb + fq_codel, that would be interesting. Also if you were to enable ecn for a fq_codel vs cake comparison on your client.
While we definitely get more throughput and less FQ latency from cake, with more bounded results from that side
bothersome2.png
cake's "Cobalt" AQM tcp RTT is oscillating far more than I would have expected. SFQ's overly short buffers are winning pretty good here.
bothersome.png
Very cool! Hopefully this week, I will be getting my brother's new Mikrotik router installed and testing wireguard between his house and mine.
I'm off researching kernel versions. NOT relevant to this was the wireguard patch that went into 5.7.
https://github.com/dtaht/sch_cake/issue ... -984503893
If you have a mikrotik account (I am not a mikrotik customer) and can file a bug, please do - I'm a bit concerned.
I wouldn't mind, however, returning to testing downloads. Your 16 flow download was perfect...
Yes sir, here ya go.. The plot thickens! =P
I'd wanted a tcp rtt plot for the 4up test also. You can recreate my cdf if you like, comparing sfq vs cake vs fq-codel.
Here ya go!
Also, 8, 16, 32 with SFQ?
Yeah, bugs are no fun! The possible packet reordering makes sense because of the interleaving.
I hate bugs. :/ Anyway, a packet capture of the 16 flow test would be good at this point.
tcpdump -i your-interface -s 128 -w 16flowscake.cap
We'd never tested bonding until today... and I could imagine us having a lot of packet reordering in a variety of ways.
Assuming this is a bug that isn't in flent - it's one of those darn things that didn't show up in testing because we didn't stress things hard enough. The weird thing is I keep seeing artifacts in the latest release of all this stuff in newer kernels, that don't match the kinds of results we were getting when we first mainlined this code. https://forum.openwrt.org/t/validating- ... /111123/10
is one example.
Can't even rule out a bug in your host's tcp. I have a research group that can take a look at this and try to reproduce it.
ANYWAY. At least it doesn't crash and you have consistently low latency, and probably rarely stress out a box this hard. thx so much for being all over this with me!
WOOHOO! =) Glad we are moving in the right direction now, and you are a mind reader, I sure did.. it was set to 2.
That's MUCH more correct looking, thank you!
Next, to see if ecn is working properly, you can run the exact same test series, but with:
sudo sysctl -w net.ipv4.tcp_ecn=1
I use ecn primarily as an AQM debugging tool (given how rarely it's turned on in the field) and for all I know (without seeing the capture) you had it on just now.
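(For reference, the stock Linux setting is the passive one - the "2" that comes up below:)

sudo sysctl -w net.ipv4.tcp_ecn=2   # default: negotiate ecn only when the far end requests it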
Ahh yes, filtering the other flows is a good idea! I have used wireshark at an elementary level.. first time using xplot, I must use it and learn it better! Well, both tools for that matter. That is pretty awesome being able to see the sack blocks. So that first trace was with ECN=2 which was out of the box on this install.. and this last run was with ECN=1. I agree with ya on the 32, I just figured I would throw it in there on these runs to see what happens. It is not crushing as bad as before though, so that is cool too.
Thank you for the packet capture. You can, btw, filter out all your other traffic by specifying "host dallas.starlink.taht.net".
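i.e., reusing the capture command from earlier with that filter tacked on:

tcpdump -i your-interface -s 128 -w 16flowscake.cap host dallas.starlink.taht.net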
This is the correct sort of carnage that cubic does, there's retransmits, dup acks, out of order stuff - strangely comforting after puzzling over that last capture all night!
correct_cubic_carnage.png
The xplot equivalent of this plot is prettier (IMHO), and in either case, if you zoom in, you can see the "sack" blocks in the bottom line, showing loss and recovery.
With ECN enabled you won't see sacks (except when there is actual loss); you'll see CEs and CWRs instead.
PS I'm not too concerned about the performance dropoff at 32 flows in that we are pounding the link flat and loss going up geometrically, however if it returns or gets worse... my concern was some sort of memory leak hurting the mikrotik box, well I had a lot of concerns! Thx so much for the exhaustive testing, again.
I have a decent understanding of TCP, and I understand what you are saying about the tail loss, especially from the video I found the other day with you using the people as packets. I just pulled up the RFC and the other link. Awesome! I have some homework tonight =)
To summarize a few things. Yesterday we ended up in a state where a bunch of flows weren't even going through the host at the right rate, so we weren't stress testing the qdisc, and thus not seeing any difference in latency between the three different qdiscs under test. It was seeing SFQ act the same as all the other ones as we added load that made me scratch my head - and you blowing your machine away entirely! thx. It felt pretty good to me, too. Anyway... I'm sitting here overfocused on making sure the mikrotik is working right, and whilst I am VERY interested in captures and BBRv1 and BBRv2 behavior, in the context of this thread I just want to make sure mikrotik has got these new qdiscs working exactly right. My long term goal is that fq-codel, in particular, goes on by default on all interfaces in this or some future mikrotik release...
"So you mentioned the cubic carnage, please excuse my ignorance, but I assume that is the best out of any we could use? I take it at least that it better than BBR? Or maybe better said.. 'better' in this particular environment. I know some tools are better than others depending on the use case."
I'm enjoying very much sharing my tcp knowledge with you whilst you test. i might end up giving some reading assignments though...
tcp is designed, in the end, to be able to reliably carry packets via any means or combination of circumstances possible, as per rfc2549, which is a good read.
So when I said "cubic carnage" I was mostly being alliterative. I've seen MUCH MUCH worse, and was actually expecting significant episodes of reordering from the bonded link, but didn't see any. Anyway, by eyeball that was the correct behavior of cake and cubic together.
As you pound more and more flows through a link (or you have a shorter and shorter buffer), we start hitting another phase of tcp (slow start and congestion avoidance are what I usually talk about, but I do allude to this in my apnic talk): we lose so many packets that we trigger tail loss and a 250ms RTO ("hello, are you still there?"), which is an even more extreme form of congestion control (it completely resets the tcp window also). This is probably the cause of the ever-longer tail above the 99th percentile of the cdf plot as you add more and more flows. Add 64 flows, 128 flows, and eventually flows won't even be able to get started...
This was pretty good: https://blog.apnic.net/2018/03/19/strik ... cillation/
WOW, thank you for all of that! I have many tabs in my browser to read now =) That is interesting that these other mechanisms work great for big fat single flows.. and not others. As if the devs never talked to anyone else to see what the end user might do other than just watch videos all day!
OK, it's back up. ECN negotiation is enabled (but the bits could be getting washed out on the path, OR I'd disabled it on the previous boot).
To go to your BBR vs cubic question. :lecture mode:
TCP reno was the "internet standard" for a long time. It had a "sawtooth", an initial window of 2, and couldn't scale past some X mbits.
Circa 2006-2008 a bunch of things happened - the linux txqueuelen went from 100 to 1000, TSO (up to 42 packets in a single offload) appeared, window "scaling" started to deploy,
linux switched to tcp cubic, and wifi added packet aggregation...
The first was just... dumb; the second was a desperate attempt to make tcp saturate a wire better against the weak cpus of the time (which it did); window scaling was there to make TCP scale to gbits and beyond; and cubic looked, and was, faster at grabbing bandwidth while seemingly doing no harm, because problems 1, 2, and 3 were not well understood yet, and wifi aggregation not at all.
To compound things further, Linux went to IW10 to make the web server folk happy in 2010 ( https://tools.ietf.org/id/draft-gettys- ... ul-00.html ) ... everyone added more buffering to the modems ... failed to understand what bittorrent's real problem was...
and then we started noticing that classic voip and videoconferencing apps like skype were not working anymore. Enter jim gettys, having his kids yell at him for transferring files to mit: https://gettys.wordpress.com/category/bufferbloat/ and me, in nicaragua, scratching my head as to why my internet radio, which had worked for years, had stopped working: http://the-edge.taht.net/post/Did_Buffe ... Net_Radio/
Anyway there's a lot of ranting between 2011 and 2021 I'll elide. BBR emerged from youtube's struggle to find a way to deliver data reliably whilst not overbuffering overmuch (this helps with fast forward and reverse), and it's a *perfect* transport for streaming a single tcp session of recorded video like netflix (except they thus far haven't made BBR work well on bsd). BBR is better in many respects than cubic, especially if it is FQed (where it mostly lives in its delay-based regime), but it has some unpleasant modes where it dukes it out with cubic (to win), has trouble competing with itself (ideally, where we use a sharded website today with 110+ different connections, we'd have *one* BBR connection back to the "mainframe"), and it doesn't respect ECN, or gentle packet loss, and has its own model of the network that is, by god, superior to yours!!
https://queue.acm.org/detail.cfm?id=3022184
Despite my cynicism, I don't like cubic either - reno, what was so wrong with reno, and IW2? I ask.
Anyway, BBRv2 is better than BBRv1, and I'm delighted to see new people trying the shiny stuff in circumstances where the designers didn't think about much, like on a 19Mbit fq-codeled link. And finding bugs. They like packet captures too.
Roger that, back to cake.
Really large string of wtf moments there. Can you return to cake? Or turn off ecn? Or both?
your mq - fq-codel might explain some other things, but not this.
HAH, 20 lashes! *banging head on wall*
Well, don't do that then. :O IP is big-endian....
but a good test of fq-codel with ecn disabled would comfort me, first. There should be differences in the overall distribution particularly in the 32 flows test... but throughput should stay flat, not that horrible thing that just happened....
AHAHAH roger that.. here we go! Umm.. well, the results are different.. o.O
OK, so if you could don a fire-retardant suit, re-enable ecn, and retry cake, and if that looks substantially similar, retry fq-codel?
Right on, I am about to pass out myself. I just uploaded the CAKE+ECN data in the post before this one.. look at it in the morning, so you can get some real sleep! =P HAH, I am just glad today is over with and all those weird bugs I introduced on my own are gone! HAHA
@kevinb361 I was up very late yesterday and will sleep soon. I can live with not knowing whether ecn works before I wake. Thx again for going to town on this and making such "interesting" mistakes. It's all data to me, and I think the bug you had on the xan-whatever-it-was kernel was rather interesting, as well as the damage seemingly caused by using that iptables rule.
From my interactions with engineers working for much bigger ISPs than the one I work for (where queuing in software is possible), wred is still the gold standard for most large providers. My understanding is that everything Cisco and Juniper is wred. It can handle huge bandwidth amounts due to offloading to the ASIC, but is almost certainly much worse than any of the newer AQM solutions. I believe those running Cisco and Juniper have no ability to even consider codel or fq_codel or cake on the service provider side.
Lastly, I do not know how much wred is deployed anymore. 5 tuple FQ - all by itself - seems to be gaining traction.
No rest for the wicked! HAHA
@kevinb361, the ecn result is very disturbing. But it could be mikrotik (a checksum failure, or parsing the wrong bits on this encapsulation - a bug which I can't remember when we fixed in some release of linux and cake), the modem, the path, or something at linode, where my server is. Anyway, fq-codel without ecn would be a good comparison to validate fq-codel is also implemented correctly, and a repeat of the SFQ test without that iptables rule would hopefully also be sane.
Sometimes I wonder if I am a masochist.. HAH. OK, last test and then I am gonna go count packets until I pass out! =P
OK, going to try and break this down in chunks as best as possible.
Since your eye is now "trained" for a fairly short rtt, try fremont.starlink.taht.net, or london, singapore, or sydney.starlink.taht.net.
we also have tests for these competing against each other, as in the usual case we are not sending flows to a single server.
SFQ will start to underperform at these longer rtts, and I don't honestly know which of cake or fq-codel will win. SFQ is doing really, really well so far, but i suspect it will go to hell on the rrul_be tests, even on the short rtt to dallas, and the way to test multiple sites is via the -H serverA -H serverB -H serverC -H serverD rtt_fair_var , which is also "interesting" on a fifo.
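A sketch of that multi-server invocation (server list per this thread; swap the -t label for whichever qdisc is under test):

flent -x --step-size=.05 -t cake_rtt_fair -H dallas.starlink.taht.net -H fremont.starlink.taht.net -H singapore.starlink.taht.net -H sydney.starlink.taht.net rtt_fair_var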
And with that, I really am calling it quits for the day. Very reassuring to see non-ecn work. Some backstory - ECN is not an enabled option for any but a few OSX things, or really advanced linux users, so making it misbehave, on this kernel release (and modem! can't rrul that out) coherently and consistently, rules out a ton of mild "background noise" I've had for about a year now.
This reminds me of many years ago when I first got into messing with this stuff. I would have set queues, I believe using RED? I don't remember anymore.. but anyhow, I would give ACK and DNS top priority with a guaranteed bucket size of whatever. I wish I had kept those configs so I could look back and see how I used to do it. Thank you for the knowledge, the pieces are starting to make more sense now. I have a ridiculous number of tabs open now, and am slowly going through them reading.
The FQ component of fq_codel, cake, and fq-pie has what we call the "sparse flow optimization". Request/response packets (DNS, syn, syn/ack), the first packet of any new flow, acks, voip and gaming packets usually "fly through" without observing any queuing at all. In this example we have 32 fat flows, and SFQ would have put the thin flow at the end of that queue (which is still a LOT better than FIFO, and I'd like to use one of those runs on future plots). So in this example, at this 19mbit rate and number of flows, we're consistently saving 3ms of latency and jitter.
consistently_ll.png
While that might seem like a small number, your typical web page might issue 100 dns queries, and 100 syns, and the queuing cost for those, vanishes. Some of that gets amortized by how web pages interleave requests, but not all of it, by far.
Also, because these qdiscs judge "sparseness" by bytes (DRR-like, rather than SFQ-like), not packets, and because the uplink acks are pretty small and sparse also, the queuing cost for much of a web page load time (usually the first 10 round trips per flow) also vanishes. We used to do a demo back in 2013 or so, showing a basic upload workload and how much better web pages behaved with fq-codel in place. (setting up a long saturating workload in flent -l 300 rrul_be - and then a web page benchmarker, demo'd to dan york of the internet society here:
https://circleid.com/posts/20130418_buf ... s_can_be/
To be clear, however, a great deal of the benefit in that particular demo was also in effectively applying AQM to shorten the queues, and not having that giant fifo. Enormous single-queued FIFOs must die, I thought then, and now, and the benefits of rfc8290 seemed so obvious that I figured we'd be done in a year.
Here is the next one.. be back in a bit to do fq_codel, gotta jump on a call for a few.
OK, finally at it.. been another busy day but so far have made some great progress with cake on my brother's cable modem! 400/40 is what its real speed appears to be. Also got wireguard set up between us so that I can SSH into a VM on his end to do my testing and rsync the data here to analyze. Will do more later now that the wife and kids are there screwing up my data with all their streaming.
To restore your eyeball to what the current "real world" looks like for everyone else, try that rtt_fair test with all this fancy schmancy stuff off, just the default fifo on the modem. Your situation is different from that 2013 demo in that you have a vastly shorter queue than the 250+ms queue of the cable modems of the time, and the linux tcp stack has also improved greatly (with packet pacing)....
another visual trick is putting those sites in your hosts file so you can just say -H sydney -H singapore etc on the command line instead of sydney.starlink.taht.net so it's more readable.
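A sketch of that hosts-file trick (the addresses are placeholders - look the real ones up first, e.g. with dig +short sydney.starlink.taht.net):

# /etc/hosts
203.0.113.10  sydney
203.0.113.11  singapore
203.0.113.12  fremont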
I should also note that the "starlink" subdomain is just the name of the linux 5.11 kernel cloud I'd created to test starlink stuff, and has nothing to do with starlink (with whom I have a non-relationship presently - amusing story of my encounter with them here: https://www.youtube.com/watch?v=c9gLo6Xrwgw starlink data here: https://docs.google.com/document/d/1puR ... QKblM/edit ). I hope they fix the dishy at some point, and their router...
I have an older cloud named "apple", and an even older one, named "comcast", and I keep them running primarily so I can verify changes in host device drivers and tcp stacks over time.
I ported fq_codel to the edgerouters over a weekend ( https://gettys.wordpress.com/2017/02/02 ... fferbloat/ ). Their userbase leapt all over it, wrote the backend configuration language, the gui, and a wizard, and then ubnt ultimately adopted it in their next version of the OS, calling it "smart queues", in reference to the "smart queue management" spec. (It's since been renamed to esq.)
I have since setup whatever ubiquiti's default simple queue is on their USG.. I believe it is fq_codel? He is amazed, and his facetime video is super clear. He is on a 25/5 cable modem so that made an enormous improvement for them. As he said, I can stream netflix and game at the same time now! hehe
My brother on the other hand, we just put his new router in yesterday.. I am still not totally sure what his bandwidth is provisioned at. We only had a few minutes to play with cake but I believe I got him in a decent ballpark to start. He is around 400/40. His speeds fluctuate greatly, even when setting bandwidth. Now granted, that was all testing with web tools.. so I asked him to spin me up a VM on his server over there so I can run flent..
Oh wow! That is awesome! I have an old edgerouter around here somewhere. I think the SD card or whatever in it is corrupt. I found where I can revive it.. but just haven't had the need to. I need to put that on my todo list. I remember seeing somewhere where someone was able to get OpenBSD + pf running on it. That would be pretty neat. I love pf.
One nice fq-codel thing is that you can run multiple netflix flows at the same time and have them hold at roughly the same rate and with consistent quality in competition with other traffic.
yes, I don't trust web tests very far. thx for adopting flent.
cake 100 on download, and NOT varying the upload; it was set at 19Mbit.
To verify - presently you have cake 100mbit on the download, and were varying the upload qdisc?
And when you tested "the bare modem" both were off?
Outstanding! It is obvious that fq_codel is working as designed! I can agree with you on the 'old fashioned' way of thinking. For example, I am a linux sysadmin by trade and I administer a few hundred servers spread across the US. I could see where this would be a relevant argument in the sense that OK, across town I could get 10ms RTT, but in Washington it could be say 200ms RTT... and I would not be happy if I am dropping ssh traffic to Washington because I am pushing a lot of data to a server across town.
The download component of your test looks a touch odd to me; I asked above what it was set to.
Also the --te=upload_streams parameter has no function on the rtt_fair tests, they generate one stream per -H server option.
Here's where fq-codel begins to pull ahead of SFQ in a couple of respects. Your baseline RTT is about 28ms to dallas, and over 250ms to the furthest server on the list. A design goal of TCP was to have it be ultimately (after running for a while) "fair" to flows of vastly different distances, so that you could transfer data from dallas to fremont, and from dallas to sydney, simultaneously, and be sure that you'd have at least some throughput at the longer RTTs. This goal was actually inherent in why IP took over from novell's IPX, because the IPX folk hadn't thought about this hard enough.
It is still "just a goal" that is not ever met, but tends to degrade fairly gracefully, as every TCP paper you read will try and express how they might converge more or less fairly, over time at different round trips.
Nowadays, more and more data is moving to the datacenter closest to you, and in the cable case, perhaps you'd be 12ms away from my server, and in the fiber case, 2ms. With a naive
design for TCP/ip the odds are good that that "local-ish" traffic would completely starve out longer distances, and indeed it can be quite unfair to more distant flows. 7 or 8x differences in throughput at 10x RTT differences are fairly common.
But! Sydney is quite possibly still a really needed destination for your traffic, so... what do you do? I'm pretty old fashioned in terms of my aims for low latency and equal throughput... and at every point, although we optimized for RTT relentlessly in the design of fq-codel, we also aimed to ensure that other flows could ultimately get at least "some" bandwidth - with codel, maybe better than 1/7x, we didn't know...
Now, with really short fifo queues, and with sfq's really short queues, tcp generally cannot get enough runway to send a BDP's worth of traffic to more distant coasts, so you see the short RTT getting 10mbits of uplink bandwidth here:
rtt_fair_var_-_sfq_dl.png
fq-codel, on the other hand, strives to give "enough" buffering for more distant sites to get a much more nearly fair share of the bandwidth.
rtt_fair_var_-_fqcodel_dl.png
The relentless drive to move CDN resources closer and closer to you is a good thing - shorter RTTs make for more responsive web traffic in particular, but my design goal for fq-codel was
to be able to connect equally to all people, near and far, and their services of all sorts, be it email, or chat, or web or voip, regardless of how distant they were.
And we didn't just get 1/7th at 10x the RTT - we knocked it out of the park, with nearly equal throughput no matter how near, or how far. (TCPs improved also.)
Luckily my brother's connection is the one with the Mikrotik RB5009, same router as I am currently running here. The USG is at the boys' house, which I don't have anything set up for yet to connect to remotely. Hopefully soon.
cake on the edgerouter: https://community.ui.com/questions/Cake ... c755cae8a2
cake on the udm pro: https://github.com/fabianishere/udm-kernel
The whole bufferbloat project is full of hackers desperate to have low latency bandwidth and willing to go to extraordinary lengths to get better queue management running. If routerOS had had a devkit available.... :/
Since your brother is up and running, could you try the upload string of fq_codel'd tests on, with ecn enabled? That would rule out parts of that path, and my server, at least.
I think the device he has is not capable of much more than 200Mbit of inbound shaping, but I could be wrong. The udm pro can do about 700. Also, usually I just reflash most ubnt gear to openwrt. The edgerouter X's are nice little boxes in particular, and they seem to have mostly abandoned edgeOS. VyOS is still alive and has long had smart queues in it. I have reflashed much mikrotik gear as well, but I actually rather like routerOS, and have merely been wishing for 6+ years that they'd get the 300 lines of code that fq_codel is into it, and on by default.
Ahh bofh! I love it! Heading over to The Register, haven't been over there in years, always good for a laugh!
Let me tackle the download portion of the test. :rant: *nobody*, for some reason, tests uploads, downloads, and ping simultaneously, as if people just sat there, did an upload, waited, then did a download, and then did a ping. It's a really bothersome aspect of almost all the web tests today. Real traffic, from multiple people and their devices in a household or business, is in both directions, all the time. Your network should degrade gracefully when there is traffic up, down, or both at the same time. While the rrul test series is patterned on bittorrent, which once upon a time ruled the world, we still didn't test networks for what torrent was really doing to them, in the light of some future world that had way more devices on it, more or less behaving as badly or worse than torrent did. :End of rant: See bofh for more...
Anyway, your provider's network represents a pretty good compromise of packet, not byte, limits on both sides. If you must have a FIFO, byte-limited fifos are better, because acks eat 1/15th the space data does, so in a packet-limited fifo a ton of acks in one direction or another crowds out the data packets. Bytes are a rough proxy for time: it takes the same amount of time to transmit 15 64-byte acks as one 1500-byte data packet. You had about, I don't remember now, 80ms worth of buffering for big packets on the down, and yes, I can do the math pretty accurately for the packet limit that actually represents from the rrul test results, so long as cake's ack-filter is off, but I'll try to leave that as an exercise for the reader. But anyway, on the down, this time, you have a ton of acks from the up clogging up that queue, and your download is now rate limited to 50Mbits by the upload. (If these packet limits were oversized, your upload would be limited by the download.)
noqueue_dl.png
SFQ is pretty similar here, but a bit more biased towards the shorter RTT.
sfq_dn.png
(I'm assuming above you used sfq or noqueue in the inbound shaper.)
Please note that both of these behaviors are actually a pretty good thing in either case, in that the user-perceptible *latency* is gone, because bytes=time and your download slowed down gracefully, and your up is underbuffered. So... win, right?
Or... you could have a network capable of running at 100Mbit down, 19Mbit up, all the time, with no latency, either:
fqcodel_dl.png
despite this being better, it appears to my eye that you were running out of queue on the down due to the synchronized drops - which could be hitting a limit at the provider or... is there a 1000 packet limit or memory limit? Cake scales this correctly for you on the down, or should. fq-codel we should have ripped out the packet limit long ago....
While this is a good result... 2x better than the default, without that sync'd drop, it too would have ultimately converged nearer to equal bandwidth for all. Cubic is still too aggressive, so it would take a while....
There is only one software queue on that VM, fq_codel.. but who knows what proxmox might be doing.. I know uplink from that server is 10gbit.
Turn off ecn on your brother's link?
I assume you have 2 or more hardware queues on the vm?
This router has 1GB of RAM.. I do not see how you can see memory usage either, only CPU usage. There are no memory limits for fq_codel, only for cake. It would be nice if Mikrotik would give access to all available options; as I saw you state at the beginning of this post, they do not have the option for gso-splitting either.
How much memory does this router have?
And if there's a way to, say, double the packet and memory limits on the fq_codel rtt_fair test on your home machine maybe those sync'd drops would go away. I didn't see those options in the gui... a lot of people patch down the 10000 packet limit and 32MB limit in fq_codel to something that seems saner (and is, on memory limited routers!), so I don't know what the default is for mikrotik.
How cake autoconfigures here in this scenario may also be wrong if that too shows the sync'd drops on that test. If the gui allows upping the memlimit for that, try 8M in the inbound shaper. (cake has no packet limit) Our reasoning for how we did the defaults for the memlimit option was kind of obtuse and based more on fear of running a router out of memory than getting it exactly correct for inbound.
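On stock Linux those knobs would look roughly like the following (device names are placeholders; whether RouterOS exposes any of this is the open question here):

# fq_codel: raise both the packet and memory limits on the inbound shaper
tc qdisc replace dev ifb0 root fq_codel limit 20480 memory_limit 64Mb
# cake: no packet limit, so just bump the memory limit
tc qdisc change dev ifb0 root cake bandwidth 100Mbit besteffort memlimit 8mb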
On outbound, a packet is allocated from an appropriately sized slab, so an ack is 64 bytes + 256 bytes overhead, a data packet rounds up to 2k.
On inbound, they are allocated from a fixed-size 2k-per-packet ring, no matter whether it's an ack or not, so you waste quite a lot of memory. We do gso-splitting, which will reallocate a gso packet of up to 42 packets all in a bunch back to the "right" size, but only if gro actually gets packets to split. Openwrt also had a hack that would start re-slabbing packets when it had memory pressure. So, on a heavy inbound ack workload we might end up using 7x more memory than ideal, or have that compensated for correctly by the cake autoconfig for the memlimit.
The ecn problem disturbs me more and more.
I've had a long day, going to bed. Very nice hacking with you these past few days.
Yep, no crashing for sure and honestly, for the use case, we are splitting hairs at this point. =) The seat of the pants feeling on my internet as well as my brother's is GREAT and SNAPPY!!
Well, that grouped bifurcation shouldn't be happening in that way. fq-codel suffers from the birthday problem, where you get a hash collision at around sqrt(1024) flows, so at 32 flows it's likely you'd see 2 flows colliding and getting different behavior from the rest. Cake uses an 8-way set associative hash, so you don't see that. I am going to go back to a theory that we are not seeing the right offsets into the packet header, thus the hash function is weird, the dscp handling is weird, and the ack-filter is wonky.
Among many other things that have changed since I last looked at this code, linux switched to a siphash from a jenkins hash, but I'm more inclined to suspect an offload, sending stuff from one cpu to another, or something we haven't thunk of yet.
Remember how we started? At least it doesn't crash. And even being OCD in this way, it performs better than what you had before. I have not had a deep dive into this stuff since,
oh, 2017, really. I'm very interested that it hits the field, working the right way, obviously! but I've had it for the day. Have a great one!
Thank you so much for sharing your raw flent.gz files and packet captures. So many things in this world cannot be captured by a single number or a summary plot, and while a cdf might hint at a problem, looking at a system's evolution over time is always helpful. The explanation for why we saw this bifurcation:
cdfequiv.png
was that there were two *really major* interruptions in service where only that flow kept going.
cdfscanbemisleading.png
Now, as to what the heck could have caused this, I don't know. I flipped through a couple others, it seems likely this doesn't happen all the time... The packet capture is really messy and I'm no longer sure which cap I'm looking at and I have meetings most of today.
Been following along and don't have too much to add other than it's been very interesting and informative to read this discussion!
Yeah, I noticed those interruptions or whatever is causing it as well. I am having a heck of a time with this link. I had my brother call spectrum today and verify that his modem is JUST a gateway.. else make sure it is in bridge mode. It is in fact just a gateway. It is a DOCSIS 3.1 modem with a 2.5gb port.. actually sync'd at 2.5 to the router. I forgot the router actually has a 2.5g port.
Anyhow, back on topic.. doing a test on it with no queue at all, it gets ~600 down and ~40 up. What really strikes me as odd is if I use fq_codel or cake, I can only get it to around ~350 tops. I can leave the downstream unlimited, and still never gets past 350. Really odd.
So, with that said, I had him plug his computer in straight to the modem to do a speed test. Granted, it is web based but upload is still ~40 and download is from 720-830 easily twice of what is going through the router.
CPU utilization never goes above 25% which now that I think about it, it is a quad core.. so that would mean it is stressing a single core. HMMM.. maybe that is the limiting factor?
Thank you for linking that! I was literally just thinking about going to the other computer to login to his router and force it to 1gb. It is interesting that he was using fasttrack, as I tried re-enabling that without any change. OK, getting out of the recliner now to go test that out!
Regarding your speed issue, I wonder if it's this issue that was mentioned by another user in a different topic? They have a 5009 and a 2.5Gb modem as well it looks like: viewtopic.php?t=179145#p895221
Anyway, as we stagger forward on the less-buggy fronts, a repeat of the rtt_fair test in this scenario would be nice, working on handling the down better until the sync'd drops go away. (It still might be having that overall weird interruption of service, too; need more data on that...) fq_codel with increased packet limits and memlimit is one thought, cake besteffort with a memlimit of 8M perhaps. Finding a way to increase the size of the rx ring is another. Or reducing the shaped bandwidth from 100Mbit down to something less....
OK, well.. after changing the port speed last night it wouldn't come back up. So I had to wait for my brother to reset it this morning. Long story short, it didn't work right away, I had to bounce the interface a few times, but finally it was showing ~800mbit raw through the router. After a lot of testing.. I am starting to wonder if fq_codel and cake are actually crashing at such high speeds and I just don't see it on my slower connection.
Watching the port bandwidth graph this morning while testing my brother's 1gb cable.. you can most definitely see the bursts!! I remember now why I hated my old cable modem and love my DSL now. Way less bandwidth, but so much 'cleaner'.
Without ack filtering it is extremely difficult to achieve full download speeds at a 15:1 ratio of down to up or worse.
Also, rx rings need to be properly sized, as docsis is bursty. An rx ring of 256 is too small. I don't know if you can change that.
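On stock Linux you'd check and (driver permitting) raise it with ethtool - whether RouterOS exposes this is another question:

ethtool -g eth0             # show current and maximum rx/tx ring sizes
ethtool -G eth0 rx 1024     # raise the rx ring, if the hardware allows it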
i wish more folk were taking packet captures of their network behaviors, using test tools like flent, or at least iperf, rather than web traffic. I also wish I still had my lab setup, and a budget to test this stuff. It's not so much debugging "cake" as suspecting there are other problems in the stack, on this model.
Alright, I gotta get back to work for a bit.. and then come back and figure out where we left off. More tests coming up in a bit!
OK, work is not bad! WOO! =)
Looks like perfection to me.
rtt_fair?
BWHAH gamma radiation! Ahh, I did not notice them sync'd. Interesting!
See how the drops are sync'd on the down? Shouldn't happen. Up the memlimit? Or it's the rx-ring. Or gamma radiation from Mars.
Limit the memory consumed by Cake to LIMIT bytes. By default, the limit is calculated based on the bandwidth and RTT settings.
Ahhh, so correct me if I am wrong.. I am a little slow on the uptake sometimes.. post-lunch-time sleepy.. haha. So, generically speaking, upping the memory limit is like increasing the ring buffer?
Moah! Moah! 8x more! You have the memory to burn. (When we developed cake, *32MB* of RAM in the router was a lot.)
I tried to explain the "default" calculation had some overheads in it that didn't make as much sense on inbound shaping as out. I can try to explain that better....
Yes sir, 200 on both. Well that is a bummer! I was getting excited! haha. But honestly.. the internet is so snappy now.. everything just instantly appears.. almost before I click the button on the mouse!
So this last one had 200MB on ingress? Dang. I gotta point at available queue space at the provider, or a limited rx ring (or that bug with bursty failures), to explain a failure to improve here.
OK, it does not appear from digging through the interface or through the documentation that I can change the ring buffer. Now, if I was running a CHR I could.. since it is RouterOS running on top of linux. This might be something for me to check out in the future. I have a spare SFP+ port on my server, I should be able to pass that through to a VM and run CHR.
I imagine routerOS has no way to see or increase the rx ring? Linux uses "ethtool" to see that.
Coming back to this post.. I never knew cubic drops 30%, heck I never looked into it.. I always assumed it was 50%. I guess when I learned about it at the time, I must have been using reno and reading about that?
The behavior of multiple queues in series is kind of complex. Theorists like very much to think about things in terms of a fountain of water, but the real world is batchy in so many respects.
Take packets hitting the rx ring. A batch arrives and the ring was nearly full in the first place. A whole bunch of packets (from all sources) get dropped. The cpu arrives to "clean" the rx ring, never sees that, and then tosses the result into the aqm which then tries to fair queue and intelligently drop if it too is overloaded, hopefully desynchronized drops that "fill in" the spaces within the other competing sawtooths. But they end up pretty synchronized when the rx ring overflows and thus the closest hop retains the most bandwidth, as tcp's defined response to multiple drops within a single RTT is to drop the rate, once. (We are now in TCP/IP 401 classes, rather than my usual 101)
The Cubic tcp algorithm only drops the rate by 30% ( which I've long disagreed with ) and then works towards recovering using a cubic function (which is clever), tcp reno uses 50% and climbs back additively, which means other flows can grab more bandwidth faster, but a reno flow gets less bandwidth. (I think you can tell flent to use another algo via --te=cc_algo=reno,reno or --te=CC=reno,reno but I'd have to re-read the codebase). BBR's methods are very different, as you saw. I don't think I have BBR enabled on all the servers under test, I'd have to check.
This scenario is even worse than that in that the ISP has a buffer at their end, the modem, also, and either one of those unable to absorb a burst will drop packets.
Over the last 20 years, the internet got redesigned for speedtest.net, with everyone testing X flows at a time, up, then, down, then ping, all to the same server.
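On the cc_algo aside a couple of paragraphs up: if you want to be certain which congestion control the sending host uses for a run, the plain-Linux sysctls below do it on the test client (nothing RouterOS-specific here; a sketch):
sysctl net.ipv4.tcp_available_congestion_control
sysctl -w net.ipv4.tcp_congestion_control=reno
Remember to switch back to cubic (or bbr) afterwards.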
Oh, definitely not celebrating yet! Just happy to start getting some fairly consistent results as I move knobs one way or the other! Yes, I need to slow down, I meant to say cake memlimit!don't celebrate too soon. And you mean cake memlimit or physical memory?
Is the ack-filter on on egress? Again, given my still held doubts on having the offsets right for dscp, ecn, and that, having it on may do bad things, but it's very useful on asymmetric connections if working. https://blog.cerowrt.org/post/ack_filtering/
Lastly you posted nice plots saying "default" when I think you meant the hw multi-queue? It was good to see it converge at t+40. Yes you really do want to spread more load across cores if possible.
Can do! Do you want me to use fq_codel or cake?I'd appreciate another capture from your brothers box, of rtt_fair, blowing up, with ecn enabled.
also it's easier to look at this stuff in tcptrace/xplot if you just capture those flows.
tcpdump -i the_interface -s 128 -w the_capture host dallas or host sydney or host ...
thx!
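A filled-in version of the tcpdump sketch above, purely as an example (the interface name and capture file name are placeholders; the hosts are the flent servers used later in this thread):
tcpdump -i eth0 -s 128 -w rtt_fair.pcap host dallas.starlink.taht.net or host sydney.starlink.taht.net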
No problem! Mexico would be nice this time of year! I have some friends that live along the border, I should go visit for a BBQ and a few cervezas!Thx again for helping. Trying to decide on fleeing to mexico or not. Ok, please reload the qdisc(s), leave ecn off, and try again?
Were you seeing these lumps before?
lumps.png
You got no throughput from dallas with ecn.
nodnfromdallas.png
AHHA!! I see what ya did there. It didn't grow this time. I set it to 15mbit on upload.. I had been assuming all along that since wide open it gets 40mbit upload on his link, I should be working around that range. My assumption is that I am used to the feel of my DSL setup.. and cable does things differently? No science behind that statement, but that looks way better than anything at 40mbit or even a high percentage of that. Now I have more testing to figure out where the happy place is on the upload bandwidth!Well, you shouldn't see that long term growth pattern either. This is after you tuned up the multipath tx/rx thing? What happens with bandwidth down less 20Mbit?
Anyway, thx again. I'm packing up for a trip south, (not to mexico! trying to get closer to the spacex launch), and can't look at this harder today.
When the network is more idle: a reboot, then runs with --step-size=0.05 -l 300, with ecn off, with cake, with fq_codel. But ya know, feel free to stop fixing the internet with me, and spend time with family, or go shopping?
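Spelled out, one of those runs would look something like this (rrul shown as an example, reusing the dallas server from the commands later in this thread; the title is just a placeholder):
flent rrul --step-size=0.05 -l 300 -H dallas.starlink.taht.net -t cake-ecn-off-300 -o cake-ecn-off-300.png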
Yep, they went down for most of the day yesterday.. it appears they restored a backup of the forum =(it looks like mikrotik has lost some data.
/queue export compact
# dec/29/2021 13:24:14 by RouterOS 7.1.1
# ...
# model = RBD52G-5HacD2HnD
/queue type
add kind=fq-codel name=fq_codel
/queue simple
add bucket-size=0.005/0.005 max-limit=100M/40M name=internal_qos queue=fq_codel/fq_codel target=ether1 total-queue=fq_codel
/queue type
add cake-atm=ptm cake-diffserv=besteffort cake-mpu=88 cake-overhead=40 kind=cake name=cake-default
add cake-ack-filter=filter cake-atm=ptm cake-bandwidth=22.0Mbps cake-diffserv=besteffort cake-mpu=88 cake-nat=yes cake-overhead=40 kind=cake name=cake-up
add cake-atm=ptm cake-bandwidth=104.0Mbps cake-diffserv=besteffort cake-mpu=88 cake-nat=yes cake-overhead=40 cake-wash=yes kind=cake name=cake-down
/queue simple
add bucket-size=0.001/0.001 name=cake queue=cake-down/cake-up target=ether1-WAN total-queue=cake-default
Thanks mate - not a slow reply at all! Mine is syncing 106/41 so I'll throw that in there for now, go for a few days, then see how it is.I have 100/20 VDSL2.. and this setup has been working like a dream! (The speeds are set to the sync rate in the modem 104/22.. YMMV)
/queue type
add cake-diffserv=besteffort cake-nat=yes kind=cake name=cake-default
add cake-ack-filter=filter cake-bandwidth=45.0Mbps cake-diffserv=besteffort cake-nat=yes kind=cake name=cake-up
add cake-bandwidth=950.0Mbps cake-diffserv=besteffort cake-nat=yes cake-wash=yes kind=cake name=cake-down
/queue simple
add bucket-size=0.001/0.001 name=cake queue=cake-down/cake-up target=ether1 total-queue=cake-default
Funny thing is you will get the same results with any queue type - try sfq for example instead of cake..RB5009 arrived. Here's some brief testing of cake.
ISP: Aussie Broadband
Technology: Fibre To The Premise (FTTP)
Down/Up: 1000M/50M
Waveform Results:
Before: https://www.waveform.com/tools/bufferbl ... 7575df8878
After: https://www.waveform.com/tools/bufferbl ... db297e528f
Nice! Now run some flent tests!
Thanks for forcing me to do this - perhaps I'm going back to the drawing board. Cake seems to make my upload nice and consistent, but download and latency is still all over the shop.
Nice! Now run some flent tests!
What is strange is that the resource monitor in the router would suggest it's perfectly fine doing this (40-60% util on all cores), but the numbers don't lie. You're correct I was going to test on a 100/40 link (my in-laws'), but my new RB5009 arrived at my own house and so I wanted to see what it could do on 1000/50 as well. The experience I gain from this exercise will give me the ability to set it up on the in-laws' later on.My guess is you are thoroughly out of CPU on the download, not being able to crack 400Mbit.
# Enable fasttrack-connection only on inbound = WAN to exclude download from SQM
/ip firewall filter
add action=fasttrack-connection chain=forward comment="defconf: fasttrack" connection-state=established,related hw-offload=yes in-interface-list=WAN
/queue type
add cake-diffserv=besteffort cake-nat=yes kind=cake name=cake-default
add cake-ack-filter=filter cake-bandwidth=40.0Mbps cake-diffserv=besteffort cake-nat=yes kind=cake name=cake-up
add cake-bandwidth=1000.0Mbps cake-diffserv=besteffort cake-nat=yes cake-wash=yes kind=cake name=cake-down
/queue simple
add bucket-size=0.001/0.001 name=cake queue=cake-down/cake-up target=ether1 total-queue=cake-default
tc qdisc add dev ether1 root cake 1000Mbps besteffort nat
tc qdisc add dev ether1 root cake 1000Mbps besteffort nat ingress
https://wiki.mikrotik.com/wiki/Manual:H ... _Algorithm
I have no idea what this does - what does the bucket-size thing do?
add bucket-size=0.001/0.001 name=cake queue=cake-down/cake-up target=ether1 total-queue=cake-default
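For what it's worth - and this is only my reading of that HTB manual page, so verify it - bucket-size appears to set the token-bucket (burst) size as a multiple of max-limit, so the tiny values used throughout this thread effectively disable bursting. Rough arithmetic under that assumption:
bucket-size=0.001 with max-limit=100M -> burst bucket of roughly 0.001 x 100M = 100k
bucket-size=0.1 (the default) with max-limit=100M -> burst bucket of roughly 10M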
I see no such issue, I can set my CAKE queue to cake-bandwidth=50M, set that queue type as download on simple queue and it works properly, limiting speed to 50M.In my experimentation, I can only ever observe the rate limiting occurring correctly when cake is set as an interface queue but there's no way to set an asymmetric bandwidth limit in that configuration
/queue type
add cake-diffserv=besteffort cake-flowmode=dual-dsthost cake-mpu=64 cake-nat=yes cake-overhead=22 cake-overhead-scheme=ether-vlan,via-ethernet,docsis cake-rtt-scheme=internet kind=cake name=cake-docsis@download,unlimited
add cake-bandwidth=50.0Mbps cake-diffserv=besteffort cake-flowmode=dual-dsthost cake-mpu=64 cake-nat=yes cake-overhead=22 cake-overhead-scheme=ether-vlan,via-ethernet,docsis cake-rtt-scheme=internet kind=cake name=cake-docsis@download,50M
add cake-ack-filter=filter cake-bandwidth=40.0Mbps cake-diffserv=besteffort cake-flowmode=dual-srchost cake-mpu=64 cake-nat=yes cake-overhead=22 cake-overhead-scheme=ether-vlan,via-ethernet,docsis cake-rtt-scheme=internet kind=cake name=cake-docsis@upload,40M
/queue simple
add bucket-size=0.1/0.2 dst=ether1_wan max-limit=40M/700M name=wan queue=cake-docsis@upload,40M/cake-docsis@50M,unlimited target="" total-queue=default
add bucket-size=0.005/0.005 name=priority packet-marks=icmp,dns,syn,http-init,sip parent=wan priority=1/1 target=""
add bucket-size=0.05/0.1 name=untracked packet-marks=no-mark parent=wan queue=cake-docsis@upload,40M/cake-docsis@download,50M target="" total-queue=default
/queue type
add cake-diffserv=besteffort cake-nat=yes kind=cake name=cake-default
add cake-ack-filter=aggressive cake-bandwidth=40.0Mbps cake-diffserv=besteffort cake-nat=yes kind=cake name=cake-up
add cake-bandwidth=900.0Mbps cake-diffserv=besteffort cake-nat=yes cake-wash=yes kind=cake name=cake-down
/queue simple
add bucket-size=0.001/0.001 name=cake queue=cake-down/cake-up target=ether1 total-queue=cake-default
/queue type
add cake-diffserv=besteffort cake-nat=yes kind=cake name=cake-default
add cake-ack-filter=aggressive cake-bandwidth=45.0Mbps cake-diffserv=besteffort cake-nat=yes kind=cake name=cake-up
add cake-bandwidth=945.0Mbps cake-diffserv=besteffort cake-nat=yes kind=cake name=cake-down
/queue simple
add bucket-size=0/0 name=cake queue=cake-down/cake-up target=ether1 total-queue=cake-default
I feel compelledWith the above result, yes, I think y'all have compelling reasons to run out and deploy fq_codel and cake everywhere you can, ASAP.
Heh, that's my thread too. Yes it's upsetting, but the Mikrotik Support team said they have replicated the problem based on my logs and look forward to a fix in a future version.I feel compelled
…but worth reminding that Simple Queues as used in some of the examples in this thread appear to break IPv6 under ROS 7.1.1 (ref viewtopic.php?t=181705)
I tried your setup (just copied the above config into my Chateau 5G terminal), IPv6 disabled, WAN interface LTE1, and changed the values to 30Mbps DL and 5Mbps UL to match my access speed.I'll call that a success for now. Now to go and tackle the 100/40 connection down the road.
Do have a followup on this one...
The other question from Bithaulersany tips for LTE connections? Especially ones that go from ~5Mbps to 70Mbps in a few hours?
is also very valid, same problem again on my side. LTE (and soon 5G, even worse) is a medium where the "pipe" itself changes heavily within 24 hours.
In this situation it is really hard to define the pipe size, and doing queueing with fixed values becomes almost impossible.
What can CAKE do in this case?
1) Can someone confirm that 7.2 perhaps has a working ipv6?
The issue reported was not about IPv6 and cake specifically. It was about IPv6 not working when there was a simple queue (of any type) used with an interface as the "target". Cake works fine with IPv6 with queue trees and interface queues even on 7.1.1, but not with simple queues. My understanding is that 7.2rc3 is no different from 7.1.1 in this way, but I haven't tried it myself to confirm. Your post doesn't make clear whether you tried this with a simple queue that used an interface as the target - if you haven't, then you haven't actually verified whether or not this specific problem is resolved.I can confirm that IPv6 and Cake are working on 7.2rc3.
/queue type
add cake-bandwidth=1700.0Mbps kind=cake name=aqm-cake
/queue simple
add name=queue1 queue=aqm-cake/aqm-cake target=vlan3200
It might be fixed then - is that with connection tracking? i.e. do you have an IPv6 allow established,related firewall rule that is working correctly with that queue in place?That was the test I performed. IPv6 + simple queue using the interface as a target works on 7.2rc3 and CCR2116
As an end-user here, I have some cake-related questions for you;
* Cake tries really hard to follow a bunch of mutually conflicting diffserv RFCs, and in an age where videoconferencing is very important, the cake diffserv4 model is closer to how a wifi AP treats it. See https://www.w3.org/TR/webrtc-priority/ for this underused facility in webrtc.
It's not fixed. RB5009 upgraded from 7.1.3 to 7.2rc4 after reading this message.
That was the test I performed. IPv6 + simple queue using the interface as a target works on 7.2rc3 and CCR2116
"Most people can just put 44 and it will work for you regardless of what your underlying technology is. For some people with fiber or cable connections, this may waste up to around 1-2% of your bandwidth, but it will bias you towards having lower bufferbloat rather than higher which is usually a good thing. The primary use case for more precise tuning is when your internet speed is relatively low (less than 5Mbps) and / or you have more than 20% of your internet speed in either direction taken up by small-packet traffic such as VOIP or gaming. If you have more than 5Mbps and/or you are not running a call center, further adjustment is probably not worth the effort and you should spend more time trying to better measure your reliable level of internet speed itself."
Either way setting these values seems valuable for VOIP."I guess what I'm arguing is that exactly right is not needed; within 10% of the right value is probably just fine for all but a VOIP call center running hundreds of calls on a tight 10-15Mbps symmetric line. The reason it's needed is to calculate the true packet size. If the packet payload size is 1500 bytes, then +-45 bytes makes only ~3% error. If the packet size is 150 bytes, like in a VoIP call, then 45 bytes is ~30% error! So having the overhead included is important, but the difference between overhead, say, 44 and overhead 48 is 4 bytes, and 4 bytes on 150 bytes is back to ~3% error. So, do you know the capacity of your line to within 3% error? If not, then even if you're a VoIP call center, the error in your overhead if you just say 44 bytes is offset by the fact you aren't sure whether you should put 10000 kbps or 9700 kbps... If you're not a call center it's even an order of magnitude less important..."
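The arithmetic behind that, compactly:
45 bytes of mis-stated overhead on a 1500-byte packet: 45 / 1500 = 3% error
45 bytes on a 150-byte VoIP packet: 45 / 150 = 30% error
4 bytes (overhead 44 vs 48) on a 150-byte packet: 4 / 150 = ~2.7% error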
Mke I'd love to get the following info off you to compare configs:Like blurrybird I am on Aussie Broadband in OZ
Speedtest by Ookla
Server: Winn Telecom - Mount Pleasant, MI (id = 1062)
ISP: Spectrum
Latency: 26.63 ms (10.78 ms jitter)
Download: 114.17 Mbps (data used: 137.9 MB )
Upload: 10.93 Mbps (data used: 5.5 MB )
Packet Loss: 0.0%
/queue/export
# apr/09/2022 11:31:30 by RouterOS 7.2
# software id = V13P-7JPC
#
# model = RB5009UG+S+
# serial number = EC1A0E402D35
/queue type
add cake-bandwidth=10.0Mbps cake-diffserv=diffserv4 cake-memlimit=32.0MiB \
cake-mpu=64 cake-nat=yes cake-overhead=18 cake-overhead-scheme=docsis kind=\
cake name=cake-up
add cake-bandwidth=105.0Mbps cake-diffserv=diffserv4 cake-memlimit=32.0MiB \
cake-mpu=64 cake-nat=yes cake-overhead=18 cake-overhead-scheme=docsis kind=\
cake name=cake-down
/queue simple
add name=Spectrum queue=cake-up/cake-down target=\
192.168.88.0/24,2600:6c4a:5a00:56a::/64,192.168.5.0/24
Speedtest by Ookla
Server: CMS Internet - Mount Pleasant, MI (id = 735)
ISP: Spectrum
Latency: 27.12 ms (6.73 ms jitter)
Download: 91.94 Mbps (data used: 120.2 MB )
Upload: 9.45 Mbps (data used: 8.2 MB )
Packet Loss: 0.0%
tc qdisc show dev eth0
qdisc mq 0: root
qdisc fq_codel 0: parent :4 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms memory_limit 32Mb ecn
qdisc fq_codel 0: parent :3 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms memory_limit 32Mb ecn
qdisc fq_codel 0: parent :2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms memory_limit 32Mb ecn
qdisc fq_codel 0: parent :1 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms memory_limit 32Mb ecn
flent rrul -p all_scaled -l 300 -H dallas.starlink.taht.net --step-size=.05 -t cake-spectrum-rb5009-rrul-300 -o cake-spectrum-rb5009-rrul-300.png
flent rtt_fair -p all_scaled -l 300 -H dallas.starlink.taht.net -H fremont.starlink.taht.net -H london.starlink.taht.net -H singapore.starlink.taht.net -H sydney.starlink.taht.net --step-size=.05 -t cake-spectrum-rb5009-rttfair-300 -o cake-spectrum-rb5009-rttfair-300.png
flent rtt_fair_var -p all_scaled -l 300 -H dallas.starlink.taht.net -H fremont.starlink.taht.net -H london.starlink.taht.net -H singapore.starlink.taht.net -H sydney.starlink.taht.net --step-size=.05 -t cake-spectrum-rb5009-rttfairvar-300 -o cake-spectrum-rb5009-rttfairvar-300.png
flent tcp_ndown -p ping -l 300 -H dallas.starlink.taht.net --step-size=.05 -t cake-spectrum-rb5009-tcpndown-300 -o cake-spectrum-rb5009-tcpndown-300.png --te=download_streams=4 --te=ping_hosts=8.8.8.8
flent tcp_nup -p ping -l 300 -H dallas.starlink.taht.net --step-size=.05 -t cake-spectrum-rb5009-tcpnup-300 -o cake-spectrum-rb5009-tcpnup-300.png --te=upload_streams=4 --te=ping_hosts=8.8.8.8
You can benefit from the ack-filter on the up to some extent, but I'm pretty sure you would prefer the sqm'd result to the non-sqm'd.I didn't run into anything like that during my testing, although it's possible that I wasn't putting enough load on the router to cause a problem like that to occur. My router is also a different model with a different architecture than the ones mentioned there. I'll chime in on the thread you linked to keep this one a little cleaner, thanks!
/queue type
add cake-diffserv=besteffort cake-nat=yes kind=cake name=cake-default1
add cake-ack-filter=filter cake-bandwidth=950.0Mbps cake-diffserv=besteffort cake-nat=yes kind=cake name=cake-up
add cake-bandwidth=950.0Mbps cake-diffserv=besteffort cake-nat=yes cake-wash=yes kind=cake name=cake-down
/queue simple
add bucket-size=0.001/0.001 name=cake queue=cake-down/cake-up target=ether1 total-queue=cake-default
I experience the same phenomenon on an RB5009 regardless if I use cake or fq_codel. Bandwidth on a Gigabit WAN link is roughly cut in half. I'm guessing it's a CPU constraint ?I'm using the below on a symmetrical 1Gb/1Gb connection but it reduces the over upload and download to 500Mb and I've checked CPU usage which is around 58% on an RB4011 any ideas what I'm doing wrong.
Same experience. Download speed is ok but the upload speed is cut to almost a half.I'm using the below on a symmetrical 1Gb/1Gb connection but it reduces the over upload and download to 500Mb and I've checked CPU usage which is around 58% on an RB4011 any ideas what I'm doing wrong.
/queue type
add cake-diffserv=besteffort cake-nat=yes kind=cake name=cake-default
add cake-ack-filter=filter cake-bandwidth=18.0Mbps cake-diffserv=besteffort cake-nat=yes kind=cake name=cake-up
add cake-bandwidth=47.0Mbps cake-diffserv=besteffort cake-nat=yes cake-wash=yes kind=cake name=cake-down
/queue simple
add bucket-size=0.001/0.001 name=cake queue=cake-down/cake-up target=ether1 total-queue=cake-default
Then maybe you should read the latest patch notes?What are you talking about? It's been working over here just fine?
A developer is me. Things were looking good.
They are talking about Mikrotik's decision to limit cake to interface queues only in the latest 7.3beta40 release. Release notes buried here: viewtopic.php?t=185066#p932950
What are you talking about? It's been working over here just fine?
A developer is me. Things were looking good.
add bucket-size=0.001/0.001 max-limit=72M/18M name="Cake - Smaller Bucket" queue=default-cake/default-cake target=pppoe-out1 total-queue=default-cake
add cake-atm=ptm cake-diffserv=diffserv4 cake-memlimit=32.0MiB cake-nat=yes cake-overhead=30 cake-overhead-scheme=pppoe-ptm kind=cake name=default-cake
Cake is interface queue only.Hi!
Can somebody take a look at my CAKE config and tell me if there's anything I can do to get as close to line-speed as possible. Some background:
Zen VDSL2 connection (80/20) into Vigor 130 modem (VLAN 101); PPPoE connection established via RB4011 on ether1. Jumbo packets enabled (ether1 MTU 1508; PPPoE MTU 1500)
Line speed according to the Vigor130 is 78M/20M, so anything more I can eke out towards this speed would be a bonus; but otherwise my question would be whether the correct overhead or mpu is set (PPPoE connection; VLAN 101, however this is currently set modem-side, and jumbo packets router-end)
Thanks
Do you mean that the interface needs to be set to eth1 (as opposed to the pppoe-interface) or rather the upcoming change in ROS 7.3 that does not allow cake as a simple queue type?Cake is interface queue only.
Mikrotik couldn't fix the bug so they cut the feature.Now cake in 7.3beta40 is useless.Do you mean that the interface needs to be set to eth1 (as opposed to the pppoe-interface) or rather the upcoming change in ROS 7.3 that does not allow cake as a simple queue type?Cake is interface queue only.
they did reach out to me, and toke and I both replied, but they haven't got back to us.Mikrotik couldn't fix the bug so they cut the feature.Now cake in 7.3beta40 is useless.
Do you mean that the interface needs to be set to eth1 (as opposed to the pppoe-interface) or rather the upcoming change in ROS 7.3 that does not allow cake as a simple queue type?
Cake is some sort of voodoo magic! Uploads are faster, downloads are more consistent, latency is lower under load??
With no queueing:
With CAKE:
Check out Flent! If you're running Windows then you'll need to setup a linux machine in order to use it. If it's a newer Windows computer then I'd consider setting up the Windows Subsystem for Linux to get you going quickly.Cake is some sort of voodoo magic! Uploads are faster, downloads are more consistent, latency is lower under load??
With no queueing:
With CAKE:
How can I do that kind of bandwidth/latency test and get those result graphs?
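In case it helps: flent is what produces those graphs. A minimal sketch of getting it going on a Linux box or under WSL (assumptions: Python/pip are present, netperf is installed separately - it is often in a distro's non-free repo or built from source - and the server name is one of the hosts used elsewhere in this thread):
pip install flent
flent rrul -l 60 --step-size=0.05 -H dallas.starlink.taht.net -t first-run -o first-run.png
flent-gui (it needs PyQt installed as well) can then open the saved .flent.gz result files for interactive plots.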
Hi Dave! Any news on this front? I'm so confused by their reasoning.they did reach out to me, and toke and I both replied, but they haven't got back to us.
Mikrotik couldn't fix the bug so they cut the feature.Now cake in 7.3beta40 is useless.
/queue type
add name=cake-WAN-tx kind=cake cake-diffserv=diffserv3 cake-flowmode=dual-srchost cake-nat=yes
add name=cake-WAN-rx kind=cake cake-diffserv=besteffort cake-flowmode=dual-dsthost cake-nat=yes
/queue simple
add max-limit=500M/100M name=queue1 queue=cake-WAN-rx/cake-WAN-tx target=wan-pppoe1
/queue type
add cake-ack-filter=filter cake-diffserv=diffserv4 cake-flowmode=dual-srchost \
cake-memlimit=32.0MiB cake-mpu=84 cake-nat=yes cake-overhead=38 \
cake-overhead-scheme=ethernet cake-rtt-scheme=internet kind=cake name=\
cake_UL
add cake-diffserv=diffserv4 cake-flowmode=dual-dsthost cake-memlimit=32.0MiB \
cake-mpu=84 cake-nat=yes cake-overhead=38 cake-overhead-scheme=ethernet \
cake-rtt-scheme=internet cake-wash=yes kind=cake name=cake_DL
/queue simple
add dst=ether1-Internet name=queue1 queue=cake_UL/cake_DL target=""
Thank you for the explanation... I'm on cable, with a modem I have zero control over... suspect that is the culprit.Powersave is often a problem. A device will go to sleep until there are more packets to transmit. This is a somewhat foolish behavior network-wise, in that - for example - a tcp syn then syn/ack packet outstanding needs all the boost it can get to get more packets in flight once the flow gets going.
One string of cable modems would sleep stupidly this way. Many of our devices will buffer up small numbers of packets over a small interval and only release them after a ms or 4, to save on cpu context switches, also.
In other cases you can get inside the request/grant loop that some gpon and some cable has. The underlying hw makes a request for a slot ahead of time based on an estimate of what it will need in the next cycle from the previous, thus overlapping requests. cable has a 2-6ms request/grant cycle.
Your target has no data, and simple queuing does not take effect.Disclaimer - I'm barely a dabbler when it comes to this stuff and I'm still not comfortable with some of the jargon and abbreviations. But I'm trying.
I have a RB760iGS (hEX S) (256MB RAM) that is my office's gateway/firewall/router. Our internet is provided via a WISP - they're using Ubiquiti equipment. Our connection is 25M/5M - it tests out a little less.
I just upgraded the router to 7.3rc1 and implemented the following (the /queue config quoted above) based on some examples I've seen. The cake-memlimit - I've seen suggestions that 32M is a good number to start with, and this router certainly has plenty available. Question - what's the default value?
The interface queues are all "only-hardware". I also disabled the fasttrack I had in my forward filters. Is this all that's needed for me to get started with this? What information can I provide to assist with validating performance? Running the Waveform bufferbloat test gives me an A+. However - I also get that A+ with the queue disabled.
Am I correct that with the queue enabled cake is supposed to automagically implement qos without my needing to mark packets in mangle?
/queue simple
add limit-at=940M/143M max-limit=950M/146M name=CAKE queue=cake-down/cake-up \
target=pppoe-out1
My "target" should have been set (ether1) - I don't know why it didn't show up in the command line (it was set via the Winbox GUI). I've now set both "dst" and "target" to ether1.Your target has no data, and simple queuing does not take effect.
/queue simple
add limit-at=940M/143M max-limit=950M/146M name=CAKE queue=cake-down/cake-up \
target=pppoe-out1
It doesn't need one. But a simple queue needs a target interface or IP set before you get to CAKE. And if the link you are inserting CAKE (or any queue) on has a fixed speed, then you should set that speed in CAKE. All queues benefit from having a theoretical/ideal max speed as a basis.I'm assuming the limit numbers you're showing are for your own connection - as I stated mine is 25M/5M. I left the limits out as I *thought* cake would auto-configure/adapt without explicit limits set.
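To make that concrete for the 25M/5M WISP link above, a minimal sketch in the same style as the other configs in this thread (the interface name and the roughly 10% headroom below the tested rates are assumptions to adjust):
/queue type
add cake-bandwidth=4.5Mbps cake-diffserv=besteffort cake-nat=yes kind=cake name=cake-wisp-up
add cake-bandwidth=22.0Mbps cake-diffserv=besteffort cake-nat=yes cake-wash=yes kind=cake name=cake-wisp-down
/queue simple
add name=cake-wisp queue=cake-wisp-down/cake-wisp-up target=ether1-Internet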
I really hope they understand this now.To avoid further flooding of the 7.30beta thread with Cake topics, here are some results taken from my home network:
RB5009, ROS 7.2.2, Fiber uplink at SFP1 using PPPoE with NAT capped at nominal 500/100 by the ISP equipment at the other end of the fiber.
The ISP UL shaper does a not so bad job, but the DL shaper is awful as visible in the plot without queue.
ROS simple queue setup targeting the PPPoE uplink interface:
/queue type
add name=cake-WAN-tx kind=cake cake-diffserv=diffserv3 cake-flowmode=dual-srchost cake-nat=yes
add name=cake-WAN-rx kind=cake cake-diffserv=besteffort cake-flowmode=dual-dsthost cake-nat=yes
/queue simple
add max-limit=500M/100M name=queue1 queue=cake-WAN-rx/cake-WAN-tx target=wan-pppoe1
This gives very good results in the flent rrul test. Almost no latency increase and both DL/UL are running at nominal speed, saturated by the 4 parallel connections.
flentres_cake.png
The same test with the simple queue on wan-pppoe1 disabled shows high buffer bloat >100ms. The latency under load increases by a factor of 10.
The total DL is about 1/2 of the line rate, because the 4 parallel connections are fighting each other.
flentres_noqueue.png
Regarding how good it works, it would really be interesting to hear what exact reasons MT has to disallow use cases such as above with the latest 7.3 beta.
Seems to be egress only.I have lost track... can an interface queue also shape inbound?
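For anyone else wondering: an interface queue only acts on traffic leaving that interface, so on its own it is indeed egress-only. The usual workaround is to hang the download-side queue on the LAN/bridge egress, which is what the queue-tree examples below do with parent=bridge1. As an interface-queue sketch (assuming ether1 faces the ISP and bridge1 is the LAN; verify the exact set syntax on your ROS version):
/queue interface
set ether1 queue=cake-up
set bridge1 queue=cake-down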
/queue type
add fq-codel-limit=1000 fq-codel-quantum=300 fq-codel-target=12ms kind=fq-codel name=fq-codel
/queue simple
add max-limit=118M/11M name=fq-codel queue=fq-codel/fq-codel target=ether1
/queue type
add cake-flowmode=dual-srchost cake-nat=yes kind=cake name=cake-up
add cake-flowmode=dual-dsthost cake-nat=yes kind=cake name=cake-down
/queue simple
add max-limit=118M/11M name=cake queue=cake-down/cake-up target=ether1
/queue type
add fq-codel-limit=1000 fq-codel-quantum=300 fq-codel-target=12ms kind=fq-codel name=fq-codel
/queue tree
add bucket-size=0.01 max-limit=118M name=download packet-mark=no-mark parent=bridge1 queue=fq-codel
add bucket-size=0.01 max-limit=11M name=upload packet-mark=no-mark parent=ether1 queue=fq-codel
/queue type
add cake-flowmode=dual-srchost cake-nat=yes kind=cake name=cake-up
add cake-flowmode=dual-dsthost cake-nat=yes kind=cake name=cake-down
/queue tree
add bucket-size=0.01 max-limit=118M name=download packet-mark=no-mark parent=bridge1 queue=cake-down
add bucket-size=0.01 max-limit=11M name=upload packet-mark=no-mark parent=ether1 queue=cake-up
I'm not sure what happened there. Even though I chose a quiet time on the home network, I thought some phone or device started a backup or something. But if you say latency would not have been affected, then I don't know. I wasn't touching anything during the test.0) something weird happened on "Cake, simple queue configuration, fasttrack disabled." - did you reset the qdisc? A typical "hit" from some other flow on the link affects throughput, not latency....
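On RouterOS, one quick way to reset the qdisc state between runs (a sketch, assuming the simple queue is named "cake" as in the earlier exports) is simply to toggle the queue:
/queue simple disable [find name="cake"]
/queue simple enable [find name="cake"]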
Would there be a benefit in putting fq_codel on the physical interface instead of default "only-hardware-queue" and run that along with cake in a simple queue? The configuration certainly allows it.(ideally mikrotik) to obsolete the default on interface pfifo AND sfq in favor of fq_codel
What would be the real-world implications of differentiating between dscp types vs not? If I understand correctly, it would improve certain latency-sensitive traffic even further.If you don't want to differentiate between dscp types, use cake besteffort (which saves on cpu)
/queue simple
add max-limit=20M/5M name=cake-50 queue=cake-ingress/cake-egress target=lte1
/queue type
add cake-diffserv=besteffort cake-flowmode=dual-dsthost cake-nat=yes kind=cake name=cake-ingress
add cake-flowmode=dual-srchost cake-nat=yes kind=cake name=cake-egress
/ip firewall filter
add action=accept chain=forward comment="queue-cake - upload" \
connection-state=established,related out-interface=lte1 \
src-address=192.168.0.0/24
add action=accept chain=forward comment="queue-cake - download" \
connection-state=established,related dst-address=\
192.168.0.0/24 in-interface=lte1
add action=fasttrack-connection chain=forward comment="defconf: fasttrack" \
connection-state=established,related hw-offload=yes
Agree, but as someone who deals with Austrian A1 professionally, I would not recommend waiting for them either. I think that once every other ISP in the world has solved the bufferbloat issue, they might follow... The same goes for almost all other Central European big providers, unfortunately. IPv6 adoption is the other painful topic...So while your, single, coherent complaint might seem like a drop in the bucket, a futile waste of time, I've been at this for 12 years now, and there are now billions and billions of machines that are behaving better for all of us, sticking at it, and sticking it to the man.
"There is no try, only, do" - Yoda.
Hey Hi, started using cake now. It's working quite well in the simple queue on my WAN interface, except when I'm downloading a file over tcp/443, websites slow down (image load times etc.). This doesn't happen when watching video or browsing on other devices, though.
My guess is cake can't separate the http traffic load on the same device.. is there any setting that can fix this issue?
edit: setting my QT's PCQ to both src and dst address did the separation for different http sources on the same device.
For cake qos, I read the manual on the internet but still don't know if I should use wash or the ack filter, and which diffserv to use.. can someone help me with this?
edit2: I have now set the DSCPs in my mangle rules, all of them respectively to their type, and read about diffserv, wash, and the ack filter. It is even better now.
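For reference, DSCP marking in mangle looks roughly like this; the ports and the EF codepoint are only an illustration for a VoIP flow, not a recommendation for any particular traffic mix:
/ip firewall mangle
add action=change-dscp chain=postrouting comment="example: mark SIP/RTP as EF" dst-port=5060,10000-20000 new-dscp=46 passthrough=yes protocol=udp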
besteffort does not attempt any differentiation between diffserv classes. It is equivalent to fq_codel in this mode, except it uses an 8-way set associative method to (nearly) guarantee each flow its own queue, and the default triple-isolate mode also tries to share fairly between hosts.HI,
could anyone please simply explain the difference between besteffort and diffserv4?
I submitted this very request in late September. (SUP-92043) MikroTik responded with ROS v7.7alpha90 for testing. Based on my test involving demotion of MS Delivery Optimization connections to LE, Cake was responding to LE (codepoint 000001) packets correctly. This was before v7.6 went stable, and based on the change notes, I suspect v7.6 also has it right.Hi Dave (dtaht),
> A modern version of cake has support for the new diffserv LE codepoint. I'd dearly like support for that in mikrotik given how problematic CS1 proved to be, and it's a teeny patch
+1! Would be great if you could submit a request at https://help.mikrotik.com so it is formalized.
Thanks,
Dan
Yay! Thx!
/queue type
add cake-diffserv=diffserv4 cake-flowmode=triple-isolate cake-memlimit=32.0MiB cake-rtt=60ms cake-overhead-scheme=ethernet cake-nat=no kind=cake name=cake_rx
add cake-diffserv=diffserv4 cake-flowmode=triple-isolate cake-memlimit=32.0MiB cake-rtt=60ms cake-overhead-scheme=ethernet cake-nat=yes kind=cake cake-ack-filter=filter name=cake_tx
/queue tree
add comment="qosconf: download queue with cake" bucket-size=0.05 max-limit=500M name=cake_download packet-mark=no-mark parent=bridge1 queue=cake_rx
add comment="qosconf: upload queue with cake" bucket-size=0.03 max-limit=50M name=cake_upload packet-mark=no-mark parent=pppoe-out1 queue=cake_tx
DOCSIS/cable: "overhead 18 mpu 88"
Note: The real per-packet/per-slot overhead on a DOCSIS link is considerably higher, but the DOCSIS standard mandates that user access rates are shaped as if they had 18 bytes of per-packet overhead, so for us that is the relevant value.
/queue type
add cake-diffserv=besteffort cake-memlimit=254.0MiB cake-mpu=64 cake-nat=yes cake-overhead=18 cake-overhead-scheme=docsis kind=cake name=cake-default
add cake-diffserv=besteffort cake-memlimit=64.0MiB cake-mpu=64 cake-overhead=18 cake-overhead-scheme=docsis kind=cake name=cake-up
add cake-diffserv=besteffort cake-memlimit=64.0MiB cake-mpu=64 cake-overhead=18 cake-overhead-scheme=docsis kind=cake name=cake-down
/queue simple
add bucket-size=0/0 max-limit=650M/28M name=cake queue=cake-down/cake-up target=ether1-wan total-queue=cake-default
And that's the small price you pay. Overall lower latency will help real applications (streaming, voip, games, even SSL webpages) ... more than any slightly higher test result from speedtest.net etc.The problem is max download speed never exceeds 450Mbps; with cake disabled, it goes up to 600Mbps (but Download Active would easily add 100+ms)
/queue type
add cake-memlimit=64.0MiB cake-rtt=60ms kind=cake name=cake_rx
add cake-memlimit=64.0MiB cake-rtt=60ms kind=cake name=cake_tx
/queue tree
add bucket-size=0 max-limit=1550M name=cake_download packet-mark=no-mark parent=sfp-lan queue=cake_rx
add bucket-size=0 max-limit=1550M name=cake_upload packet-mark=no-mark parent=pppoe-out1 queue=cake_tx
/interface ethernet
set [ find default-name=ether1 ] disabled=yes
set [ find default-name=ether2 ] loop-protect=off mtu=1508 name=ether2-wan
set [ find default-name=ether3 ] disabled=yes loop-protect=off
set [ find default-name=ether4 ] disabled=yes
set [ find default-name=ether5 ] disabled=yes
set [ find default-name=ether6 ] disabled=yes
set [ find default-name=ether7 ] disabled=yes
set [ find default-name=ether8 ] disabled=yes
set [ find default-name=sfp-sfpplus1 ] loop-protect=off name=sfp-lan
/interface vlan
add interface=ether2-wan loop-protect=off mtu=1508 name=vlan1 vlan-id=35
/interface pppoe-client
add add-default-route=yes disabled=no interface=vlan1 keepalive-timeout=disabled name=pppoe-out1 user=xxxxxxxx@bellnet.ca
/queue type
add cake-memlimit=64.0MiB cake-rtt=60ms kind=cake name=cake_rx
add cake-memlimit=64.0MiB cake-rtt=60ms kind=cake name=cake_tx
/queue tree
add bucket-size=0 max-limit=1550M name=cake_download packet-mark=no-mark parent=sfp-lan queue=cake_rx
add bucket-size=0 max-limit=1550M name=cake_upload packet-mark=no-mark parent=pppoe-out1 queue=cake_tx
Plan from Bell is 1Gbps, physical cable connection is 1Gbps to one nokia fiber box.Just curious, @kenyloveg, but do you have 1gb up and down, or do you have 1.5gb up and down?
It looks like the max-limit is set to 1.5gb, so wouldn't this prevent the queue from ever being filled if you have 1gb up and down? Or is Bell 1Gbps really 1.5Gbps? (config quoted above)