drevil23
just joined
Topic Author
Posts: 16
Joined: Thu Jun 08, 2017 10:01 pm

PCC Load sharing troubleshooting

Thu Jun 08, 2017 11:09 pm

Hi everyone,

I have a MikroTik RouterBOARD RB3011UiAS-RM running v6.39.2, and I have tried to set up PCC load balancing between 2 WAN links (same technology, same provider). It almost works as it should; the keyword here is "almost". Both WAN connections are fiber-optic, with 50 Mbps download and 25 Mbps upload.

I have tried three different setups I found on the web, and all of them gave the same result: a very long delay whenever I open a new connection, and what appear to be several dropped packets.

https://youtu.be/AZePBBbp_5w
https://blog.linitx.com/load-balancing- ... nnections/
http://mum.mikrotik.com/presentations/US12/steve.pdf
https://youtu.be/B58tCVwWDzE
http://tiktube.com/video/GEfq3hCljLoKpm ... uIlGopKGp=


Example 1: ping google.com. The ping fails for 10-30 seconds, but once it gets going I see 30-40 ms and no further packet loss for as long as I keep it running. (I'm on Linux, so ping runs continuously without needing the "-t" parameter.)

Example 2: In a browser I can't open web pages and get a "connection failed" message, but if I keep hitting refresh for 10-30 seconds the page loads, and it keeps working properly for as long as I don't let it time out. HTTP and HTTPS give the same result.

I restarted from scratch several times, resetting the RouterBOARD between each try. I have tried using a bridge for all my LAN ports, and also a single Ethernet interface plugged into a switch with the defconf bridge deleted, just in case the bridge was the culprit. Same result.

I'm getting these results from all my computers, tablets, Wi-Fi-connected phones, etc.

I've also noticed that when I reboot the router, PCC seems to work perfectly for a few minutes. Graphing CPU, memory, and disk shows less than 1-2% CPU usage and plenty of memory and disk space available.

Disabling the PCC mangle rules, and the routes and NAT rules associated with them, makes everything work flawlessly on either WAN modem and on both physical ports of the router, but without load sharing.

I have used the "no-mark" rules to prevent re-marking packets, and I have logged all my connections to make sure no packet was left unassigned. Local connections are also marked INTERNAL so that no connection goes unmarked, and I made sure that the WAN rules were not "seeing" local-to-local connections.
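
For reference, here is a minimal sketch of what "no-mark" PCC rules usually look like in RouterOS v6. The interface and mark names (ether3-Master, WAN1_CONN, WAN2_CONN) are taken from the configuration posted later in this thread; the only addition is the connection-mark=no-mark matcher, which limits classification to connections that do not carry a mark yet. Treat it as an illustration under those assumptions, not as a confirmed fix.

```routeros
/ip firewall mangle
# Classify only connections that have no mark yet, so an existing mark is never overwritten.
add chain=prerouting in-interface=ether3-Master connection-mark=no-mark \
    dst-address-type=!local per-connection-classifier=both-addresses:2/0 \
    action=mark-connection new-connection-mark=WAN1_CONN passthrough=yes
add chain=prerouting in-interface=ether3-Master connection-mark=no-mark \
    dst-address-type=!local per-connection-classifier=both-addresses:2/1 \
    action=mark-connection new-connection-mark=WAN2_CONN passthrough=yes
```

With both-addresses:2/0 and 2/1, each source/destination address pair always lands in the same bucket, so every connection between the same two hosts keeps using the same WAN.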

To diagnose further, I also tried routing all traffic marked WAN1 and WAN2 out through WAN1, just in case. I've tried Google DNS, my provider's DNS, and some older DNS servers I remembered off the top of my head (not all at the same time, of course).
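
A rough sketch of that diagnostic step, assuming the routing-mark names WAN1_OUT and WAN2_OUT and the interface-style gateways used in the configuration posted later in this thread. It points the WAN2 routing table at WAN1 for the duration of the test and then restores it.

```routeros
/ip route
# Temporarily send traffic carrying the WAN2_OUT routing mark out through WAN1 as well.
set [find routing-mark=WAN2_OUT] gateway=WAN1
# Revert once the test is done.
set [find routing-mark=WAN2_OUT] gateway=WAN2
```

The mangle rules stay untouched, so connections are still classified into two groups; only the exit interface changes.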

Everything I have tried leads to the same result, so I'm completely lost here. Does anyone have any idea?
 
drevil23

Re: PCC Load sharing troubleshooting

Sat Jun 10, 2017 4:00 am

Good evening,

Since then I've tried the release candidate version, with no change.

I've also tried this method: http://doc.synchroweb.com/mikrotik-load ... cc-method/ The behavior is unchanged, but performance is slightly better.

I've tried adding some queues, both per port and general, to add some buffer space. There is less packet loss, but it is still not perfect. With a very large queue, packet loss stops almost completely but ping rises to 500 ms; with a small queue, I get more packet loss but an 80 ms ping; and with no queue, ping is 30-40 ms but with the same result as before.
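
To illustrate the trade-off between buffer size and latency, here is a hedged sketch of one way such a queue could be defined. The queue type name big-fifo, the queue name lan-limit, the 500-packet limit, and the 25M/50M limits are illustrative assumptions, not the values actually used in these tests.

```routeros
# Hypothetical FIFO queue type holding up to 500 packets (name and size are illustrative).
/queue type
add name=big-fifo kind=pfifo pfifo-limit=500
# Rate-limit the LAN subnet (upload/download) and buffer the excess in that FIFO.
# If the upload direction drains at 25 Mbps and packets are 1500 bytes, a full queue adds
# 500 * 1500 * 8 bits / 25,000,000 bit/s = 0.24 s of delay.
/queue simple
add name=lan-limit target=192.168.100.0/24 max-limit=25M/50M queue=big-fifo/big-fifo
```

A smaller pfifo-limit shortens the worst-case delay but drops packets sooner once the queue is full, which is consistent with the pattern described above.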

Does anyone have any idea what could cause this? A buffering problem?
 
drevil23

Re: PCC Load sharing troubleshooting

Sat Jun 10, 2017 4:30 am

Hi again,

Here is my full config.

[admin@user] > /export
# jun/09/2017 21:12:17 by RouterOS 6.40rc19
# software id = RJ12-WFUL
#
/interface ethernet
set [ find default-name=ether1 ] name=WAN1
set [ find default-name=ether2 ] name=WAN2
set [ find default-name=ether3 ] name=ether3-Master
set [ find default-name=ether4 ] master-port=ether3-Master
set [ find default-name=ether5 ] master-port=ether3-Master
set [ find default-name=ether6 ] name=ether6-Master
set [ find default-name=ether7 ] master-port=ether6-Master
set [ find default-name=ether8 ] master-port=ether6-Master
set [ find default-name=ether9 ] master-port=ether6-Master
set [ find default-name=ether10 ] master-port=ether6-Master
/ip neighbor discovery
set WAN1 discover=no
set WAN2 discover=no
/interface wireless security-profiles
set [ find default=yes ] supplicant-identity=MikroTik
/ip pool
add name=default-dhcp ranges=192.168.88.10-192.168.88.254
add name=home ranges=192.168.100.100-192.168.100.250
add name=dhcp_pool2 ranges=192.168.100.100-192.168.100.250
/ip dhcp-server
add address-pool=default-dhcp disabled=no interface=ether6-Master name=defconf
add address-pool=dhcp_pool2 disabled=no interface=ether3-Master lease-time=52w1d name=dhcp1
/system logging action
set 3 remote=192.168.100.199
/ip address
add address=192.168.100.1/24 interface=ether3-Master network=192.168.100.0
add address=192.168.88.1/24 interface=ether6-Master network=192.168.88.0
/ip cloud
set ddns-enabled=yes
/ip dhcp-client
add add-default-route=no comment=defconf dhcp-options=hostname,clientid disabled=no interface=WAN1 use-peer-dns=no use-peer-ntp=no
add add-default-route=no dhcp-options=hostname,clientid disabled=no interface=WAN2 use-peer-dns=no use-peer-ntp=no
/ip dhcp-server lease
add address=192.168.100.199 mac-address=60:A4:4C:36:FF:3D
add address=192.168.100.187 client-id="Pioneer VSX-830" mac-address=74:5E:1C:AF:93:25
add address=192.168.100.190 client-id=PO2117 mac-address=B4:B6:76:1E:19:23
add address=192.168.100.197 client-id=test-virtual-machine mac-address=00:0C:29:71:AD:D7
/ip dhcp-server network
add address=192.168.88.0/24 comment=defconf gateway=192.168.88.1
add address=192.168.100.0/24 gateway=192.168.100.1 netmask=24
/ip dns
set allow-remote-requests=yes cache-size=4096KiB max-concurrent-queries=100000 max-concurrent-tcp-sessions=20000 query-server-timeout=100ms query-total-timeout=500ms servers=\
8.8.8.8,8.8.4.4
/ip dns static
add address=192.168.88.1 name=router
/ip firewall filter
add action=fasttrack-connection chain=forward comment="defconf: fasttrack" connection-state=established,related
add action=accept chain=forward comment="defconf: accept established,related" connection-state=established,related
add action=accept chain=input comment="defconf: accept ICMP" protocol=icmp
add action=accept chain=input comment="defconf: accept established,related" connection-state=established,related
add action=accept chain=input dst-port=443 in-interface=WAN1 protocol=tcp
add action=drop chain=forward comment="defconf: drop all from WAN not DSTNATed" connection-nat-state=!dstnat connection-state=new in-interface=WAN1
add action=drop chain=input comment="defconf: drop all from WAN" in-interface=WAN1
add action=drop chain=forward comment="defconf: drop invalid" connection-state=invalid
/ip firewall mangle
add action=mark-connection chain=prerouting connection-mark=no-mark dst-address=192.168.100.0/24 new-connection-mark=INTERNAL passthrough=yes src-address=192.168.100.0/24
add action=accept chain=output connection-mark=INTERNAL
add action=mark-connection chain=input connection-mark=no-mark in-interface=WAN1 new-connection-mark=WAN1_CONN passthrough=yes
add action=mark-connection chain=input connection-mark=no-mark in-interface=WAN2 new-connection-mark=WAN2_CONN passthrough=yes
add action=mark-routing chain=output connection-mark=WAN1_CONN new-routing-mark=WAN1_OUT passthrough=no
add action=mark-routing chain=output connection-mark=WAN2_CONN new-routing-mark=WAN2_OUT passthrough=no
add action=mark-connection chain=prerouting dst-address-type=!local in-interface=ether3-Master new-connection-mark=WAN1_CONN passthrough=yes per-connection-classifier=\
both-addresses:2/0
add action=mark-connection chain=prerouting dst-address-type=!local in-interface=ether3-Master new-connection-mark=WAN2_CONN passthrough=yes per-connection-classifier=\
both-addresses:2/1
add action=mark-routing chain=prerouting connection-mark=WAN1_CONN in-interface=ether3-Master new-routing-mark=WAN1_OUT passthrough=yes
add action=mark-routing chain=prerouting connection-mark=WAN2_CONN in-interface=ether3-Master new-routing-mark=WAN2_OUT passthrough=yes
/ip firewall nat
add action=masquerade chain=srcnat comment="defconf: masquerade" out-interface=WAN1
add action=masquerade chain=srcnat out-interface=WAN1
add action=masquerade chain=srcnat out-interface=WAN2
add action=dst-nat chain=dstnat dst-port=21 protocol=tcp to-addresses=192.168.100.199
add action=dst-nat chain=dstnat dst-port=21 protocol=udp to-addresses=192.168.100.199
add action=dst-nat chain=dstnat dst-port=990 protocol=tcp to-addresses=192.168.100.199
add action=masquerade chain=srcnat dst-port=21 protocol=tcp src-address=192.168.100.0/24
add action=dst-nat chain=dstnat dst-port=50502-50550 protocol=tcp to-addresses=192.168.100.199
add action=dst-nat chain=dstnat dst-port=23 protocol=tcp to-addresses=192.168.100.199
add action=dst-nat chain=dstnat dst-port=23 protocol=udp to-addresses=192.168.100.199
/ip route
add check-gateway=ping distance=1 gateway=WAN1 routing-mark=WAN1_OUT
add distance=1 gateway=WAN2 routing-mark=WAN2_OUT
add check-gateway=ping distance=1 gateway=WAN1
add check-gateway=ping distance=2 gateway=WAN2
/system clock
set time-zone-name=America/Toronto
/system identity
set name=user
/system logging
set 0 action=remote
set 1 action=remote
set 2 action=remote
set 3 action=remote
/system ntp client
set enabled=yes server-dns-names=ntp1.dlink.com
/system package update
set channel=release-candidate
/system scheduler
add interval=5m name="no-ip schedule" on-event=no-ip policy=read,write,test start-time=startup
/system script
add name=no-ip owner=admin policy=read,write source="# No-IP, DNSdynamic update\
\n\
\n#--------------- Change Values in this section to match your setup ------------------\
\n\
\n#User account info\
\n:local noipuser \"myusername\"\ (I edited these info before posting this)
\n:local noippass \"mypassword\"\ (I edited these info before posting this)
\n:local noiphost \"mywebsite.com\"\ (I edited these info before posting this)
\n\
\n# Change to the name of interface that gets the dynamic IP address\
\n:local inetinterface \"WAN1\"\
\n\
\n#-------------------No more changes needed--------------------------------\
\n:local previousIP [:resolve \$noiphost];\
\n:log info \"Current IP address on no-ip is : \$previousIP\"\
\n\
\n:if ([/interface get \$inetinterface value-name=running]) do={\
\n# Get the current IP on the interface\
\n :local currentIP [/ip address get [find interface=\"\$inetinterface\" disabled=no] address]\
\n\
\n# Strip the net mask off the IP address\
\n :for i from=( [:len \$currentIP] - 1) to=0 do={\
\n :if ( [:pick \$currentIP \$i] = \"/\") do={ \
\n :set currentIP [:pick \$currentIP 0 \$i]\
\n } \
\n }\
\n\
\n:log info \"Local WAN IP address is : \$currentIP\"\
\n\
\n :if (\$currentIP != \$previousIP) do={\
\n :log info \"IP: Current IP \$currentIP is not equal to previous IP \$previousIP, update needed\"\
\n\
\n# The update No-Ip URL. Note the \"\\3F\" is hex for question mark (\?). Required since \? is a special character in commands.\
\n :local url \"http://dynupdate.no-ip.com/nic/update\\ ... $currentIP\"\
\n :local noiphostarray\
\n :set noiphostarray [:toarray \$noiphost]\
\n :foreach host in=\$noiphostarray do={\
\n :log info \"No-IP: Sending update for \$host\"\
\n /tool fetch url=(\$url . \"&hostname=\$host\") user=\$noipuser password=\$noippass mode=http dst-path=(\"no-ip_ddns_update-\" . \$host . \".txt\")\
\n :log info \"No-IP: Host \$host updated on No-IP with IP \$currentIP\"\
\n }\
\n\
\n# The update dnsdynamic URL. Note the \"\\3F\" is hex for question mark (\?). Required since \? is a special character in commands.\
\n :local url \"https://www.dnsdynamic.org/nic/nic/upda ... $currentIP\"\
\n :local dnsdynamichostarray\
\n :set dnsdynamichostarray [:toarray \$dnsdynamichost]\
\n :foreach host in=\$dnsdynamichostarray do={\
\n :log info \"dnsdynamic: Sending update for \$host\"\
\n /tool fetch url=(\$url . \"&hostname=\$host\") user=\$dnsdynamicuser password=\$dnsdynamicpass mode=http dst-path=(\"dnsdynamic_ddns_update-\" . \$host . \".t\
xt\")\
\n :log info \"dnsdynamic: Host \$host updated on dnsdynamic with IP \$currentIP\"\
\n }\
\n } else={\
\n :log info \"IP: Previous IP \$previousIP is equal to current IP \$previousIP, no update needed\"\
\n }\
\n} else={\
\n :log info \"IP: \$inetinterface is not currently running, so therefore will not update.\"\
\n}"
/tool e-mail
set address=smtp.gmail.com from=myemail.com password=mypassword port=587 start-tls=tls-only user=myemail.com
/tool graphing interface
add interface=WAN1
add interface=WAN2
add interface=ether3-Master
add interface=ether4
/tool graphing queue
add
/tool graphing resource
add
/tool mac-server
set [ find default=yes ] disabled=yes
add
/tool mac-server mac-winbox
set [ find default=yes ] disabled=yes
add
[admin@user] >

I hope this helps in figuring out my problem.

Thank you.
 
drevil23

Re: PCC Load sharing troubleshooting

Sat Jun 10, 2017 7:02 am

After watching this video https://youtu.be/3LmQYIQ5RoA?t=9m10s I disabled fasttrack, but it had no effect.

Then, after reading another post on this forum about a problem with an EoIP configuration, I started Tools/Packet Sniffer, and things got better, a whole lot better. But there is still some delay loading pages, and packet loss at the beginning of a ping.
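
For anyone repeating this step, a minimal sketch of disabling the fasttrack rule from the terminal, assuming the default defconf rule shown in the export above is the only fasttrack rule present:

```routeros
/ip firewall filter
# Disable the fasttrack rule rather than removing it, so it can be re-enabled later.
disable [find action=fasttrack-connection]
# Verify: the rule should now be listed with the X (disabled) flag.
print where action=fasttrack-connection
```

The accept rule for established,related traffic that follows it in the export keeps forwarding those packets through the normal path.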
 
drevil23

Re: PCC Load sharing troubleshooting

Sat Jun 10, 2017 5:33 pm

Here is an example of a ping failing at first.

PING yahoo.com (206.190.36.45) 56(84) bytes of data.
64 bytes from ir1.fp.vip.gq1.yahoo.com (206.190.36.45): icmp_seq=7 ttl=50 time=2095 ms
64 bytes from ir1.fp.vip.gq1.yahoo.com (206.190.36.45): icmp_seq=8 ttl=50 time=1088 ms
64 bytes from ir1.fp.vip.gq1.yahoo.com (206.190.36.45): icmp_seq=9 ttl=50 time=26.7 ms
64 bytes from ir1.fp.vip.gq1.yahoo.com (206.190.36.45): icmp_seq=10 ttl=50 time=26.7 ms
64 bytes from ir1.fp.vip.gq1.yahoo.com (206.190.36.45): icmp_seq=11 ttl=50 time=26.8 ms
64 bytes from ir1.fp.vip.gq1.yahoo.com (206.190.36.45): icmp_seq=12 ttl=50 time=35 ms
64 bytes from ir1.fp.vip.gq1.yahoo.com (206.190.36.45): icmp_seq=13 ttl=50 time=31.7 ms
^C
--- yahoo.com ping statistics ---
13 packets transmitted, 7 received, 46% packet loss, time 12037ms
rtt min/avg/max/mdev = 26.716/522.963/2095.383/727.642 ms, pipe 3

It took about 8 seconds before the first response, then came 2 slow replies, then somewhat better results.

But if I then start another ping shortly after stopping the first one, even to a different website, I get:
PING google.com (172.217.2.174) 56(84) bytes of data.
64 bytes from yyz10s06-in-f14.1e100.net (172.217.2.174): icmp_seq=1 ttl=55 time=2025 ms
64 bytes from yyz10s06-in-f14.1e100.net (172.217.2.174): icmp_seq=2 ttl=55 time=1017 ms
64 bytes from yyz10s06-in-f14.1e100.net (172.217.2.174): icmp_seq=3 ttl=55 time=25.2 ms
64 bytes from yyz10s06-in-f14.1e100.net (172.217.2.174): icmp_seq=4 ttl=55 time=23.9 ms
64 bytes from yyz10s06-in-f14.1e100.net (172.217.2.174): icmp_seq=5 ttl=55 time=24.1 ms
64 bytes from yyz10s06-in-f14.1e100.net (172.217.2.174): icmp_seq=6 ttl=55 time=24.6 ms
64 bytes from yyz10s06-in-f14.1e100.net (172.217.2.174): icmp_seq=7 ttl=55 time=25.0 ms
^C
--- google.com ping statistics ---
7 packets transmitted, 7 received, 0% packet loss, time 6022ms
rtt min/avg/max/mdev = 23.910/452.268/2025.798/727.960 ms, pipe 3

and

PING gmail-smtp-msa.l.google.com (74.125.126.108) 56(84) bytes of data.
64 bytes from ik-in-f108.1e100.net (74.125.126.108): icmp_seq=2 ttl=43 time=1036 ms
64 bytes from ik-in-f108.1e100.net (74.125.126.108): icmp_seq=1 ttl=43 time=2043 ms
64 bytes from ik-in-f108.1e100.net (74.125.126.108): icmp_seq=3 ttl=43 time=38.1 ms
64 bytes from ik-in-f108.1e100.net (74.125.126.108): icmp_seq=4 ttl=43 time=41.0 ms
64 bytes from ik-in-f108.1e100.net (74.125.126.108): icmp_seq=5 ttl=43 time=40.4 ms
64 bytes from ik-in-f108.1e100.net (74.125.126.108): icmp_seq=6 ttl=43 time=38.7 ms
64 bytes from ik-in-f108.1e100.net (74.125.126.108): icmp_seq=7 ttl=43 time=38.8 ms
^C
--- gmail-smtp-msa.l.google.com ping statistics ---
7 packets transmitted, 7 received, 0% packet loss, time 6020ms
rtt min/avg/max/mdev = 38.199/468.120/2043.205/729.252 ms, pipe 3

Still a long delay for the first ones, but no more packet loss.

Anyone have a clue?
 
drevil23

Re: PCC Load sharing troubleshooting

Sat Jun 17, 2017 11:28 pm

I finally got rid of the dual-WAN configuration and got a better provider instead.

I never got it working properly in the end, but I found out that the problem occurred only when I had more than 200-250 concurrent connections. With fewer, the problem seemed to vanish.
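
For anyone who wants to check the same threshold, a small sketch of how the number of tracked connections can be read from the terminal. The mark names WAN1_CONN and WAN2_CONN are the ones from the configuration posted earlier in this thread.

```routeros
# Total number of entries in the connection tracking table.
/ip firewall connection print count-only
# Entries per PCC connection mark, to see how the load is split between the two WANs.
/ip firewall connection print count-only where connection-mark=WAN1_CONN
/ip firewall connection print count-only where connection-mark=WAN2_CONN
```

Running these while reproducing the delay shows whether the symptoms line up with the connection count.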