RB1000 closing tens of pppoe connections at once
Posted: Wed Dec 02, 2009 12:25 am
by jkohan
I have a problem with an RB1000 managing some 300+ pppoe connections.
At times it suddenly closes tens of connections. In the logs (and the FreeRADIUS logs) the disconnection cause is logged as "User request" (as opposed to the usual "Peer is not responding" when a modem or line is faulty). The connections come from several DSLAMs on different VLANs and physical interfaces, and some from a wireless distribution built with another MikroTik.
Each VLAN has its own PPPoE server, and we observed that when the drops occur, they all occur on the same PPPoE server, although every server closes sessions in this manner from time to time.
We are not completely sure, but this seems to happen when there is a burst of failed authentications (maybe that is a coincidence, as we couldn't reproduce the problem by forcing a client to fail authentication).
We tried different versions of ROS (3.22, 3.30 and 4.3) and 2 different RB1000s, and the problem is consistent across all of them.
Has anybody seen a problem like this? Any suggestions?
Thanks
Javier
Re: RB1000 closing tens of pppoe connections at once
Posted: Tue Dec 08, 2009 3:50 pm
by sergejs
Javier, it is very weird, that only 10 connections are closed.
Perhaps you can enable pppoe,debug logs and get the reason, why all the 10 clients were disconnected simultaneously.
Otherwise it is very hard to guess, what could be the reason.
Just curious, do you have RADIUS server for these 300 clients?
Re: RB1000 closing tens of pppoe connections at once
Posted: Tue Dec 08, 2009 6:17 pm
by jkohan
sergejs wrote:
"Javier, it is very weird, that only 10 connections are closed.
Perhaps you can enable pppoe,debug logs and get the reason, why all the 10 clients were disconnected simultaneously.
Otherwise it is very hard to guess, what could be the reason.
Just curious, do you have RADIUS server for these 300 clients?"
Well, maybe because English is not my native language, I was not clear. When I say "tens", I mean 23 one time, 45 the next, 30 the next and so on. (In Spanish the word is "decenas", and I always thought the English word "tens" meant the same.)
With some hundreds of users, and given that these disconnections happen sparsely and randomly, it is somewhat difficult to enable full pppoe logs without affecting performance, and the log file on my server (I log to a server with syslog) gets huge. Anyway, in both the RB's logs and the server logs, the disconnection cause is "User Request". The same goes for my RADIUS logs.
And yes, I use 3 RADIUS servers that get user data from replicated LDAP bases.
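A lighter alternative to full pppoe debug logging is to post-process the disconnect records that already reach the syslog or RADIUS accounting server. The Python sketch below is only an illustration: it assumes the records have already been parsed into (epoch seconds, PPPoE server, cause) tuples, and `find_bursts`, the 10-second window and the threshold of 10 are hypothetical choices, not something RouterOS or FreeRADIUS provides.

```python
from collections import Counter


def find_bursts(events, window_seconds=10, threshold=10):
    """Return ((window_start, server, cause), count) for busy windows.

    events: iterable of (epoch_seconds, server_name, cause) tuples.
    A window is reported when at least `threshold` disconnects with the
    same cause hit the same PPPoE server inside it.
    """
    counts = Counter()
    for epoch, server, cause in events:
        # Align every event to the start of its fixed-size time window.
        window_start = int(epoch) // window_seconds * window_seconds
        counts[(window_start, server, cause)] += 1
    return sorted((key, n) for key, n in counts.items() if n >= threshold)


# Made-up sample data: 23 "User request" drops within 5 seconds on one
# server, plus one unrelated drop on another server.
sample = [(1000 + i * 0.2, "PPPoEVLAN20", "User request") for i in range(23)]
sample.append((1003, "VLAN10", "Peer is not responding"))

for (start, server, cause), n in find_bursts(sample):
    print(start, server, cause, n)
```

Grouping by server as well as by cause matches the observation that each mass drop stays on a single PPPoE server; isolated "Peer is not responding" drops fall below the threshold and are ignored.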
Some data that I think could be relevant to this issue.
1) Sample Access-Accept AV pairs.
Tue Dec 8 00:10:31 2009
Packet-Type = Access-Accept
Mikrotik-Rate-Limit = "128k/256k 130k/258k 129k/257k 3/3 8 32k/32k"
Framed-Routing = None
Framed-Protocol = PPP
Service-Type = Framed-User
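For readers unfamiliar with the attribute, the Mikrotik-Rate-Limit string above is positional. The following Python sketch splits it into named fields, assuming the field order given in the MikroTik documentation (rate, burst rate, burst threshold, burst time, priority, minimum rate, with each pair written rx/tx from the router's point of view). The function name and dictionary keys are made up for illustration.

```python
# Assumed field order of Mikrotik-Rate-Limit (per the MikroTik manual):
# rx-rate[/tx-rate] [burst-rate [burst-threshold [burst-time
#   [priority [rate-min]]]]]
FIELDS = ["rate", "burst_rate", "burst_threshold",
          "burst_time", "priority", "rate_min"]


def parse_rate_limit(value):
    """Split a Mikrotik-Rate-Limit string into a dict of named fields."""
    parsed = {}
    for name, token in zip(FIELDS, value.split()):
        if name == "priority":
            parsed[name] = int(token)
        else:
            rx, _, tx = token.partition("/")
            # When only one value is given, it applies to both directions.
            parsed[name] = {"rx": rx, "tx": tx or rx}
    return parsed


limits = parse_rate_limit("128k/256k 130k/258k 129k/257k 3/3 8 32k/32k")
for field, val in limits.items():
    print(field, val)
```

Read this way, the sample reply asks for 128k up / 256k down with a burst only 2k above that, which is what the router turns into a dynamic simple queue per session.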
2) PPPOE Server Configuration
/interface pppoe-server server
add authentication=pap default-profile=PPPoE disabled=no interface=\
    PPPoEVLAN20 keepalive-timeout=10 max-mru=1480 max-mtu=1480 max-sessions=0 \
    mrru=disabled one-session-per-host=yes service-name=XXXX
add authentication=pap default-profile=PPPoE disabled=no interface=VLAN10 \
    keepalive-timeout=10 max-mru=1480 max-mtu=1480 max-sessions=0 mrru=\
    disabled one-session-per-host=yes service-name=XXXX
/ppp profile
set default change-tcp-mss=yes comment="" name=default only-one=default use-compression=default \
    use-encryption=default use-vj-compression=default
add change-tcp-mss=yes comment="" dns-server=10.120.0.2,10.120.0.3,10.120.128.2 local-address=\
    200.xxx.xxx.xxx name=PPPoE only-one=default remote-address=POOL-PPPoE use-compression=default \
    use-encryption=default use-vj-compression=default wins-server=127.0.0.1
If anyone would like to see more information, please ask.
Thanks
Javier
Re: RB1000 closing tens of pppoe connections at once
Posted: Tue Dec 29, 2009 6:34 pm
by pbel88
Intermittent disconnects logged as "Peer is not responding", hitting tens of PPPoE sessions and sometimes all of them, are still an issue for me. I saw it live once: about 100 users were connected and suddenly everyone got disconnected, with "Terminating... disconnected" in the log. The PPPoE service went crazy and collapsed, but the RB was still alive. Then it took about 30 seconds before everyone logged back in. I can have this issue 3 times a day or once a week; it depends on something I don't know, and it's very frustrating. Clients complain; I have about 250 users on an RB1000. I first thought it was a bridge problem, so I moved the PPPoE service onto an ether interface, but the issue is the same. I have looked around the forum but found nothing. Has anyone found a fix, or do I have to switch to a solution other than MikroTik?
Regards
Re: RB1000 closing tens of pppoe connections at once
Posted: Tue Dec 29, 2009 7:00 pm
by jkohan
So, I'm not the only one experiencing this problem, and it is a real problem when one has 300+ angry users.
Can someone at MikroTik look into this issue?
Re: RB1000 closing tens of pppoe connections at once
Posted: Wed Dec 30, 2009 2:36 pm
by sergejs
The best way to get it resolved is to contact support (support@mikrotik.com) with a detailed problem description.
Re: RB1000 closing tens of pppoe connections at once
Posted: Mon Jan 11, 2010 1:31 pm
by promind
Same problem here... I have a dozen RB1000s working as PPPoE servers... CPU goes to 100% for about 20 seconds and all users get "terminated".
Please find a solution, because otherwise I'll have to stop working with you MikroTik guys...
Re: RB1000 closing tens of pppoe connections at once
Posted: Fri Jan 15, 2010 7:06 pm
by pbel88
Found this (http://forum.mikrotik.com/viewtopic.php ... =pppoe+bug). I've removed everything related to PCQ in my queues and all seems to have been OK for the last 24 hours. Hope this will continue. But PCQ + PPPoE service mixed together will need to be reviewed by MK.
Re: RB1000 closing tens of pppoe connections at once
Posted: Tue Jan 19, 2010 3:39 pm
by pbel88
It still crashes, but less often.
Re: RB1000 closing tens of pppoe connections at once
Posted: Mon Apr 26, 2010 8:23 pm
by rborz
The same problem occurs here, too. With about 100 users on RB1000 PPPoE server. All sessions disconnect simultaneously.
The RB1000 has two uplinks to two different providers using PPPoE for their connection, too. Also the uplink PPPoE sessions disconnect from time to time.
For two weeks there was no problem, no disconnects, and today all PPPoE sessions (incoming & outgoing) disappeared within 3 hours. This is horrible...
The 100 users are on one physical interface and shared across two VLANs. Each of the two uplink PPPoE sessions goes over a separate physical interface.
We tried every firmware from 3.22 up to 4.6. The same problem... we also completely exchanged the RB1000 without any success.
Re: RB1000 closing tens of pppoe connections at once
Posted: Tue May 11, 2010 1:37 pm
by sergejs
rborz, it would be great if you could contact MikroTik support (support@mikrotik.com) with a detailed problem description and a support output file generated while the problem with PPPoE users is present.
Re: RB1000 closing tens of pppoe connections at once
Posted: Tue May 11, 2010 2:29 pm
by jkohan
As sergejs advised, I opened a ticket with MikroTik. After seeing my supout file, their suggestions were:
1) replace all your dynamic change-mss rules with one global change-mss rule.
2) check that you use latest winbox loader (clear cache after upgrading)
3) Think about switching from dynamic simple queues to dynamic address-list and queue tree with PCQ
#1 was relatively easy to do.
#2 I did as asked.
Up to this point the problem persists. I even tried splitting users between 2 RBs: some 100+ on a 600 and some 180+ stayed on the RB1000. Both still had massive disconnections, though less frequently.
#3 I don't understand what I'm being asked to do. I have hundreds of users and rely on RADIUS to pass the customers' bandwidth parameters to the PPPoE concentrators. Can I pass an address list instead of "Mikrotik-Rate-Limit" attributes? How?
There is a "Mikrotik-Mark-Id". Is it meant for that? If so, how do I use it?
Thanks.
Javier
Re: RB1000 closing tens of pppoe connections at once
Posted: Sat May 15, 2010 10:39 pm
by chaym
I am not using an RB1000, but x86-based RouterOS on a PowerRouter 2242. Same issue here. We terminate 600+ PPPoE customers, and the queues will fail, disconnecting all users until the unit is rebooted. We have tried several versions of RouterOS including 4.9 and the problem persists.
The previous suggestions from Mikrotik staff either do not work, or are not usable in our environment. (We need simple queues to assign specific bandwidth profiles a customer is paying for which is passed from our RADIUS server)
This can happen once a week or a few times a day, it is very random. It does seem to happen more often if we have multi cpu support enabled. We cannot contact Mikrotik support since we purchased our license through a 3rd party. This is becoming very frustrating for us, but even more so for our customers.
Re: RB1000 closing tens of pppoe connections at once
Posted: Sun May 16, 2010 8:34 pm
by rodolfo
I have the same problem using x86, partially resolved by disabling simple queues.
You can contact MikroTik support even if you purchased your license from a 3rd party.
Re: RB1000 closing tens of pppoe connections at once
Posted: Sun May 16, 2010 10:13 pm
by Muqatil
I am using RB1000, x86 and RB433AH as PPPoE servers with centralized RADIUS accounting, and I don't encounter similar problems. I use simple queues, so your issues might not be related to simple queues.
I had a similar problem a while ago, but I traced it to a flapping wireless link. Fixed the link, fixed the issue. Did you try looking for packet loss on the path to the clients?
Edit: My PPPoE servers ask for interim updates, might this help?
Re: RB1000 closing tens of pppoe connections at once
Posted: Mon May 17, 2010 2:28 am
by jkohan
Muqatil wrote:
"edit. My PPPoEs ask for interim updates, might this help?"
What is that? Where do you set it up?
Thanks
Javier
Re: RB1000 closing tens of pppoe connections at once
Posted: Mon May 17, 2010 9:31 am
by Muqatil
From Documentation:
interim-update - defines time interval between communications with the router. If this time will exceed, RADIUS server will assume that this connection is down. This value is suggested to be not less than 3 minutes
Re: RB1000 closing tens of pppoe connections at once
Posted: Mon May 17, 2010 12:46 pm
by rborz
sergejs, I already contacted support multiple times... the last hint was to upgrade to 5.0 beta. But I'm afraid of doing that, as the RouterBOARD is on a production network serving about 120 PPPoE clients at the moment.
After thinking about it a lot, my latest idea yesterday was: is everybody who has this issue perhaps using an external RADIUS server? If so, I think most will be using FreeRADIUS (as we do). The FreeRADIUS default configuration states this:
# max_requests: The maximum number of requests which the server keeps
# track of. This should be 256 multiplied by the number of clients.
# e.g. With 4 clients, this number should be 1024.
#
# If this number is too low, then when the server becomes busy,
# it will not respond to any new requests, until the 'cleanup_delay'
# time has passed, and it has removed the old requests.
#
# If this number is set too high, then the server will use a bit more
# memory for no real benefit.
#
# If you aren't sure what it should be set to, it's better to set it
# too high than too low. Setting it to 1000 per client is probably
# the highest it should be.
#
# Useful range of values: 256 to infinity
#
max_requests = 1024
Maybe, together with interim updates, this value is too low... maybe this has something to do with the issue. In my case, with about 120 PPPoE clients and approximately 4 SIP accounts per client, this may lead to 600 simultaneous requests (in the worst case). Then again, maybe this has nothing to do with the connection drops... just my two cents...
EDIT: Sometimes we have brute-force attacks with about 500 requests per second against our SIP gateways... and each register/login attempt also leads to a RADIUS request. That makes the above more plausible... So the main question is:
Does PPPoE server on MikroTik drop connections if there are timeouts on interim-updates...?
EDIT: OK, a few minutes ago all my PPPoE sessions were gone again... so this time I checked all the logs: no brute force or anything like that leading to a DoS on the RADIUS server. So this must be another issue...
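The arithmetic in this post can be written out as a small Python sketch. It is only a back-of-the-envelope model: the helper names are invented, "clients" in the FreeRADIUS comment is read here as RADIUS clients (NAS devices) rather than end users, the 600 figure is interpreted as 120 PPPoE logins plus 4 SIP accounts each, and the 5-second `cleanup_delay` is an assumption to check against your own radiusd.conf.

```python
def max_requests_guideline(nas_clients, per_client=256):
    """Guideline from the config comment: 256 requests per RADIUS client."""
    return nas_clients * per_client


def worst_case_outstanding(pppoe_clients, sip_accounts_per_client):
    """One PPPoE login per client plus one request per SIP account."""
    return pppoe_clients * (1 + sip_accounts_per_client)


def tracked_during_flood(requests_per_second, cleanup_delay_seconds):
    """Rough model: each request stays tracked for cleanup_delay seconds."""
    return requests_per_second * cleanup_delay_seconds


CONFIGURED_MAX_REQUESTS = 1024

print(max_requests_guideline(4))        # 1024, the example in the comment
print(worst_case_outstanding(120, 4))   # 600, below the configured 1024
print(tracked_during_flood(500, 5))     # 2500, above the configured 1024
```

Under these assumptions normal load stays under `max_requests`, while a 500 requests/second brute-force burst could exceed it, which fits the reasoning in the first EDIT. It does not explain drops that occur without such a burst, as the second EDIT reports.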
Re: RB1000 closing tens of pppoe connections at once
Posted: Mon May 17, 2010 7:23 pm
by jkohan
rborz wrote:
"So the main question is:
Does PPPoE server on MikroTik drop connections if there are timeouts on interim-updates...?"
I have to mention: We do NOT use interim-updates and suffer the same problem.
Re: RB1000 closing tens of pppoe connections at once
Posted: Fri May 21, 2010 1:54 am
by cuz2000m
I have the same problem with 2 RB1000 units. However, none of my PCQ options are in use as I have disabled all the Queues. I have 600+ angry customers and would really like a fix. Has anyone found anything that actually works, or have any information from Mikrotik about what the problem could be? The only thing that I have noticed thus far is that in the PPPoE Servers tab, some of the interfaces display "unknown" when the sessions are dropped.
Re: RB1000 closing tens of pppoe connections at once
Posted: Fri May 21, 2010 12:06 pm
by cuz2000m
Update to what I mentioned earlier. I believe this problem is associated with the 4.x firmware. I had noticed a similar problem on a previous RB1000 (it has firmware 4.6), which has now been redeployed in another area of our network. I had pulled all the configs and uploaded them to a new RB1000 with 3.x firmware and had no PPPoE connection drops during that time. However, I had fairly high CPU usage, which prompted me to upgrade to the latest firmware, 4.9. A couple of days afterwards I started to experience the same problem. I have reverted to firmware 3.30 and will test it for a few days. I will post the results of my little test.
Corey.
Re: RB1000 closing tens of pppoe connections at once
Posted: Tue May 25, 2010 9:01 am
by rborz
We have had this problem since the day we bought our first RouterBOARD from MikroTik (RB1000) 1.5 years ago. If I remember correctly, the board had RouterOS v3.22 installed. Since that day we have been waiting for each new firmware update, hoping they fixed it.
cuz2000m: Support told me to upgrade to v5.x beta, as they may have changed something in the PPPoE code. But as my RB1000 is about 450 kilometers away, I can't do the upgrade and just hope nothing bad happens. Maybe you can check whether v5.x fixes it, as you're next to the boards? We'll have to buy another solution if MikroTik doesn't get this fixed soon!
Re: RB1000 closing tens of pppoe connections at once
Posted: Tue May 25, 2010 10:26 pm
by 820
We have 2 PowerRouter 732s and an RB1000 that keep "spontaneously rebooting". The PowerRouters keep crashing. We have had to disable 1 of the CPUs, disable "multi threading" and try v5.x beta and v4.9 on each unit, without success: we continue to crash every 1-2 days, and this has been going on for several weeks now. The RB1000 is stable for us, though.
MikroTik support have been given all the support logs and we are urgently waiting for a quick reply. MikroTik have a great product (we are very happy) but I've been reading many threads about this, and it is a big issue that needs fixing.
Re: RB1000 closing tens of pppoe connections at once
Posted: Mon May 31, 2010 4:24 pm
by cuz2000m
Hi All,
As promised, I just wanted to update you on the progress so far and that is to say that I have not yet experienced the same problem since I downgraded to 3.30.
I'm sorry rborz but I can't upgrade to the 5.x beta at this time. It would be really irresponsible for me to do this and needless to say, I have already lost a few customers because of these outages.
Maybe someone else who has had the problem with firmware version 4.x can upgrade, or downgrade to version 3.30 like I did, and confirm whether there is a problem with the 4.x line of firmware.
Corey.
Re: RB1000 closing tens of pppoe connections at once
Posted: Wed Jun 09, 2010 4:12 pm
by user47
Hi all,
I am having very similar problems: I run an RB1000 with 500+ PPPoE sessions, and all the sessions drop at once. When this happens, all sessions immediately start to reconnect without problems.
A little background on my setup: I use a RADIUS server for the AAA functions, with the RB1000 as the PPPoE server. When a client connects, a simple queue is created to limit bandwidth on a per-session basis.
cuz2000m, I have also downgraded to 3.30 with no change in the problem. I have gone through versions 3.3, 4.5 and 4.6 recently, and some others along the way. All give the same problem. I cannot find any pattern in when this happens.
I have also replaced the RB1000 with a new one (along with all other common hardware between the RB1000 and the clients) and the problem persists. I have 4 other RouterBOARDs using the same RADIUS server with no problems; however, they are running a maximum of 150 clients. Can this be load related?
I'm willing to try almost any other idea at this point; I have lost a few dozen clients already....
Re: RB1000 closing tens of pppoe connections at once
Posted: Thu Jun 10, 2010 3:17 pm
by cuz2000m
Hi All,
Another update; I was busy for the past few days. The problem did subside with v3.30, but only for a week, after which it started all over again. user47, I have been wondering the same thing. My setup is very similar to yours in that an external RADIUS server does our authentication, but there are no queues set up on that MikroTik. I also have a few customers (around 50) on another MikroTik running 4.6 without these drops. So I was also wondering if it was a load problem. According to the specs, the RB1000 is supposed to handle around 5000 PPPoE sessions. CPU load is between 30 and 50% and there is not much memory load at all.
I have updated to version 5.0 beta 2, which in this case seems to correct itself somewhat, but I can't allow the disconnections to continue in this form. It makes us look very unprofessional and unreliable.
Does anyone have any ideas?
Re: RB1000 closing tens of pppoe connections at once
Posted: Thu Jun 10, 2010 11:41 pm
by omidh
Hi,
I have the same problem.
When clients try to connect via PPPoE they get error 691, and the MikroTik log says "peer is not responding". After two or more retries, the clients can connect.
My OS version is 4.9.
Re: RB1000 closing tens of pppoe connections at once
Posted: Tue Jul 13, 2010 12:49 pm
by Nando_lavras
Same problem here.... after some time with all clients connected, all connections are terminated... after 2~3 seconds all clients reconnect normally..... I have sent the supout.rif to MikroTik and they found no sign of a software crash... but the clients continue to disconnect.
Re: RB1000 closing tens of pppoe connections at once
Posted: Wed Aug 18, 2010 5:56 am
by tlcscousin
We are seeing the same issues, except on two RB450G routers and with a lot fewer clients. With 70 clients on the MikroTik, 69 will drop their session and re-establish it within seconds. We have tried all the suggestions given here, apart from going to the beta firmware.
We have
wifi link-->switch-->RB450G-->Engenius 2611AP-->customers
When the sessions drop, all customers are still in the AP. They all authenticate to a central RADIUS server.
We have 15 MikroTiks in use. None of the others has quite the client load of these two, and none of them ever drops all customers at once; the two that drop are both running 4.10 firmware. We added a 5.8 link to one of the MikroTiks (all are 2.4), so one has both 2.4 and 5.8 customers, and all of them drop the same way, which to me points to the MikroTik as the common point of failure. We have replaced pretty much everything on the tower: the AP has been 3 different units, and we even went with a Ubiquiti radio, but that was the original radio and had a few issues before we converted to PPPoE and MikroTik (errors on RX and TX, which we attributed to too many customers). We are going to try downgrading to 4.5 but, reading this thread, we don't hold much hope of it being the cure.
Re: RB1000 closing tens of pppoe connections at once
Posted: Thu Aug 19, 2010 9:53 am
by lavv17
Re: RB1000 closing tens of pppoe connections at once
Posted: Sat Aug 21, 2010 11:40 pm
by tlcscousin
It almost appears that bandwidth usage is the root cause of our disconnects. We set every customer down to 384/128 and so far they have stayed connected 4 hours longer than they used to (we will see if any hit the 1 day mark).
OK, definitely not working: it dropped everybody at about 20 hours.
Re: RB1000 closing tens of pppoe connections at once
Posted: Tue Aug 24, 2010 3:09 pm
by cuz2000m
Hi,
Has anyone been able to check their logs to see if they recognize any similarities before the "crash" of the PPPoE servers. In mine, I notice that my VLAN interfaces all switch to the UP state. I don't see anything going to the DOWN state prior to this though. So I have no idea why the state changed to UP. Is anyone else logging to a syslog server that can confirm this?
Corey.
Re: RB1000 closing tens of pppoe connections at once
Posted: Fri Sep 03, 2010 1:09 am
by roneyeduardo
Hi all.
I'd like to ask everyone who reported this problem whether any solution has been found so far.
Re: RB1000 closing tens of pppoe connections at once
Posted: Tue Sep 07, 2010 4:18 pm
by asterisco
cuz2000m wrote:
"Hi,
Has anyone been able to check their logs to see if they recognize any similarities before the "crash" of the PPPoE servers. In mine, I notice that my VLAN interfaces all switch to the UP state. I don't see anything going to the DOWN state prior to this though. So I have no idea why the state changed to UP. Is anyone else logging to a syslog server that can confirm this?
Corey."
Hi,
I'm having the same issue with an RB1100 as PPPoE concentrator and a RocketM5 directly connected to a port of the RB1100, which terminates the wireless connections made to that AP.
Just *BEFORE* the bulk PPPoE disconnections I see this in the RocketM5 logs:
[1404166.824000] AG7240: unit 0: phy 4 not up carrier 1
[1404166.825000] br0: port 1(eth0_real) entering disabled state
[1404168.635000] AG7240: enet unit:0 phy:4 is up...RGMii 100Mbps full duplex
[1404168.636000] AG7240: done cfg2 0x7135 ifctl 0x10000 miictrl
[1404168.636000] br0: port 1(eth0_real) entering learning state
[1404169.636000] br0: topology change detected, propagating
[1404169.636000] br0: port 1(eth0_real) entering forwarding state
This happens from time to time; not very often... Now the question is: which device is going down/up? The MikroTik? The Ubiquiti?
I have asked in the Ubiquiti forums too:
http://www.ubnt.com/forum/showthread.php?t=23032
The simplest workaround I can think of is to put a switch between the PPPoE server and the AP, so that both ends always see an Ethernet link up independently of the other end.
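To put a number on the flap in the RocketM5 excerpt above, the kernel timestamps can be subtracted. The Python sketch below does this for the pasted lines only; `outage_seconds` is a hypothetical helper, and it assumes the bracketed values are seconds since boot and that "entering disabled state" and "entering forwarding state" mark the start and end of the interruption.

```python
import re

LOG = """\
[1404166.824000] AG7240: unit 0: phy 4 not up carrier 1
[1404166.825000] br0: port 1(eth0_real) entering disabled state
[1404168.635000] AG7240: enet unit:0 phy:4 is up...RGMii 100Mbps full duplex
[1404168.636000] AG7240: done cfg2 0x7135 ifctl 0x10000 miictrl
[1404168.636000] br0: port 1(eth0_real) entering learning state
[1404169.636000] br0: topology change detected, propagating
[1404169.636000] br0: port 1(eth0_real) entering forwarding state
"""

# "[seconds.micros] message" as printed by the kernel log.
LINE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def outage_seconds(log_text):
    """Seconds between the bridge port going disabled and forwarding again."""
    down = None
    for raw in log_text.splitlines():
        match = LINE.match(raw.strip())
        if not match:
            continue
        stamp, message = float(match.group(1)), match.group(2)
        if "entering disabled state" in message and down is None:
            down = stamp
        elif "entering forwarding state" in message and down is not None:
            return stamp - down
    return None  # no complete down/up pair found


print(round(outage_seconds(LOG), 3))
```

For this excerpt the bridge port is out of the forwarding state for about 2.8 seconds, of which roughly one second is the bridge's learning phase after the PHY comes back up.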
Regards,
Antonio
Re: RB1000 closing tens of pppoe connections at once
Posted: Tue Oct 05, 2010 1:19 pm
by rborz
Re: RB1000 closing tens of pppoe connections at once
Posted: Tue Oct 05, 2010 1:36 pm
by normis
since a long time already. RB1100
Re: RB1000 closing tens of pppoe connections at once
Posted: Sat Oct 09, 2010 12:07 am
by formico
I have noticed that when the CPU reaches 100% usage and stays there for more than a couple of minutes, all the connections go down. I hope this is helpful for someone. I am now trying to install RouterOS on an HP DL380 server (2.8 GHz quad core, dual CPU, 4 GB RAM, SAS HD) under ESX, since RouterOS doesn't support SAS HDs. It seems to work fine, but I am not sure that hardware performance is the problem.
I'll keep you all updated on the results of the trial.
Re: RB1000 closing tens of pppoe connections at once
Posted: Wed Oct 13, 2010 10:07 pm
by fatty
Same problem. Replacing the RB1000 with an x86 machine solved it.
Re: RB1000 closing tens of pppoe connections at once
Posted: Thu Oct 14, 2010 9:39 am
by normis
Did you all submit tickets to support about this issue with RB1000 as suggested above? Please tell me the ticket numbers and I will check the status.
Re: RB1000 closing tens of pppoe connections at once
Posted: Mon Oct 18, 2010 11:30 am
by DSP
Same problem. I first noticed the problem at about 20 sessions and it has persisted until now, at 215 sessions. The problem does not affect PPTP sessions connected through the WAN port. What are the "ticket numbers"?
Re: RB1000 closing tens of pppoe connections at once
Posted: Tue Oct 19, 2010 9:32 am
by normis
DSP wrote:
"Same problem. I notice the problem about 20 sessions and it persist until now, 215 sessions. Problem does not include pptp session connected thru WAN port. What is the "ticket numbers" ?"
Email support. When they answer, you will see a ticket number in the subject of the email, like 2010101966000161.
Re: RB1000 closing tens of pppoe connections at once
Posted: Thu Oct 27, 2011 7:30 pm
by hajde
Same problem here, but with x86. I tried changing hardware (3x HP servers: 1. ML110 G5, 2. ML110 G6 and 3. DL380-G7) and the problem still persists.
I contacted support with a supout file a long time ago, but they said everything is OK.
Edit: Same problem on every ROS version I tried.
Re: RB1000 closing tens of pppoe connections at once
Posted: Fri Mar 23, 2012 5:22 pm
by riyadiari
Is there any update on this problem (3 years already)?
Any "new" solution?
I have this same problem with RB750, many x86 machines and RB1100AH, with ROS v3.2, v3.3, 4.1 and 5.1.
It always disconnects 40-100 PPPoE connections at a time, 3-5 times daily.
My "temporary" solution, which gave the most stable PPPoE connections, was using ROS v<2.9 on x86 as PPPoE server only; so far the problem has never happened again. But really, 2.9
Re: RB1000 closing tens of pppoe connections at once
Posted: Mon Mar 04, 2013 11:36 am
by ferdinandbabst
Is anyone still having this issue?
We had a similar fault and I found that the issue was actually caused by spanning tree on the switches. We have an RB1000 connected to a set of switches running spanning tree. The customer edge ports on the switches were not configured as edge ports, so when a single user disconnected PPPoE, the switches did a complete spanning tree re-election, and other connected users' sessions then dropped because of the re-election process between the switches. Not at all a MikroTik fault.
Re: RB1000 closing tens of pppoe connections at once
Posted: Thu Jan 29, 2015 3:35 pm
by Redrik
I have a very similar issue. My setup is:
A MikroTik Cloud Core Router runs a PPPoE server, which has only around 70 clients so far. I've set up another MikroTik (RB1100) as a RADIUS server to control user access and bandwidth and to collect statistics. The interface with the PPPoE server is connected to an unmanaged switch with 2 DSLAMs. Both MikroTiks run version 6.24.
Multiple PPPoE sessions drop simultaneously and immediately re-initiate, all the time. The number of dropped sessions and their uptime vary with no obvious pattern at all. The customers complain that they can't access the Internet until they reboot their modems. But not all of them: some can still use the Internet even after the session is dropped and re-created. And again, there is no pattern here either. The log shows that termination is initiated by the client.
I increased the keepalive timeout from 10 to 120 seconds. Drops became less frequent but still happened. Setting the keepalive timeout to 0 didn't make much difference.
I managed to get hold of a couple of customers before they rebooted their modems. The dynamic PPPoE interface created by the server shows as up and running, but the customer can't get on the Internet. When I delete that interface manually, it is re-created within 4-5 seconds and the customer confirms the Internet is OK after that, without rebooting the modem.
Please!!! Can anyone help me with this issue?
Re: RB1000 closing tens of pppoe connections at once
Posted: Thu Aug 20, 2015 8:45 pm
by chrisw
It's rather discouraging to see how old this thread is when I'm having the same problem here, five years later, on a CCR1036-12G-4S running RouterOS v6.31.
2,000 PPPoE clients. Sometimes if a node drops during maintenance, we expect to lose about 200 sessions. Those sessions do drop, and then the REST OF THE ENTIRE NETWORK drops along with them. Only once PPPoE has finished dropping ALL its connections does it start reauthenticating people. CPU does not appear to be taxed at all when the sessions drop, and it drops sessions very slowly (2-3 per second). When this happens, it's simply faster to yank the power cord and have it boot back up, reacquire a full route table via BGP, and reauthenticate all 2,000 PPPoE users. Otherwise, it takes at least 10 minutes just to drop all the sessions.
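The "at least 10 minutes" figure follows directly from the numbers in this post. A minimal Python check, assuming a constant teardown rate (`drain_minutes` is just an illustrative helper name):

```python
def drain_minutes(sessions, sessions_per_second):
    """Minutes needed to tear down all sessions at a constant rate."""
    return sessions / sessions_per_second / 60


# 2,000 sessions at the observed 2-3 drops per second.
for rate in (2, 3):
    print(rate, "per second:", round(drain_minutes(2000, rate), 1), "minutes")
```

At 3 sessions per second the teardown takes about 11 minutes, and at 2 per second about 17, so a reboot that completes faster than that does come out ahead.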
Re: RB1000 closing tens of pppoe connections at once
Posted: Mon Aug 24, 2015 9:41 pm
by hashbang
Right, after so many years the problem is still there, but what is the cause? I'm experiencing the same thing. One of my networks runs PPPoE like this: MT.....switch.....nanobeam.........nanobeam....switch....subscribers. The number of subscribers is low, around 20. They all experience disconnection problems every now and then. I've tested the link; it gives 100 Mbps aggregate throughput. Still groping in the dark. My hardware is an RB2011 running version 6.18.
Re: RB1000 closing tens of pppoe connections at once
Posted: Wed Nov 18, 2015 2:46 am
by hugleo
chrisw,
"2,000 PPPoE clients. Sometimes if a node drops during maintenance, we expect lose about 200 sessions. Those sessions do drop, and then the REST OF THE ENTIRE NETWORK drops along with them. Once PPPoE is finished dropping ALL its connections, then it starts reauthenticating people."
We have exactly the same problem; we are using a CCR1036-12G-4S.
Can the MikroTik team do something about it?
Re: RB1000 closing tens of pppoe connections at once
Posted: Wed Jan 20, 2016 8:29 pm
by hugleo
And it continues happening here...
No solution so far.
MikroTik support, can you tell us if there is anything new about it?
Re: RB1000 closing tens of pppoe connections at once
Posted: Fri Feb 05, 2016 4:49 am
by genie
PPPoE clients get disconnected even though they are logged in. This happens when you use the PPP ---> Active connection tab to check the status of PPPoE users. This is an old known bug which MikroTik has yet to resolve. Don't use this tab; instead use the PPP ---> Interface tab to get active connection details. From there, select a user and click on it to get the uptime and IP address, or as a workaround add additional columns to the Interface tab to display uptime and IP address.
Well, this is one of the causes of the unexplained disconnections. Hope this helps.
Genie.
Re: RB1000 closing tens of pppoe connections at once
Posted: Fri Feb 05, 2016 11:48 am
by hugleo
It may be another bug.
The bug I am talking about is this: even if I use multiple PPPoE servers on different interfaces, when I just disconnect a cable, CPU load grows while the clients are disconnecting and reconnecting. Because MikroTik does not parallelize this work across CPUs, all the other PPPoE clients start disconnecting by timeout, since the router does not send PPPoE echo messages. The whole process lasts about 8 minutes in that state and then stabilizes again.
MikroTik says it will solve this problem in RouterOS v7. For now I will try changing the echo tolerance on all PPPoE clients to something like 3 minutes, to see if that can minimize the damage.
Re: RB1000 closing tens of pppoe connections at once
Posted: Tue Mar 22, 2016 4:02 am
by hugleo
Maybe will be solved in 6.35?
6.35rc Changelog:
*) ppp - fixed crash when ppp interface gets disconnected and user gets authenticated at the same time (most probable with slow RADIUS server);
*) ppp - fixed memory leak high number of pppoe clients to the same server;
*) ppp - do not crash when received multiple CBCP packets;
*) ppp - close connection if peer wants to re-authenticate;
Re: RB1000 closing tens of pppoe connections at once
Posted: Fri Aug 14, 2020 9:06 am
by flameproof
I hate reviving old threads from years past, but this one IMHO is worth keeping alive. We have the same issue with 1300 PPPoE sessions on a CCR1702. We are able to reliably reproduce this:
1. Drop a number of customers by:
a) Rebooting a downstream switch
OR
b) Rebooting a PtP AirFiber serving a downstream switch
OR
c) Pull one of the ports on the bridge serving PPPoE on the CCR
2. We will see traffic drop according to the segment lost.
3. When the disconnect completes, traffic resumes.
4. About 2 minutes after traffic resumes, ALL traffic stops at the CCR, and PPPoE sessions start dropping - sometimes it's all of them, sometimes only a portion.
clipboard-image-6.png
CCR remains accessible during these events, but no amount of CPU profiling has pointed to anything specific. Mikrotik support ended up shrugging and said "our hardware won't support your current configuration" without further details. The interface is a 10Gbps fiber, so this is not a "you're choking your 1G link".
I think this problem is embedded deeply in the core of the operating system, and thus has not been fixed during years of development, upgrades and fixes.
At this point, we are looking at alternative vendors, at a loss of thousands of dollars to Mikrotik (we are a credible ISP in Eastern Africa with some 15.000 customers... and plans for growth to 200.000 customers).
Re: RB1000 closing tens of pppoe connections at once
Posted: Fri Aug 14, 2020 11:01 am
by glueck05
Hello,
do not use a CCR for more than 1000 PPPoE customers per device. And under all circumstances disable connection tracking on the CCR. We use x86 devices, which can handle >= 4000 PPPoE customers, but also with connection tracking disabled.
We use these devices:
1) X86_64:
http://www.lannerinc.com/products/netwo ... s/nca-5510
2) 8 Port-Copper Port:
http://www.lannerinc.com/products/x86-n ... cm-igm801a
3) 4 Port SFP+:
https://www.landitec.com/products/x86-n ... 05a-detail
regards,
glueck
Re: RB1000 closing tens of pppoe connections at once
Posted: Fri Aug 14, 2020 1:18 pm
by rodolfo
@glueck
The traffic drops are caused by the CPU running at 100%, busy removing the connections of dropped pppoe users from the connection table.
This can go on for some minutes, during which the router may be unreachable.
We now have one CCR1036 with 4000 pppoe users (distributed across 200 pppoe servers).
We resolved it as follows:
1. remove all firewall and eBGP functions from the pppoe server
2. disable connection tracking on the pppoe server
We no longer use x86 as a pppoe server.
HIH
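For reference, step 2 above might look like the following on the RouterOS console. This is a minimal sketch, not the poster's actual configuration. Note that with connection tracking disabled, NAT and any firewall rules that match on connection state stop working, which fits with step 1 taking those functions off the pppoe server.

```
# Turn connection tracking off entirely on the pppoe server.
/ip firewall connection tracking set enabled=no
# Confirm the setting.
/ip firewall connection tracking print
```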
Re: RB1000 closing tens of pppoe connections at once
Posted: Fri Aug 14, 2020 1:48 pm
by flameproof
@glueck @rodolfo
Thanks for your input and suggestions - we are definitely contemplating the x86 metal + dedicated PPPoE stack as an option.
On the connection tracking disabled - how would you handle dynamic rate limiting without it? We use a simple queue for each CPE session, assigned based on RADIUS response (and the service level set on the customer DB). We also (in some cases) use mangle rules to direct traffic where we have more than one upstream link (e.g. two parallel 1Gbps fibers).
Re: RB1000 closing tens of pppoe connections at once
Posted: Fri Aug 14, 2020 4:53 pm
by rodolfo
For dynamic rate limiting, we use a simple queue for each user session, assigned based on RADIUS response.
For mangle, be sure to mangle in raw queues, although we prefer to delegate mangle/routing/BGP/firewall to a border RouterBoard separate from the pppoe server (also because it is useful to have at least two redundant pppoe servers).