IPSec Phase 1 fails on restart, multiple IPs
Posted: Wed Nov 04, 2015 4:22 am
by joelwhrs
I am having an issue where Phase 1 fails on two IPSec connections after a router restart. It shows up as a Phase 1 timeout error.
As soon as I disable all external IP addresses (there are 4, all in the same subnet) except for the IP being used by the IPSec connection, it works. I can re-enable these IP addresses and it continues working. I can even disable the IPSec connection, flush the remote peers, kill all the IPSec connections and it works as soon as I re-enable the IPSec connection. Phase 1 is configured with a Src. Address.
Any ideas?
Re: IPSec Phase 1 fails on restart, multiple IPs
Posted: Sat Nov 07, 2015 3:53 am
by joelwhrs
I tried adding some routes to the remote IP address with the preferred source set to the IP I want it to use. This made no difference, except that when I disabled my IP addresses, my IPSec connection no longer came up. I had to remove the routes, disable the IP addresses and then restart for it to work. Is this a configuration issue or a bug? It's quite inconvenient to have to go through this after every restart.
Re: IPSec Phase 1 fails on restart, multiple IPs
Posted: Sat Nov 07, 2015 11:20 am
by pe1chl
Maybe you forgot to allow UDP port 500 and/or protocol ESP/AH for input?
It will work ok when a router makes the outgoing connection and traffic keeps flowing, due to the ESTABLISHED rule, but when one side is rebooted the link may be dead.
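A minimal sketch of the input rules this refers to, assuming RouterOS v6 CLI syntax. The comments are illustrative, and the rules only have an effect if they sit above any drop rule in the input chain.

```
/ip firewall filter
# IKE (Phase 1) from the remote peer arrives on UDP port 500
add chain=input action=accept protocol=udp dst-port=500 comment="allow IKE"
# ESP and AH are separate IP protocols, not UDP ports
add chain=input action=accept protocol=ipsec-esp comment="allow ESP"
add chain=input action=accept protocol=ipsec-ah comment="allow AH"
```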
Re: IPSec Phase 1 fails on restart, multiple IPs
Posted: Mon Nov 09, 2015 3:02 pm
by joelwhrs
There is a rule for this. I suspected this as well, but it only reconnects when I disable all IPs except for the one that's used by the IPSec connection. If the firewall were the cause, it would also block the connection with those IPs disabled.
Re: IPSec Phase 1 fails on restart, multiple IPs
Posted: Thu Nov 12, 2015 2:56 pm
by joelwhrs
Should I assume this is a bug and file a bug report?
Re: IPSec Phase 1 fails on restart, multiple IPs
Posted: Tue Nov 17, 2015 7:40 pm
by royalpublishing
I am on 6.33 and have been dealing with this ongoing issue for over a year now, and I am getting extremely frustrated by it. The problem seems to pop up randomly once every few months, and this morning was one of those times. It's not a firewall issue here: all VPN routers have input rules to allow all traffic. Even after a reboot of all routers involved, they still won't connect, which just blows my mind. Disabling the non-tunnel-related IP addresses and IPsec rules, flushing the SAs, and re-enabling the rules didn't seem to work for me. However, I was not onsite, so I had to connect remotely via winbox and be extremely careful not to lock myself out. The outages during which these messages appear in the event logs seem to last around 30 minutes at a time, and then traffic starts passing again.
Re: IPSec Phase 1 fails on restart, multiple IPs
Posted: Tue Nov 17, 2015 9:20 pm
by joelwhrs
My issue is that the IPsec trunk doesn't connect at all. So far it has worked to disable all the IP addresses except for the one that IPsec uses. As soon as they are disabled it connects, and then I can re-enable everything and it stays up. I can even terminate the IPsec connection and, upon re-enabling, it reconnects immediately. It's extremely frustrating and definitely seems to be a bug.
Re: IPSec Phase 1 fails on restart, multiple IPs
Posted: Thu Nov 19, 2015 3:37 pm
by ALDISBEHMANIS
I have more or less the same problem, and it cannot be solved at the moment ... at least not by me.
The problem is in fact that MT tries to reach the gateway from the lowest IP number.
For example, if you have .3, .2, .1 on WAN and the ipsec is made from .2, then MT tries to push all traffic through the .1 address to the gateway. It NATs your .2 outgoing traffic to .1 and sends it out from .1 to the remote end of the ipsec.
How to solve this, I have no idea.
I tried everything I could come up with, including ipsec-peer-local ip = .....2, adding AS routes with preferred source etc. Nothing seems to help in a normal way.
To get it running, disable/enable the .1 IP on WAN (.2 becomes the preferred output IP). It then works until a reboot, when .1 becomes the preferred output IP again.
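The workaround above could be scripted roughly as follows. This is a sketch assuming RouterOS v6 syntax; 203.0.113.1/28 is a placeholder for the lowest WAN address and the 10 second delay is an arbitrary assumption, neither is a value from this thread.

```
# Placeholder address: substitute the lowest IP on the WAN interface
/ip address disable [find address="203.0.113.1/28"]
# Give Phase 1 a moment to come up from the intended source address
:delay 10s
/ip address enable [find address="203.0.113.1/28"]
```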
Re: IPSec Phase 1 fails on restart, multiple IPs
Posted: Thu Nov 19, 2015 3:57 pm
by joelwhrs
Sounds exactly like my issue. What seems really strange is the fact that there is an option to select the SA Source Address within Phase 1 of the IPSec rule. I would have imagined that this is the IP that it would use as the source address.
Re: IPSec Phase 1 fails on restart, multiple IPs
Posted: Thu Nov 19, 2015 4:34 pm
by ALDISBEHMANIS
It actually is the reply IP, but the router NATs your source IP to its default IP for reaching the gateway :/
I have no idea how to change this situation.
Using the "smallest" IP for ipsec is simply a bad idea, even if it would work.
Re: IPSec Phase 1 fails on restart, multiple IPs
Posted: Thu Nov 19, 2015 7:33 pm
by ALDISBEHMANIS
OK, I got the solution!
1) Probably all your IPs on WAN have an equal mask. That is wrong. All except one have to be /32 (assuming all of them have the same gateway IP).
2.0) Firewall - NAT: add a rule on top (before your masquerade): chain src-nat, dest-addr <your remote peer ip>, protocol 50, action=accept
2.1) Firewall - NAT: add a rule on top (before your masquerade): chain src-nat, dest-addr <your remote peer ip>, protocol 17, port 500, action=accept
3) Of course, don't forget to add one more accept rule before the masquerade:
source-addr local-subnet, remote-addr remote-subnet, action=accept <-- lets your packets enter the tunnel
4) Check that you have filter rules that accept the ipsec protocols and ports.
Don't forget to reboot both routers so that all wrong connections (from wrong IPs) get killed. Or kill them manually on both ends.
It took me 2 days to get it running.
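Steps 2.0 to 3 might look roughly like this on the RouterOS v6 command line. This is a sketch under assumptions: 198.51.100.10 stands in for the remote peer, 192.168.1.0/24 and 192.168.2.0/24 for the local and remote subnets, and none of these values come from the thread. The post does not say whether port 500 is the source or destination port, so the sketch uses the generic port matcher, which matches either.

```
/ip firewall nat
# 2.0) Do not NAT ESP (IP protocol 50) going to the remote peer
add chain=srcnat action=accept dst-address=198.51.100.10 protocol=ipsec-esp
# 2.1) Do not NAT IKE (UDP, protocol 17, port 500) going to the remote peer
add chain=srcnat action=accept dst-address=198.51.100.10 protocol=udp port=500
# 3) Do not NAT traffic between the tunnelled subnets
add chain=srcnat action=accept src-address=192.168.1.0/24 dst-address=192.168.2.0/24
```

New rules are appended to the end of the list, so they still need to be moved above the masquerade rule (for example by dragging them in WinBox) before they take effect.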
Re: IPSec Phase 1 fails on restart, multiple IPs
Posted: Sat Nov 21, 2015 8:41 pm
by joelwhrs
Perfect! I had a value for network set on the Address list as well. I had to remove that when I took the /28 subnet off or it wouldn't communicate to my gateway.
Thanks!!
Re: IPSec Phase 1 fails on restart, multiple IPs
Posted: Fri Dec 11, 2015 4:36 pm
by royalpublishing
Crap, I still seem to have this problem even though all of my additional static NAT IP addresses are already using a /32 and the only one that actually uses the correct subnet mask is the WAN IP of that interface. Also, just to clear things up, on your 2.1) statement, you didn't specify whether the port 500 on the NAT rule was src or dst port.
Re: IPSec Phase 1 fails on restart, multiple IPs
Posted: Fri Dec 11, 2015 4:46 pm
by joelwhrs
I set mine up for any port. Also check your addresses and make sure that the address entered in the "Network" field is the same as the address entered in the "Address" field, minus the subnet mask. This was causing most of my problems.
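As an illustration of that address layout, here is a sketch assuming RouterOS v6 syntax. The 203.0.113.x addresses and the interface name ether1-wan are placeholders, and which address keeps the real mask is an assumption (here, the one IPSec uses).

```
/ip address
# The one address that keeps the real /28 mask
add address=203.0.113.2/28 network=203.0.113.0 interface=ether1-wan
# Additional addresses as /32: the network equals the address itself
add address=203.0.113.3/32 network=203.0.113.3 interface=ether1-wan
add address=203.0.113.4/32 network=203.0.113.4 interface=ether1-wan
```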
Re: IPSec Phase 1 fails on restart, multiple IPs
Posted: Fri Dec 11, 2015 6:18 pm
by royalpublishing
Just out of curiosity, are you using Dead Peer Detection on your IPSec Peers? I have it disabled on my end, and I'm wondering whether that has anything to do with my issue whenever there is a dropout on the network.
Re: IPSec Phase 1 fails on restart, multiple IPs
Posted: Sat Dec 12, 2015 6:51 pm
by joelwhrs
Dead peer detection is disabled on mine.
What exactly is happening with your connection?
Re: IPSec Phase 1 fails on restart, multiple IPs
Posted: Mon Dec 14, 2015 5:41 pm
by royalpublishing
Every once in a while, I'll get errors like this in the log and the VPN doesn't seem to want to re-establish the connection for long periods of time.
phase1 negotiation failed due to send error. 11.22.33.44[500]<=>44.33.22.11 053e1ceacf95ca3b:3c9b14518f30b19c
I tried adding these additional NAT rules at the top of my list, will have to wait and see if the problem comes back.
Re: IPSec Phase 1 fails on restart, multiple IPs
Posted: Thu Mar 24, 2016 4:08 pm
by royalpublishing
I am still having this sporadic problem after adding all of the aforementioned NAT rules etc. As I mentioned before, whenever this happens, all VPN traffic stops flowing between the two sites. I'm not sure how to troubleshoot this issue any further. Does anybody have any suggestions for me to try?
Re: IPSec Phase 1 fails on restart, multiple IPs
Posted: Fri Mar 31, 2017 3:32 am
by mattstephenson
I had this on previous versions too, and I still have it on 6.38.5, at multiple different sites with RB3011.
It is usually evident at router startup, but it does seem to happen sporadically as well, perhaps when there is a drop in the connection at either end.
I have left it for hours and it still just fills up the log with errors.
By emptying the 'remote peers', the connections instantly rebuild and change to 'established'.
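From the command line, emptying the remote peers might look like the sketch below. It assumes a RouterOS v6 release of roughly this era; later releases appear to rename the menu to active-peers, so check which one your version has.

```
# Drop all current Phase 1 sessions so that they renegotiate
/ip ipsec remote-peers kill-connections
# Optionally clear the installed SAs as well
/ip ipsec installed-sa flush
```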
Re: IPSec Phase 1 fails on restart, multiple IPs
Posted: Sat Aug 24, 2019 7:06 pm
by sjoram
Hi all,
I just came across this after a software upgrade to ROS, so it must be a change in behaviour between versions.
I had a srcnat rule at the top of my NAT rules
chain=srcnat
src-address=10.0.0.0/8
dst-address=10.0.0.0/8
action=accept
It would appear this was masquerading the lowest IP on the WAN interface. Like others, I disabled the other IPs and it worked.
I changed the rule to
chain=srcnat
src-address=10.0.0.0/8
dst-address=10.0.0.0/8
out-interface=<PPPoEClient>
action=src-nat
to-addresses=<correct public IP>
Then re-enabled the other public IPs, killed the active peers and session re-established OK. This appears to be causing some side effects on the LAN side, which I'm investigating.
Re: IPSec Phase 1 fails on restart, multiple IPs
Posted: Sat Aug 24, 2019 7:46 pm
by sjoram
Update: Had to revert my change as it re-introduced a problem of traffic only being initiated one way across the IPSec tunnel.
In my setup, the IPs on my /29 subnet are only used in filter/NAT rules, so I was able to move them to another disused interface to take them off the WAN. With this in place, my original srcnat/accept rule allows two-way traffic and all input/NAT traffic for the /29 still seems to work.
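Moving an address off the WAN could be done along these lines. This is only a sketch, assuming RouterOS v6 syntax; 203.0.113.9/29 and ether5-unused are hypothetical placeholders, not values from this setup.

```
# Reassign one public address from the WAN to a disused interface
/ip address set [find address="203.0.113.9/29"] interface=ether5-unused
```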
This may not work for everyone, of course.