We're seeing a pretty weird problem with NAT on our Linux boxes that sit behind CCRs.
After we started using CCRs as BRASes, we started seeing short, sporadic traffic drops on our NAT boxes for no apparent reason.
One CCR runs RouterOS 6.33.3, the other 6.33.5.
Here's our scheme:
- A bunch of BRASes that do PPPoE + shaping: several Linux boxes with accel-ppp and 2 CCR1072s.
- Some PPPoE users need to be NATed, depending on their IP addresses, so every BRAS has an IP firewall rule that routes traffic to a different gateway based on the user's source IP.
- 2 Linux boxes do NAT with iptables.
- If we NAT only the traffic that comes from the Linux BRAS boxes, everything works fine.
- If we divert traffic from a CCR to a NAT box, we start seeing traffic drops.
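
To make the routing decision on the BRASes concrete, here is a rough Python sketch of the logic only. It is not our actual config: the prefix and both gateway addresses below are made-up placeholders, and the real thing is a firewall/policy-routing rule on each BRAS rather than a script.

```python
import ipaddress

# Hypothetical values: the real prefixes and gateways are not listed in this post.
NAT_SOURCES = [ipaddress.ip_network("100.64.0.0/10")]  # users that must be NATed
NAT_GATEWAY = "10.0.0.2"      # assumed address of a NAT box
DEFAULT_GATEWAY = "10.0.0.1"  # assumed address of the regular gateway


def next_hop(src_ip: str) -> str:
    """Pick the gateway for a packet based only on its source address."""
    addr = ipaddress.ip_address(src_ip)
    if any(addr in net for net in NAT_SOURCES):
        return NAT_GATEWAY
    return DEFAULT_GATEWAY


print(next_hop("100.64.12.34"))  # source inside the NATed range
print(next_hop("203.0.113.7"))   # source outside the NATed range
```

The point is that the decision depends only on the source IP, and the same kind of rule is applied on the Linux BRASes and on the CCRs.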
It's not a matter of traffic load. Neither the traffic volume nor the PPS is anywhere near a critical level, and it makes no difference how much traffic we divert. The only thing that matters is whether the diverted traffic comes from a CCR.
And what's even more interesting: the NAT box drops ALL traffic, including the traffic that comes from the Linux boxes (which we know works fine on its own).
At first I thought it was a routing issue, so we took one NAT box and diverted only a single CCR's traffic to it. We disabled OSPF completely on that NAT box and added a static route. The issue is still there.
What on earth could that be?
Any ideas?