BGP session between pfw clusters flapping
Closed, Invalid · Public

Description

830+ flaps so far. This causes cross-DC traffic (like rsync) to fail.

That BGP session goes over an ipsec tunnel.
The following configuration has been added on both sides. It's most likely not the cause of the flaps, but it should be there nonetheless.

  • Allow IKE on lo0 (where IKE terminates)
  • Specify host-inbound-traffic per interface (I've seen cases where the "global to the security zone" config was not applied properly)
  • Enable IPsec tunnel monitoring (to detect a failed tunnel faster)
[edit security zones security-zone untrust interfaces lo0.0 host-inbound-traffic system-services]
        dns { ... }
+       ike;
[edit security zones security-zone vpn-codfw interfaces st0.0]
+      host-inbound-traffic {
+          system-services {
+              ping;
+              traceroute;
+          }
+          protocols {
+              pim;
+              igmp;
+              bgp;
+          }
+      }
[edit security ipsec vpn vpn-x-ipsec-vpn]
+     vpn-monitor {
+         optimized;
+     }
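
For reference, the diff above can also be expressed as flat "set" commands. This is only a sketch derived line by line from that diff. The zone, interface and VPN names are the ones shown there, and nothing else about the configuration is assumed:

set security zones security-zone untrust interfaces lo0.0 host-inbound-traffic system-services ike
set security zones security-zone vpn-codfw interfaces st0.0 host-inbound-traffic system-services ping
set security zones security-zone vpn-codfw interfaces st0.0 host-inbound-traffic system-services traceroute
set security zones security-zone vpn-codfw interfaces st0.0 host-inbound-traffic protocols pim
set security zones security-zone vpn-codfw interfaces st0.0 host-inbound-traffic protocols igmp
set security zones security-zone vpn-codfw interfaces st0.0 host-inbound-traffic protocols bgp
set security ipsec vpn vpn-x-ipsec-vpn vpn-monitor optimized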

I also added extra debug logs (starting with ike) on both sides:

# show security ike traceoptions 
file ike-debug.log size 5m files 3;
flag all;

As well as

[edit security ipsec]
+    traceoptions {
+        flag all;
+    }
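
With tracing enabled, a minimal set of operational-mode checks might look like the following. These are standard Junos commands, but which ones were actually run for this task is not recorded here, and the 50-line tail is an arbitrary choice. The Flaps column of "show bgp summary" gives the per-peer flap count, and the two security-associations commands show whether the IKE and IPsec SAs are up when the session drops:

> show bgp summary
> show security ike security-associations
> show security ipsec security-associations
> show log ike-debug.log | last 50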

Event Timeline

Restricted Application added a subscriber: Aklapper.
Qse24h closed this task as a duplicate of T164723: New git repository: <repo name>.

Nothing explicit in the logs. I've opened case 2017-0511-0002 with JTAC.

The pfw is going to be replaced soon, so this is not worth investigating further unless it causes visible issues.