After discussion with the Traffic team, this task is to track the testing and, if successful/valuable, production deployment of a system to offload ICMP pings to a dedicated host.
The large volume of ICMP echo requests sent to our main IPs, usually by people and machines testing their connectivity to the Internet, has been causing issues. For example, it hits rate-limiter thresholds (set so that our servers are not overwhelmed), which causes monitoring ICMP requests to be dropped.
1st part, to deploy a test instance in eqiad
- Get a VM in a private vlan (ping1001.eqiad.wmnet)
- Reserve a test public IP in the LVS range in DNS (208.80.154.225)
- Assign the IP to the VM's loopback interface
- Add a firewall rule on cr1/2-eqiad to redirect icmp requests (before term default)
set firewall family inet filter border-in4 term offload-ping4 from protocol icmp
set firewall family inet filter border-in4 term offload-ping4 from icmp-type echo-request
set firewall family inet filter border-in4 term offload-ping4 from destination-address 208.80.154.225
set firewall family inet filter border-in4 term offload-ping4 then next-ip 10.64.32.31
- From there, pings sent to the test IP should be answered by the VM. (Confirmed)
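To make the intent of the firewall term easier to reason about, here is a minimal Python sketch of its matching logic. It is only a model, not router code: the `Packet` class and the `offload_next_ip` function are hypothetical names introduced for illustration, and the addresses are the ones from the term above. A packet is redirected only if it satisfies every `from` condition; anything else falls through to the following terms.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address
from typing import Optional

# Values copied from the offload-ping4 term above.
OFFLOAD_TARGETS = {IPv4Address("208.80.154.225")}
NEXT_IP = IPv4Address("10.64.32.31")
ICMP_ECHO_REQUEST = 8  # ICMP type number for echo-request


@dataclass(frozen=True)
class Packet:
    """Hypothetical, simplified view of an inbound IPv4 packet."""
    protocol: str
    destination: IPv4Address
    icmp_type: Optional[int] = None


def offload_next_ip(packet: Packet) -> Optional[IPv4Address]:
    """Return the next-ip if the packet matches all 'from' conditions, else None."""
    if packet.protocol != "icmp":
        return None
    if packet.icmp_type != ICMP_ECHO_REQUEST:
        return None
    if packet.destination not in OFFLOAD_TARGETS:
        return None
    return NEXT_IP


if __name__ == "__main__":
    ping = Packet("icmp", IPv4Address("208.80.154.225"), ICMP_ECHO_REQUEST)
    web = Packet("tcp", IPv4Address("208.80.154.225"))
    print(offload_next_ip(ping))  # redirected to the ping VM
    print(offload_next_ip(web))   # not matched, handled by later terms
```

Under this model, only echo requests to the listed targets are diverted; all other traffic to the same IP (including other ICMP types) keeps following the normal path.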
Monitoring
Internally, pings to an LVS VIP should be answered by a host behind the LVS.
Externally, they should be answered by the VM.
- Add VM to standard monitoring (Icinga, Prometheus, etc)
- Ensure external monitoring does ICMP checks for the LVS VIPs (and not balanced hostname)
- Ensure availability of the service hosted on the LVS VIP is externally monitored by a check different than ICMP
The previous 2 points are to prevent people (and availability stats) from concluding that the actual service (eg. wikipedia.org) is down when only the ICMP server is.
- Write documentation (eg. how to disable redirect) - https://wikitech.wikimedia.org/wiki/Ping_offload
- Optional: Write an ICMP dashboard in Grafana - https://grafana.wikimedia.org/dashboard/db/ping-offload
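The reasoning behind the two external checks in the list above (an ICMP check on the VIP plus a non-ICMP check of the service) can be sketched as a small decision function. This is an illustrative assumption about how the two results would be interpreted, not the actual Icinga or external-monitoring configuration; the function name `diagnose` and the returned labels are hypothetical.

```python
def diagnose(icmp_ok: bool, service_ok: bool) -> str:
    """Combine an external ICMP check and an external service check.

    icmp_ok:    external ping to the LVS VIP was answered (by the ping VM).
    service_ok: external non-ICMP check (e.g. HTTPS) of the service succeeded.
    """
    if service_ok and icmp_ok:
        return "healthy"
    if service_ok:
        # Only the ICMP responder is failing; the real service is fine.
        return "ping offload down, service up"
    if icmp_ok:
        # The ping VM still answers, so ICMP alone would hide this outage.
        return "service down, ping offload still answering"
    return "service and ping both failing"


if __name__ == "__main__":
    for icmp_ok in (True, False):
        for service_ok in (True, False):
            print(icmp_ok, service_ok, "->", diagnose(icmp_ok, service_ok))
```

The third case is the reason the service needs its own non-ICMP check: once pings are offloaded, a successful ping no longer says anything about the hosts behind the LVS.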
2nd part, catch real ICMP traffic in eqiad
- Write puppet scaffolding - https://gerrit.wikimedia.org/r/#/c/424151/
- Assign 208.80.154.224 (text-lb.eqiad.wikimedia.org) to the VM's loopback interface
- Update the cr1/2-eqiad firewall rule
- Verify monitoring is happy
- Decommission the test VIP
3rd part, duplicate in codfw
- Get a VM in a private vlan (ping2001.codfw.wmnet)
- Add VM to standard monitoring (Icinga, Prometheus, etc)
- Ensure external monitoring does ICMP checks for the LVS VIPs (and not balanced hostname)
- Ensure availability of the service hosted on the LVS VIP is externally monitored by a check different than ICMP
- Assign 208.80.153.224 (text-lb.codfw.wikimedia.org) to the VM's loopback interface
- Update the cr1/2-codfw and cr1-eqdfw firewall rules
- Verify monitoring is happy
4th part, deploy to POPs
- Either order dedicated hardware or wait for a VM solution to be available at each site.
- Duplicate to puppet
Redundancy
If required, redundancy can be implemented with two hosts per site sharing a VIP using VRRP or BGP (preferred), either on day 1 or in a later iteration.
Caveats
- Results could be considered "lying", as pings to a host would be answered by a different host (which might confuse troubleshooting)
- The list of ping targets to "catch" needs to be maintained in 2 more tools (puppet + network automation)
- This can be alleviated with the kernel's AnyIP feature (eg. lo listens on the whole /27 VIP range)
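As a rough illustration of the AnyIP idea, the sketch below checks which addresses a single /27 would cover. The prefix `208.80.154.224/27` is an assumption: it is the /27 that contains both eqiad addresses mentioned in this task, but the actual VIP range should be confirmed. The function name `caught_by_anyip` is hypothetical. On Linux, AnyIP is usually enabled by adding a local route for the prefix on the loopback interface, as noted in the comment.

```python
from ipaddress import IPv4Address, IPv4Network

# Assumption: the eqiad LVS VIP range is the /27 containing
# 208.80.154.224 and 208.80.154.225 (both mentioned in this task).
# On the host, AnyIP would typically be enabled with something like:
#   ip route add local 208.80.154.224/27 dev lo
VIP_RANGE = IPv4Network("208.80.154.224/27")


def caught_by_anyip(address: str, vip_range: IPv4Network = VIP_RANGE) -> bool:
    """Return True if the host would answer for this address under AnyIP."""
    return IPv4Address(address) in vip_range


if __name__ == "__main__":
    print(VIP_RANGE.num_addresses)            # number of addresses covered
    print(caught_by_anyip("208.80.154.224"))  # text-lb.eqiad VIP
    print(caught_by_anyip("208.80.154.225"))  # test VIP
    print(caught_by_anyip("208.80.153.224"))  # codfw VIP, outside this range
```

With a setup like this, the host side would not need one entry per ping target. The router firewall rule would still decide which destination addresses are actually redirected.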