
liberica triggers one DNS query per HTTPCheck execution/realserver
Closed, Resolved, Public

Description

Given the following healthcheck config:

healthchecks:
    L7:
        type: HTTPCheck
        url: http://en.wikipedia.com/_status
        timeout: 3s
        check_period: 5s

This healthcheck configuration triggers a DNS query for en.wikipedia.com every 5s (check_period) per realserver. This has several side effects:

  • dependency on DNS servers to perform healthchecks
  • a valid hostname is required
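
One way to avoid the per-execution lookup is to resolve the hostname once, reuse the answer for a while, and connect to the cached address while still sending the original hostname in the Host header. The following is a minimal Python sketch of that idea, not liberica's actual implementation; `CachedResolver`, `check_target` and the 300s TTL are hypothetical names and values chosen for illustration.

```python
import socket
import time
from urllib.parse import urlsplit


class CachedResolver:
    """Resolve a hostname once and reuse the answer until `ttl` seconds pass.

    Hypothetical helper for illustration only. `resolve` and `clock` are
    injectable so the cache can be exercised without real DNS traffic.
    """

    def __init__(self, ttl=300.0, resolve=socket.gethostbyname, clock=time.monotonic):
        self.ttl = ttl
        self._resolve = resolve
        self._clock = clock
        self._cache = {}  # hostname -> (address, expiry time)

    def lookup(self, hostname):
        now = self._clock()
        hit = self._cache.get(hostname)
        if hit is not None and hit[1] > now:
            return hit[0]  # still fresh: no DNS query
        address = self._resolve(hostname)
        self._cache[hostname] = (address, now + self.ttl)
        return address


def check_target(url, resolver):
    """Return (address, port, host_header, path) for one healthcheck run."""
    parts = urlsplit(url)
    address = resolver.lookup(parts.hostname)
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return address, port, parts.hostname, parts.path or "/"
```

With a 5s check_period, every execution after the first within the TTL window reuses the cached address, so the number of DNS queries no longer grows with the number of realservers or executions.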

Event Timeline

Vgutierrez triaged this task as Medium priority. Aug 19 2024, 8:34 AM
Vgutierrez moved this task from Backlog to Actively Servicing on the Traffic board.

While working on this task I found another issue: if we target the service VIP from the load balancer, we trigger double IPIP encapsulation as soon as hcforwarder and IPVS work together:

09:16:44.374957 IP 10.64.130.16 > 10.64.0.153: IP 10.64.130.16 > 10.64.32.89: IP 10.64.130.16.38254 > 208.80.154.232.80: Flags [S], seq 2392269379, win 43200, options [mss 1440,sackOK,TS val 1776728844 ecr 0,nop,wscale 9], length 0

A valid workaround for this issue is targeting the realserver IP rather than the service VIP:

09:46:24.231961 IP 172.16.14.64 > 10.64.32.89: IP 10.64.130.16.35920 > 10.64.32.89.80: Flags [P.], seq 0:170, ack 1, win 83, options [nop,nop,TS val 15443784 ecr 3927572866], length 170: HTTP: GET /_status HTTP/1.1

This works as expected. The only drawback is that we aren't exactly replicating the conditions of incoming traffic from the Internet. It's worth noting that this workaround is only required with IPVS as the forwarding plane (fp). Katran won't experience this issue, as locally originated traffic won't get balanced even if it targets the VIP.
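
The difference between the two captures above can be checked mechanically by counting the nested IP headers that tcpdump prints on each line: three headers mean two layers of encapsulation, two headers mean one. The snippet below is a small Python sketch for that check. The function names are made up for illustration, the regular expression only handles IPv4 output in the format shown in this task, and the sample lines are trimmed after the TCP flags.

```python
import re

# Matches each "IP <src> > <dst>:" header that tcpdump prints, outer to inner.
# Assumption: IPv4 addresses only, as in the captures in this task.
_IP_HEADER = re.compile(r"\bIP (\S+) > (\S+?):")


def ip_headers(line):
    """Return the (src, dst) pairs of every IP header on a tcpdump line."""
    return _IP_HEADER.findall(line)


def encapsulation_depth(line):
    """Number of IP-in-IP layers wrapped around the innermost packet."""
    return max(len(ip_headers(line)) - 1, 0)


# Capture lines from this task, trimmed after the TCP flags.
VIP_CAPTURE = (
    "09:16:44.374957 IP 10.64.130.16 > 10.64.0.153: "
    "IP 10.64.130.16 > 10.64.32.89: "
    "IP 10.64.130.16.38254 > 208.80.154.232.80: Flags [S]"
)
REALSERVER_CAPTURE = (
    "09:46:24.231961 IP 172.16.14.64 > 10.64.32.89: "
    "IP 10.64.130.16.35920 > 10.64.32.89.80: Flags [P.]"
)

if __name__ == "__main__":
    print(encapsulation_depth(VIP_CAPTURE))         # targeting the VIP
    print(encapsulation_depth(REALSERVER_CAPTURE))  # targeting the realserver
```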

vgutierrez merged https://gitlab.wikimedia.org/repos/sre/liberica/-/merge_requests/53

cp: Avoid performing DNS queries on every healthcheck execution