
Add custom HAProxy backend only for healthchecks
Closed, Resolved, Public

Description

Currently the L4 LBs send healthchecks for Varnish through HAProxy "transparently", meaning that HAProxy routes these requests to Varnish along with the other (non-healthcheck) ones.

This could be problematic during high traffic (legitimate or not): any limit on the maximum number of connections towards Varnish would affect the healthcheck requests too, so a busy but otherwise healthy host could fail its healthchecks.

A solution is to use another, dedicated backend on HAProxy just for healthchecks, with a custom ACL to differentiate the traffic.
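The reasoning above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the `Backend` class, the `healthcheck_admitted` helper, and the limits used are hypothetical, and a real proxy may queue excess connections rather than refuse them (in which case the healthcheck would time out instead of being rejected).

```python
class Backend:
    """Toy backend with a hard cap on concurrent connections (hypothetical model)."""

    def __init__(self, name, maxconn):
        self.name = name
        self.maxconn = maxconn
        self.active = 0

    def try_connect(self):
        # Admit the connection only if a slot is free.
        if self.active >= self.maxconn:
            return False
        self.active += 1
        return True


def healthcheck_admitted(client_connections, maxconn, dedicated):
    """Return True if a healthcheck gets a slot after client traffic arrives.

    With dedicated=False the healthcheck competes with client traffic for
    the same slots; with dedicated=True it has its own small backend.
    """
    varnish = Backend("varnish", maxconn)
    hc_backend = Backend("healthcheck", 1) if dedicated else varnish
    for _ in range(client_connections):
        varnish.try_connect()  # excess client connections are simply refused here
    return hc_backend.try_connect()


if __name__ == "__main__":
    print("shared backend:   ", healthcheck_admitted(100, 10, dedicated=False))
    print("dedicated backend:", healthcheck_admitted(100, 10, dedicated=True))
```

In this model the healthcheck only fails when it shares the saturated pool, which is the situation the dedicated backend is meant to avoid.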

Implementation notes:

Currently healthchecks sent by PyBal have the following characteristics that should be matched by the HAProxy ACL:

  • Host: healthcheck.wikimedia.org
  • URL: /varnish-fe
  • Source IPv4/IPv6 addr: [depends on the DC]

The ACL and dedicated backend in the HAProxy configuration should be guarded by a Hiera switch to allow a smoother and safer deployment across all cp hosts.

The list of the source IP addresses (L4 LBs) should go into a separate list file for ACL readability and to easily differentiate them per DC.
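The matching rules described in these notes can be sketched in Python to make the intended decision explicit. This is not the actual HAProxy ACL: the backend names, the `dedicated_enabled` flag (standing in for the Hiera switch), the list-file format, and the addresses (taken from documentation-only ranges) are all assumptions for illustration, as is the choice of an exact path match.

```python
import ipaddress
from dataclasses import dataclass

HEALTHCHECK_HOST = "healthcheck.wikimedia.org"
HEALTHCHECK_PATH = "/varnish-fe"


def parse_source_list(text):
    """Parse a list file: one address or CIDR per line, '#' starts a comment.

    The file format is an assumption made for this sketch.
    """
    networks = []
    for line in text.splitlines():
        entry = line.split("#", 1)[0].strip()
        if entry:
            networks.append(ipaddress.ip_network(entry, strict=False))
    return networks


@dataclass
class Request:
    host: str
    path: str
    src: str


def choose_backend(req, lb_sources, dedicated_enabled=True):
    """Return a (hypothetical) backend name for the request.

    All three conditions must hold for the dedicated backend to be used:
    Host header, URL path, and a source address from the L4 LB list.
    """
    if not dedicated_enabled:  # switch off: keep the old, transparent behaviour
        return "varnish"
    src = ipaddress.ip_address(req.src)
    from_lb = any(src in net for net in lb_sources)
    is_hc = req.host.lower() == HEALTHCHECK_HOST and req.path == HEALTHCHECK_PATH
    return "healthcheck" if (from_lb and is_hc) else "varnish"


if __name__ == "__main__":
    # Placeholder addresses from documentation ranges, not real L4 LB addresses.
    lbs = parse_source_list("# per-DC list (hypothetical)\n192.0.2.10\n2001:db8::/64\n")
    req = Request("healthcheck.wikimedia.org", "/varnish-fe", "192.0.2.10")
    print(choose_backend(req, lbs))
```

Requiring the source address in addition to the Host and path keeps ordinary clients that happen to request the same URL on the regular backend.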

The change has currently been deployed to:

  • ulsfo
  • eqsin
  • codfw
  • drmrs
  • eqiad
  • esams

Event Timeline

Vgutierrez moved this task from Backlog to Scheduled incidental work on the Traffic board.

Change 966221 had a related patch set uploaded (by Fabfur; author: Fabfur):

[operations/puppet@production] haproxy: start working on healthcheck-dedicated backend

https://gerrit.wikimedia.org/r/966221

Change 966221 merged by Fabfur:

[operations/puppet@production] haproxy: enable healthcheck-dedicated backend

https://gerrit.wikimedia.org/r/966221

The change has been deployed on cp4037.ulsfo.wmnet as a test host; PyBal healthcheck failures will be monitored.

Mentioned in SAL (#wikimedia-operations) [2023-11-02T15:40:10Z] <fabfur> cp4037 repooling with changes for dedicated healthcheck backend (haproxy): https://gerrit.wikimedia.org/r/c/operations/puppet/+/966221/ (T348851)

Change 971228 had a related patch set uploaded (by Fabfur; author: Fabfur):

[operations/puppet@production] haproxy: enable healthcheck-dedicated backend in ulsfo

https://gerrit.wikimedia.org/r/971228

Change 971228 merged by Fabfur:

[operations/puppet@production] haproxy: enable healthcheck-dedicated backend in ulsfo

https://gerrit.wikimedia.org/r/971228

Mentioned in SAL (#wikimedia-operations) [2023-11-02T16:26:14Z] <fabfur> haproxy: this change https://gerrit.wikimedia.org/r/c/operations/puppet/+/971228 will be propagated soon to all cp-ulsfo hosts (T348851)

The change has been deployed to all ulsfo cp hosts.

Change 971907 had a related patch set uploaded (by Fabfur; author: Fabfur):

[operations/puppet@production] haproxy: enable healthcheck-dedicated backend in eqsin

https://gerrit.wikimedia.org/r/971907

Change 971907 merged by Fabfur:

[operations/puppet@production] haproxy: enable healthcheck-dedicated backend in eqsin

https://gerrit.wikimedia.org/r/971907

Change 971922 had a related patch set uploaded (by Fabfur; author: Fabfur):

[operations/puppet@production] haproxy: enable healthcheck-dedicated backend in codfw

https://gerrit.wikimedia.org/r/971922

Change 971922 merged by Fabfur:

[operations/puppet@production] haproxy: enable healthcheck-dedicated backend in codfw

https://gerrit.wikimedia.org/r/971922

Change 971966 had a related patch set uploaded (by Fabfur; author: Fabfur):

[operations/puppet@production] haproxy: enable healthcheck-dedicated backend in drmrs

https://gerrit.wikimedia.org/r/971966

Change 971966 merged by Fabfur:

[operations/puppet@production] haproxy: enable healthcheck-dedicated backend in drmrs

https://gerrit.wikimedia.org/r/971966

Change 972320 had a related patch set uploaded (by Fabfur; author: Fabfur):

[operations/puppet@production] haproxy: enable healthcheck-dedicated backend in eqiad

https://gerrit.wikimedia.org/r/972320

Change 972320 merged by Fabfur:

[operations/puppet@production] haproxy: enable healthcheck-dedicated backend in eqiad

https://gerrit.wikimedia.org/r/972320

Change 972336 had a related patch set uploaded (by Fabfur; author: Fabfur):

[operations/puppet@production] haproxy: enable healthcheck-dedicated backend in esams

https://gerrit.wikimedia.org/r/972336

Change 972336 merged by Fabfur:

[operations/puppet@production] haproxy: enable healthcheck-dedicated backend in esams

https://gerrit.wikimedia.org/r/972336

Fabfur updated the task description.