Currently, wdqs is defined directly in varnish as a pool of randomized backend hostnames. There should be a real service hostname for the internal service, such as wdqs.svc.eqiad.wmnet, defined in LVS with pybal controlling the pooling of the 3 backends. Varnish's configuration should then be updated to use that hostname rather than enumerating the backends directly. This involves a handful of complex puppet changes, and new-service LVS deploys are always a bit "special" (requiring careful manual restarts).
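To make the intended change concrete, here is a rough Python sketch that renders Varnish-style backend stanzas for the "before" and "after" shapes. It is only an illustration, not the actual puppet or VCL used in production: the per-host backend names, the `example.invalid` hostnames, and port 80 are assumptions. Only wdqs.svc.eqiad.wmnet comes from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backend:
    name: str        # VCL backend identifier (hypothetical naming)
    host: str        # hostname varnish connects to
    port: int = 80   # assumed port, not stated in the task


def render_backend(b: Backend) -> str:
    """Render one VCL-style backend stanza as text."""
    return (
        f"backend {b.name} {{\n"
        f'    .host = "{b.host}";\n'
        f'    .port = "{b.port}";\n'
        f"}}"
    )


def render_vcl(backends: list[Backend]) -> str:
    """Join the stanzas for every backend varnish would know about."""
    return "\n\n".join(render_backend(b) for b in backends)


# Before: varnish enumerates each backend host itself.
# These hostnames are placeholders, not the real wdqs hosts.
before = [
    Backend(f"wdqs_{i}", f"wdqs-backend-{i}.example.invalid")
    for i in range(1, 4)
]

# After: varnish knows a single service hostname; LVS/pybal decides
# which of the pooled backends actually receives the traffic.
after = [Backend("wdqs", "wdqs.svc.eqiad.wmnet")]

if __name__ == "__main__":
    print(render_vcl(before))
    print()
    print(render_vcl(after))
```

The point of the single-stanza form is that pooling and depooling happen in pybal, so varnish's configuration no longer has to change when a backend host is added or removed.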
Status | Assigned | Task
---|---|---
Resolved | BBlack | T147844 Standardize varnish applayer backend definitions
Resolved | BBlack | T132457 Move wdqs to an LVS service
Event Timeline
Change 312224 had a related patch set uploaded (by Gehel):
wdqs - add icinga check for LVS services
Change 312225 had a related patch set uploaded (by Gehel):
wdqs - configure varnish to use LVS service as backend
Mentioned in SAL (#wikimedia-operations) [2016-10-05T13:46:18Z] <gehel> deploying new LVS configuration for WDQS service - T132457
Mentioned in SAL (#wikimedia-operations) [2016-10-05T14:11:49Z] <gehel> restarting pybal on lvs1006 - T132457
Mentioned in SAL (#wikimedia-operations) [2016-10-05T14:15:00Z] <gehel> restarting pybal on lvs1009 - T132457
Mentioned in SAL (#wikimedia-operations) [2016-10-05T14:17:31Z] <gehel> restarting pybal on lvs1012 - T132457
Mentioned in SAL (#wikimedia-operations) [2016-10-05T14:18:53Z] <gehel> restarting pybal on lvs1003 - T132457
Mentioned in SAL (#wikimedia-operations) [2016-10-05T14:22:27Z] <gehel> restarting pybal on lvs1006 - T132457
Mentioned in SAL (#wikimedia-operations) [2016-10-05T14:27:40Z] <gehel> restarting pybal on lvs1003 - T132457
Change 312225 merged by BBlack:
wdqs - configure varnish to use LVS service as backend
Change 315528 had a related patch set uploaded (by BBlack):
cache_misc: removed wdqs probe ref T132457