Make varnish-frontend-restart work on Beta Cluster
Open, Low, Public

Description

The script varnish-frontend-restart currently fails as follows in Beta:

root@deployment-cache-text06:~# varnish-frontend-restart 
ERROR:etcd.client:Could not discover the etcd hosts from conftool.deployment-prep.eqiad.wmflabs: None of DNS query names exist: _etcd._tcp.conftool.deployment-prep.eqiad.wmflabs., _etcd._tcp.conftool.deployment-prep.eqiad.wmflabs.deployment-prep.eqiad.wmflabs., _etcd._tcp.conftool.deployment-prep.eqiad.wmflabs.deployment-prep.eqiad1.wikimedia.cloud., _etcd._tcp.conftool.deployment-prep.eqiad.wmflabs.eqiad.wmflabs.
/usr/lib/python3/dist-packages/urllib3/connection.py:362: SubjectAltNameWarning: Certificate for deployment-etcd02.deployment-prep.eqiad1.wikimedia.cloud has no `subjectAltName`, falling back to check for a `commonName` for now. This feature is being removed by major browsers and deprecated by RFC 2818. (See https://github.com/shazow/urllib3/issues/497 for details.)
  SubjectAltNameWarning
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/conftool/drivers/etcd.py", line 114, in _ls
    res = self.client.read(key, recursive=recursive)
  File "/usr/lib/python3/dist-packages/etcd/client.py", line 597, in read
    timeout=timeout)
  File "/usr/lib/python3/dist-packages/etcd/client.py", line 907, in wrapper
    return self._handle_server_response(response)
  File "/usr/lib/python3/dist-packages/etcd/client.py", line 987, in _handle_server_response
    etcd.EtcdError.handle(r)
  File "/usr/lib/python3/dist-packages/etcd/__init__.py", line 306, in handle
    raise exc(msg, payload)
etcd.EtcdKeyNotFound: Key not found : /conftool/v1/pools

Although actually depooling the service makes little sense in Beta (the node is not behind LVS), it would be good if the script could Just Work and effectively fall back to systemctl restart varnish-frontend to ensure uniformity of procedure when upgrading Varnish in Beta and in Prod.
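The requested fallback could look roughly like the sketch below. It is not the actual varnish-frontend-restart implementation: `DepoolUnavailable`, `restart_with_fallback`, and the injected `depool`/`repool`/`restart` callables are hypothetical names, and it assumes a real wrapper would translate conftool failures such as `etcd.EtcdKeyNotFound` into `DepoolUnavailable`.

```python
import subprocess
from typing import Callable


class DepoolUnavailable(Exception):
    """Hypothetical error: pooling state cannot be changed on this host
    (for example, the conftool pools key does not exist, as in Beta)."""


def systemctl_restart(unit: str = "varnish-frontend") -> None:
    # Plain restart, equivalent to `systemctl restart varnish-frontend`.
    subprocess.run(["systemctl", "restart", unit], check=True)


def restart_with_fallback(
    depool: Callable[[], None],
    repool: Callable[[], None],
    restart: Callable[[], None] = systemctl_restart,
    log: Callable[[str], None] = print,
) -> bool:
    """Restart the service, depooling around the restart when possible.

    Returns True if the node was depooled and repooled, False if depooling
    was unavailable and only a plain restart was performed.
    """
    try:
        depool()
    except DepoolUnavailable as exc:
        # Nothing to depool from (assumed Beta case): just restart.
        log(f"depool unavailable ({exc}); falling back to plain restart")
        restart()
        return False
    try:
        restart()
    finally:
        # Repool even if the restart failed, so the node is not left depooled.
        repool()
    return True
```

Passing the depool, repool, and restart steps in as callables keeps the control flow identical in Beta and in Prod; only the depool step differs in whether it succeeds.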

Event Timeline

I'm guessing the below has the same root cause, albeit on a deployment host, not a varnish host.

krinkle@deployment-deploy04:~$ sudo tail -n190 /var/log/syslog

Sep  4 17:10:11 deployment-deploy04 confd[728886]: 2025-09-04T17:10:11Z deployment-deploy04 /usr/bin/confd[728886]: ERROR 100: Key not found (/conftool/v1/pools) [25]
Sep  4 17:10:14 deployment-deploy04 confd[728886]: 2025-09-04T17:10:14Z deployment-deploy04 /usr/bin/confd[728886]: ERROR 100: Key not found (/conftool/v1/pools) [25]
Sep  4 17:10:14 deployment-deploy04 confd[728886]: 2025-09-04T17:10:14Z deployment-deploy04 /usr/bin/confd[728886]: ERROR 100: Key not found (/conftool/v1/pools) [25]
Sep  4 17:10:14 deployment-deploy04 confd[728886]: 2025-09-04T17:10:14Z deployment-deploy04 /usr/bin/confd[728886]: ERROR 100: Key not found (/conftool/v1/pools) [25]
Sep  4 17:10:17 deployment-deploy04 confd[728886]: 2025-09-04T17:10:17Z deployment-deploy04 /usr/bin/confd[728886]: ERROR 100: Key not found (/conftool/v1/pools) [25]
Sep  4 17:10:17 deployment-deploy04 confd[728886]: 2025-09-04T17:10:17Z deployment-deploy04 /usr/bin/confd[728886]: ERROR 100: Key not found (/conftool/v1/pools) [25]
Sep  4 17:10:17 deployment-deploy04 confd[728886]: 2025-09-04T17:10:17Z deployment-deploy04 /usr/bin/confd[728886]: ERROR 100: Key not found (/conftool/v1/pools) [25]
Sep  4 17:10:20 deployment-deploy04 confd[728886]: 2025-09-04T17:10:20Z deployment-deploy04 /usr/bin/confd[728886]: ERROR 100: Key not found (/conftool/v1/pools) [25]
Sep  4 17:10:20 deployment-deploy04 confd[728886]: 2025-09-04T17:10:20Z deployment-deploy04 /usr/bin/confd[728886]: ERROR 100: Key not found (/conftool/v1/pools) [25]
Sep  4 17:10:20 deployment-deploy04 confd[728886]: 2025-09-04T17:10:20Z deployment-deploy04 /usr/bin/confd[728886]: ERROR 100: Key not found (/conftool/v1/pools) [25]
Sep  4 17:10:23 deployment-deploy04 confd[728886]: 2025-09-04T17:10:23Z deployment-deploy04 /usr/bin/confd[728886]: ERROR 100: Key not found (/conftool/v1/pools) [25]
Sep  4 17:10:23 deployment-deploy04 confd[728886]: 2025-09-04T17:10:23Z deployment-deploy04 /usr/bin/confd[728886]: ERROR 100: Key not found (/conftool/v1/pools) [25]
Sep  4 17:10:23 deployment-deploy04 confd[728886]: 2025-09-04T17:10:23Z deployment-deploy04 /usr/bin/confd[728886]: ERROR 100: Key not found (/conftool/v1/pools) [25]
Sep  4 17:10:26 deployment-deploy04 confd[728886]: 2025-09-04T17:10:26Z deployment-deploy04 /usr/bin/confd[728886]: ERROR 100: Key not found (/conftool/v1/pools) [25]
Sep  4 17:10:26 deployment-deploy04 confd[728886]: 2025-09-04T17:10:26Z deployment-deploy04 /usr/bin/confd[728886]: ERROR 100: Key not found (/conftool/v1/pools) [25]
Sep  4 17:10:26 deployment-deploy04 confd[728886]: 2025-09-04T17:10:26Z deployment-deploy04 /usr/bin/confd[728886]: ERROR 100: Key not found (/conftool/v1/pools) [25]
Sep  4 17:10:29 deployment-deploy04 confd[728886]: 2025-09-04T17:10:29Z deployment-deploy04 /usr/bin/confd[728886]: ERROR 100: Key not found (/conftool/v1/pools) [25]
Sep  4 17:10:29 deployment-deploy04 confd[728886]: 2025-09-04T17:10:29Z deployment-deploy04 /usr/bin/confd[728886]: ERROR 100: Key not found (/conftool/v1/pools) [25]
Sep  4 17:10:29 deployment-deploy04 confd[728886]: 2025-09-04T17:10:29Z deployment-deploy04 /usr/bin/confd[728886]: ERROR 100: Key not found (/conftool/v1/pools) [25]
Sep  4 17:10:32 deployment-deploy04 confd[728886]: 2025-09-04T17:10:32Z deployment-deploy04 /usr/bin/confd[728886]: ERROR 100: Key not found (/conftool/v1/pools) [25]
Sep  4 17:10:32 deployment-deploy04 confd[728886]: 2025-09-04T17:10:32Z deployment-deploy04 /usr/bin/confd[728886]: ERROR 100: Key not found (/conftool/v1/pools) [25]

I noticed this while reviewing syslog on the deployment host after the PHP 8.3 upgrade. (Not attaching to T401855, since it is a pre-existing issue.)

This appears to be logged continuously, about once every second, making up 99% of syslog in Beta.