
etcd/confd is not started on beta cluster Varnish caches
Closed, Resolved, Public

Description

The confd systemd unit refuses to start on deployment-cache-mobile04, and the other Varnish caches are probably in the same situation. The reason:

FATAL Cannot get nodes from SRV records lookup _etcd._tcp.deployment-prep.eqiad.wmflabs: no such host

Spotted in logstash-beta.wmflabs.org.

From the instance logs:

$ journalctl |cat|tail -n5
Oct 21 20:20:48 deployment-cache-mobile04
systemd[1]: Started confd.
confd[1080]: 2015-10-21T20:20:48Z deployment-cache-mobile04 /usr/bin/confd[1080]: INFO SRV domain set to deployment-prep.eqiad.wmflabs
confd[1080]: 2015-10-21T20:20:48Z deployment-cache-mobile04 /usr/bin/confd[1080]: FATAL Cannot get nodes from SRV records lookup _etcd._tcp.deployment-prep.eqiad.wmflabs: no such host
systemd[1]: confd.service: main process exited, code=exited, status=1/FAILURE
systemd[1]: Unit confd.service entered failed state.
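
The failing lookup can be reproduced outside confd. The following is a minimal sketch, assuming `dig` (from dnsutils) is installed on the instance. The helper names `srv_name` and `check_srv` are hypothetical and exist only for illustration. The record name follows the `_etcd._tcp.<domain>` pattern shown in the FATAL line above.

```shell
#!/bin/sh
# Hypothetical helpers, not part of confd or the beta cluster tooling.

# Build the SRV record name that the FATAL log line shows being looked up
# for a given -srv-domain value.
srv_name() {
    printf '_etcd._tcp.%s\n' "$1"
}

# Query the SRV record for the given domain.
# Assumption: dig is installed. Prints nothing if no record is published.
check_srv() {
    dig +short SRV "$(srv_name "$1")"
}
```

Running `check_srv deployment-prep.eqiad.wmflabs` and getting no output would be consistent with the "no such host" error, that is, no SRV record being published for that domain.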

We would need it to prepare the production deployment and test out scap3 / confd integration.

Event Timeline

hashar raised the priority of this task from to Needs Triage.
hashar updated the task description.
hashar added subscribers: hashar, Joe, thcipriani.
Dzahn triaged this task as Medium priority. Oct 22 2015, 10:28 PM
Dzahn subscribed.
deployment-cache-text04:/etc# cat /lib/systemd/system/confd.service
[Unit]
Description=confd

[Service]
User=root
Environment="CONFD_BACKEND=etcd"
Environment="CONFD_DISCOVERY=-srv-domain deployment-prep.eqiad.wmflabs -scheme https"
Environment="CONFD_OPTS=-watch"
ExecStart=/usr/bin/confd -backend $CONFD_BACKEND $CONFD_DISCOVERY $CONFD_OPTS
Restart=on-failure
RestartSec=10s

Seems the CONFD_DISCOVERY should be set differently and probably hardcoded to point to deployment-conf03.
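
One possible way to try this without editing the packaged unit is a systemd drop-in that overrides only `CONFD_DISCOVERY`. This is a sketch under stated assumptions, not a confirmed fix: it assumes confd's `-node` flag accepts a full client URL, and the helper name `confd_dropin`, the drop-in path, the fully qualified host name, and port 2379 (the etcd default client port) are all illustrative and not confirmed anywhere in this task. On a Puppet-managed instance the change would presumably belong in Puppet instead.

```shell
#!/bin/sh
# Hypothetical helper: print a systemd drop-in that replaces SRV discovery
# with a single hardcoded etcd node.
# $1: etcd client URL (scheme, host, and port are assumptions of the caller).
confd_dropin() {
    cat <<EOF
[Service]
Environment="CONFD_DISCOVERY=-node $1"
EOF
}

# Possible usage (path, host name, and port are assumptions):
#   mkdir -p /etc/systemd/system/confd.service.d
#   confd_dropin https://deployment-conf03.deployment-prep.eqiad.wmflabs:2379 \
#       > /etc/systemd/system/confd.service.d/discovery.conf
#   systemctl daemon-reload
#   systemctl restart confd
```

Because the drop-in is read after the packaged unit, its `Environment=` assignment takes precedence for `CONFD_DISCOVERY`, while `CONFD_BACKEND`, `CONFD_OPTS`, and `ExecStart` stay as shipped.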

Reopening. conftool refuses to start on any of the beta cluster Varnish caches. Ex: deployment-cache-text04

hashar renamed this task from etcd/confd is not started on deployment-cache-mobile04 to etcd/confd is not started on beta cluster Varnish caches. Apr 8 2016, 11:48 AM