
"depool" installed on prometheus hosts but not "confctl"
Closed, Resolved, Public

Description

I noticed this yesterday while doing a rolling restart of the prometheus hosts: when depool is called on a host where confctl is not installed, the script prints no explicit error and exits with code 127.

root@prometheus2003:~# depool
Depooling all services on prometheus2003.codfw.wmnet
root@prometheus2003:~# echo $?
127
root@prometheus2003:~# which confctl
root@prometheus2003:~#
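For context, 127 is the status a POSIX shell returns when it cannot find a command, which matches the empty `which confctl` output above. A minimal sketch of that behaviour follows; the command name is made up for illustration, and stderr is discarded here only to keep the example output short:

```shell
# The command name below is hypothetical and chosen so that the lookup fails.
status=0
no-such-command-for-demo 2>/dev/null || status=$?

# A POSIX shell reports 127 for "command not found".
echo "exit status: $status"   # prints "exit status: 127"
```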

Event Timeline

Restricted Application added a subscriber: Aklapper. · May 28 2020, 10:20 AM
Joe added a subscriber: Joe. · May 28 2020, 10:42 AM

This happens because the prometheus role somehow includes conftool::scripts but not profile::conftool::client.

A quick survey of production says this only happens on the prometheus servers and on the restbase-dev servers, where this is kind-of expected:

$ sudo cumin 'P{C:conftool::scripts} and not P{C:profile::conftool::client}'
7 hosts will be targeted:
prometheus[2003-2004].codfw.wmnet,prometheus[1003-1004].eqiad.wmnet,restbase-dev[1004-1006].eqiad.wmnet

So I think the right fix is:

  • Make the scripts fail with an explicit error if the confctl binary is not found
  • Add profile::conftool::client to role::prometheus, or switch the role to profile::lvs::realserver, which requires it.
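The first point could look roughly like the guard below. This is only a sketch under assumptions, not the code from the patches on this task: the function name `require_confctl` and the wording of the error message are hypothetical.

```shell
# Hypothetical guard; the function name and message are assumptions,
# not taken from the merged conftool patches.
require_confctl() {
    # "command -v" is the POSIX way to check whether a command resolves in PATH.
    if ! command -v confctl >/dev/null 2>&1; then
        echo "confctl not found in PATH, refusing to continue" >&2
        return 1
    fi
}
```

A script such as depool would then run `require_confctl || exit 1` before doing anything else, so a missing binary produces a visible error instead of a silent 127.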

Change 599298 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] prometheus: include ::profile::conftool::client

https://gerrit.wikimedia.org/r/599298

Change 599299 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] conftool: bail on confctl not found

https://gerrit.wikimedia.org/r/599299

Change 599299 merged by Filippo Giunchedi:
[operations/puppet@production] conftool: bail on confctl not found

https://gerrit.wikimedia.org/r/599299

Change 602334 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] conftool: fix confctl detection logic

https://gerrit.wikimedia.org/r/602334

Change 602334 merged by Filippo Giunchedi:
[operations/puppet@production] conftool: fix confctl detection logic

https://gerrit.wikimedia.org/r/602334

Change 599298 merged by Filippo Giunchedi:
[operations/puppet@production] prometheus: use profile::lvs::realserver

https://gerrit.wikimedia.org/r/599298

fgiunchedi closed this task as Resolved. · Jun 9 2020, 3:28 PM
fgiunchedi claimed this task.

All done!