
Many objects in conftool have pooled=yes, weight=0
Closed, Resolved · Public

Description

P10456 shows the list is quite long.

This happens because people just run pool instead of setting the pooled status and weight appropriately with conftool::scripts::initialize or the confctl command line.
Please note that this configuration makes sense in the abstract, but in practice it is not supported by pybal, per T86650.

Things we can do now:

  • Have the pool script refuse to pool a service if the weight is 0
  • Fix the current entries, remove the unnecessary ones
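The first item could look roughly like the sketch below. It is not the actual pool script: `pool` and `PoolRefused` are hypothetical names, and the state is assumed to be a plain dict shaped like the `weight`/`pooled` values that confctl prints.

```python
class PoolRefused(Exception):
    """Raised when pooling would create a pooled=yes, weight=0 object."""


def pool(name, state):
    """Return a copy of `state` with pooled=yes, refusing a weight of 0.

    Hypothetical helper: `state` is assumed to look like
    {"weight": 10, "pooled": "no"}.
    """
    if state.get("weight", 0) <= 0:
        raise PoolRefused(f"{name}: weight is 0, set a weight before pooling")
    new_state = dict(state)  # leave the caller's dict untouched
    new_state["pooled"] = "yes"
    return new_state
```

Raising instead of silently assigning a default weight keeps the decision about the right weight with the operator.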

What we might want to do in the longer term:

  • Add a system of constraints to confctl that would allow catching the error there.
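One way to picture such a system of constraints is a list of small check functions that each inspect an object and return an error message or nothing. This is only an illustrative sketch under that assumption. None of these names exist in confctl, and the object is assumed to be a dict with `pooled` and `weight` keys.

```python
def weight_positive_when_pooled(obj):
    """Hypothetical constraint: a pooled object needs a weight above 0."""
    if obj.get("pooled") == "yes" and obj.get("weight", 0) <= 0:
        return "pooled=yes requires weight > 0"
    return None


# Further constraints would be appended to this list.
CONSTRAINTS = [weight_positive_when_pooled]


def validate(obj, constraints=CONSTRAINTS):
    """Return the list of violation messages for `obj` (empty if valid)."""
    errors = []
    for check in constraints:
        message = check(obj)
        if message is not None:
            errors.append(message)
    return errors
```

A write would then be rejected whenever `validate` returns a non-empty list, so the error is caught regardless of which script or command made the change.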

Event Timeline

Joe created this task. Feb 19 2020, 10:12 AM
Restricted Application added a subscriber: Aklapper. Feb 19 2020, 10:12 AM
Joe claimed this task. Feb 19 2020, 10:12 AM
Joe triaged this task as High priority.
elukey added a subscriber: elukey. Edited Feb 19 2020, 10:22 AM

One question - some aqs nodes are listed in P10456, but I don't see them reported in https://config-master.wikimedia.org/pybal/eqiad/aqs with weight=0. I usually double-check config-master.wikimedia.org when depooling/pooling; I guess I shouldn't trust it as the source of truth?

Never mind, the issue seems to be different:

elukey@puppetmaster1001:~$ sudo confctl select name=aqs1004.eqiad.wmnet get
{"aqs1004.eqiad.wmnet": {"weight": 10, "pooled": "yes"}, "tags": "dc=eqiad,cluster=aqs,service=aqs"}
{"aqs1004.eqiad.wmnet": {"weight": 0, "pooled": "yes"}, "tags": "dc=eqiad,cluster=aqs,service=cassandra"}

Will investigate :)
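The confctl output above is one JSON object per line, with the host name as one key and `tags` as another. A minimal sketch, assuming exactly that shape, of filtering such output for the pooled=yes, weight=0 combination (the function name is hypothetical, and the sample lines are the two shown above):

```python
import json


def find_zero_weight_pooled(lines):
    """Return (host, tags) pairs for objects with pooled=yes and weight=0."""
    bad = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        tags = record.pop("tags", "")  # the remaining keys are host names
        for host, state in record.items():
            if state.get("pooled") == "yes" and state.get("weight") == 0:
                bad.append((host, tags))
    return bad


SAMPLE = [
    '{"aqs1004.eqiad.wmnet": {"weight": 10, "pooled": "yes"}, "tags": "dc=eqiad,cluster=aqs,service=aqs"}',
    '{"aqs1004.eqiad.wmnet": {"weight": 0, "pooled": "yes"}, "tags": "dc=eqiad,cluster=aqs,service=cassandra"}',
]

for host, tags in find_zero_weight_pooled(SAMPLE):
    print(host, tags)
```

On the sample, only the `service=cassandra` object is reported, which matches the observation that the same host appears under two services with different weights.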

Change 573289 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[operations/puppet@production] conftool: remove useless cassandra service pool

https://gerrit.wikimedia.org/r/573289

Change 573290 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[operations/puppet@production] conftool: remove useless cassandra service pool

https://gerrit.wikimedia.org/r/573290

Change 573291 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[operations/puppet@production] conftool::scripts: remove compatibility, disable draining

https://gerrit.wikimedia.org/r/573291

Change 573292 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[operations/puppet@production] conftool::scripts: refuse to pool a server if the weight is 0

https://gerrit.wikimedia.org/r/573292

Change 573289 merged by Giuseppe Lavagetto:
[operations/puppet@production] conftool: remove useless cassandra service pool

https://gerrit.wikimedia.org/r/573289

Change 573290 merged by Giuseppe Lavagetto:
[operations/puppet@production] conftool: remove useless cassandra service pool

https://gerrit.wikimedia.org/r/573290

Change 573291 merged by Giuseppe Lavagetto:
[operations/puppet@production] conftool::scripts: remove compatibility, disable draining

https://gerrit.wikimedia.org/r/573291

Change 573292 merged by Giuseppe Lavagetto:
[operations/puppet@production] conftool::scripts: refuse to pool a server if the weight is 0

https://gerrit.wikimedia.org/r/573292

Joe updated the task description. Feb 27 2020, 12:07 PM

Mentioned in SAL (#wikimedia-operations) [2020-06-04T05:59:59Z] <_joe_> fixing weights of cp2040 T245594

Joe closed this task as Resolved. Jun 4 2020, 7:34 AM

Resolving this as we have no more services with weight 0, and "pool" should now correctly refuse to pool a service if the weight is zero.

Joe updated the task description. Jun 4 2020, 7:35 AM