Only check logstash for canaries in the active datacenter
Needs Review · Public

Authored by Joe on Oct 2 2018, 1:06 PM.

Details

Reviewers
mmodell
Group Reviewers
Restricted Owners Package (Owns No Changed Paths)
Release-Engineering-Team
Patch without arc
git checkout -b D1117 && curl -L https://phabricator.wikimedia.org/D1117?download=true | git apply
Summary

While the endpoint checks should work in all datacenters, only the
current master receives real-world traffic, so we should only check the
logstash hosts in that datacenter.

To this end:

  • (Partially) refactored the pooler submodule and renamed it to confctl.
  • Added a confctl.setup_conftool function to standardize the way conftool is set up and to load its schema.
  • Added a confctl.get_master_dc_nodes function that filters a conftool query down to only the nodes in the current master datacenter.
  • Modified the canary checks to filter the logstash checks through the aforementioned function.

The coupling to conftool availability (which could be concerning in
itself) is very loose: if a proper response can't be obtained, the
unfiltered list will be used and the user will be notified with a warning.
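As a minimal sketch of the fallback behavior described above (the function and parameter names here are hypothetical stand-ins, not the actual scap/confctl API):

```python
import logging

logger = logging.getLogger('scap')


def filter_canaries(canaries, get_master_dc_nodes):
    """Return only the canaries in the current master datacenter.

    The coupling to conftool is loose: if the query fails for any
    reason, fall back to the unfiltered list and warn the user.
    get_master_dc_nodes is a stand-in for confctl.get_master_dc_nodes.
    """
    try:
        master_nodes = get_master_dc_nodes()
    except Exception as e:
        logger.warning(
            'Could not query conftool (%s); '
            'checking canaries in all datacenters', e)
        return canaries
    return [c for c in canaries if c in master_nodes]
```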

Bug: T204907

Diff Detail

Joe created this revision. · Oct 2 2018, 1:06 PM
Restricted Application added a reviewer: Restricted Owners Package. · Oct 2 2018, 1:06 PM
Restricted Application added a reviewer: mmodell. · Oct 2 2018, 1:06 PM
Restricted Application added a reviewer: Release-Engineering-Team.
Restricted Application added a project: Release-Engineering-Team.
Joe requested review of this revision. · Oct 2 2018, 1:08 PM

Concept looks good to me. A few problems and questions inline.

scap/confctl.py
33

should this be a list? Or a dict using the selector keys as the keys?

If it should be a dict, the lambda inside the map below should be:

dict(map(lambda x: (x, re.compile('^{}$'.format(selectors[x]))), selectors))
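A hedged sketch of the same suggestion as a dict comprehension (assuming selectors maps selector names to regex strings, which is an assumption about the surrounding code):

```python
import re


def compile_selectors(selectors):
    # Keyed by selector name; each pattern is anchored with ^...$,
    # equivalent to the dict(map(lambda ...)) form in the review.
    return {k: re.compile('^{}$'.format(v)) for k, v in selectors.items()}
```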
36

FWIW, fails in beta with ValueError: /conftool/v1/pools is not a directory

scap/main.py
72

I don't see a _confctl_setup method; should this call be self._conftool_setup()?

104

This has never been true before, and I'm not certain the code in this if-block works. Will the user's permissions be checked and rejected by conftool? Are there other things to worry about?

152

See the questions about pooling on line 103 above.

386

Is this logic correct? It used to be max(len(canaries)/4, 1); now it seems to be the maximum of the canary count and 1.

388

This comment may no longer be accurate depending on line 386

Joe marked an inline comment as done.Oct 3 2018, 5:13 PM
Joe added inline comments.
scap/confctl.py
33

yes, obviously, I must've deleted it when reformatting.

36

That means etcd is misconfigured there; it has nothing to do with this code. FWIW, I think we should avoid using etcd/conftool in beta.

scap/main.py
72

yes, sigh. Thanks for spotting this. I really wrote careless code yesterday.

104

Oh well, funny :D I didn't realize it.

On one hand, it's true that if conftool_conf is False, then confctl_schema is None, but on the other hand the whole point of my change is to make it possible to read values from etcd in scap.

I'd suggest we move this part to a function that we leave uncalled, and mark it appropriately as such. It's two chunks of code in the main() body that we could do without.

If you agree, I will do as much in this patchset.

386

I dropped the /4 for no reason; yes, it should still be there. Also, it should be len(servers), not len(canaries).
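A minimal sketch of the corrected threshold (the helper name is hypothetical; // is used so the integer-division intent survives Python 3's true division):

```python
def max_failed_canaries(servers):
    # Allow up to a quarter of the canary servers to fail,
    # but always tolerate at least one failure:
    # the corrected max(len(servers) / 4, 1) discussed above.
    return max(len(servers) // 4, 1)
```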

thcipriani added inline comments.Oct 4 2018, 4:38 PM
scap/main.py
104

I'd suggest we move this part to a function that we leave uncalled, and mark it appropriately as such. It's two chunks of code in the main() body that we could do without.

If you agree, I will do as much in this patchset.

Seems fine to me. Or we could drop the code entirely; if we want to bring it back, we have git history.

is this ready to merge? I'm cutting 3.8.8 without this patch unless someone shouts at me soon.

is this ready to merge? I'm cutting 3.8.8 without this patch unless someone shouts at me soon.

Still work to be done on this patch, we can go ahead with the next release without this for now.