Page MenuHomePhabricator

Create a wrapper around conftool for our pooling/depooling needs
ClosedPublic

Authored by demon on Mar 15 2017, 4:59 PM.

Details

Summary

Also go ahead and use it for what our original usecase was: depooling the proxies while we're using them. Should free them up just a tad and avoid some minor flapping we've seen before

See T104352
Fixes T125629

Test Plan

Totally untested. Oh, and it breaks everything for now because we don't have conftool installed in vagrant or on deploy masters. So probably shouldn't merge yet

Diff Detail

Repository
rMSCA Scap
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
demon retitled this revision from Conftool: Create a wrapper around conftool for our pooling/depooling needs to Create a wrapper around conftool for our pooling/depooling needs.Mar 15 2017, 7:39 PM
mmodell edited edge metadata.Mar 20 2017, 12:57 PM

Build error:

No matching distribution found for conftool (from -r requirements.txt (line 1))

demon added a comment.Mar 20 2017, 4:08 PM

Oh yeah I know. Hence my warning in the testplan ๐Ÿ™ƒ

Joe requested changes to this revision.May 11 2017, 6:01 AM
Joe added a subscriber: Joe.

I think the basic idea for the patch is good, I think the implementation can be improved as it is not currently doing what it's intended to do.

requirements.txt
3

Be aware that I never submitted conftool to Pypi, so installing via requirements will fail for now. I'll fix this situation ASAP

scap/conftool.py
9 โ†—(On Diff #1578)

I would suggest to call this file something different than "conftool" or import issues can continuously happen.

15 โ†—(On Diff #1578)

this is extremely simplistic, I don't think this is the right approach. More on this later.

16 โ†—(On Diff #1578)
  1. you need to depool all services, not just "apache2"
  2. you cannot assume all servers you're dealing with are appservers.
33 โ†—(On Diff #1578)

Do not declare node as a class variable, or it will be shared between all instances; this is *not* what you want to do.

43 โ†—(On Diff #1578)

Instead of accepting the whole kwargs here, you could do what we do in conftool.cli

Specifically, given a list of node names:

import re

node_re = re.compile('({names})'.format(names='|'.join(servers))
self.nodes = [n for n in node.Node.query({'name': node_re)]

This has several advantages, but the most important one is that it is guaranteed to fecth all services on a host, and depool them.

This revision now requires changes to proceed.May 11 2017, 6:01 AM

Thanks for the really thorough feedback. And yeah, I knew this probably wouldn't work and was totally untested. I'll get to your feedback hopefully next week.

demon marked 5 inline comments as done.Jun 1 2017, 6:35 PM
demon updated this revision to Diff 1759.Jun 1 2017, 6:35 PM
demon edited edge metadata.
  • Rewrote conftool as pooler, no naming conflicts
  • Now iterates all services, not just apache2 and appservers
  • Still a WIP, I wasn't entirely sure what the results of query() give me. Need advice on how completely wrong node_data is ;-)
demon updated this revision to Diff 1760.Jun 1 2017, 6:47 PM
  • I think this is more correct, using tags() to get all of the data. Still a little unclear on service list (do we really only return one, or a list?). Also fix state variable usage while here
demon added inline comments.Jun 1 2017, 6:49 PM
scap/pooler.py
51โ€“53

I'm not entirely sure if this is correct still. From reading node.Node()'s code (heh) I think I need to use tags() to get all of this data out of our result objects.

From the scap side of things the code looks fine. Looks like this will require some fiddling with pip and tox to get tests passing again :(

@Joe Any chance you could have a look at this again? We definitely need conftool in pip or this won't ever build.

Joe added a comment.Jun 28 2017, 6:19 AM
In D600#13637, @demon wrote:

@Joe Any chance you could have a look at this again? We definitely need conftool in pip or this won't ever build.

https://pypi.python.org/pypi/conftool is there for you :)

demon updated this revision to Diff 1812.Jun 28 2017, 4:26 PM
  • Fix import of KVObject
mmodell accepted this revision as: mmodell.Jun 28 2017, 5:22 PM
thcipriani requested changes to this revision.Jul 10 2017, 4:03 PM

Few things I found from futzing on tin.

scap/pooler.py
52

Each item in self.nodes is a node object. tags is a property of a node, not a function: https://github.com/wikimedia/operations-software-conftool/blob/master/conftool/kvobject.py#L255-L257

54

I don't see a tags['name'] in the list of tags

>>> from conftool.kvobject import KVObject                                                                                                                                
>>> from conftool import configuration, node
>>> import re                                         
>>> KVObject.setup(configuration.get('/etc/conftool/config.yaml'))                   
>>> node_re = re.compile('({name})'.format(name='|'.join(['mw1280.eqiad.wmnet', 'mw1211.eqiad.wmnet'])))                                                                                     
>>> nodes = [n for n in node.Node.query({'name': node_re})]                          
>>> nodes            
[<conftool.node.Node object at 0x7f2183267050>, <conftool.node.Node object at 0x7f2184b02850>, <conftool.node.Node object at 0x7f2183267350>, <conftool.node.Node object at 0x7f21832673d0>]                          
>>> nodes[0].tags    
{'cluster': u'api_appserver', 'dc': u'eqiad', 'service': u'apache2'}

the Node object does expect name to be passed in. It should be the last tag passed in. So something like:

box = node.Node(tags['dc'], tags['cluster'], service, node_data.name)

Should work.

55

I don't see the pooled property on any object up the chain of inheritance. @Joe is this the right way to do this?

Also, didn't try write() but the method call looks correct :)

This revision now requires changes to proceed.Jul 10 2017, 4:03 PM
thcipriani added inline comments.Jul 10 2017, 4:56 PM
scap/pooler.py
56

I wonder if instead of:

box.pooled = state
box.write()

We could do:

box.update({'pooled': state})

It looks like that will call write: https://github.com/wikimedia/operations-software-conftool/blob/master/conftool/kvobject.py#L114-L122

demon marked 2 inline comments as done.Jul 11 2017, 7:21 PM
demon updated this revision to Diff 1858.Jul 11 2017, 7:27 PM
demon edited edge metadata.
  • Tags are a property, not a method
  • Name is a property, not a tag
  • Use update() to set pooled state & write

Seems like we've tested all we can test without actually calling the .write() method anywhere, lgtm!

I have one minor thing inline, and then I think it's good to go.

scap/pooler.py
29

This will explode on beta since we don't currently have any proxies:

AttributeError: 'NoneType' object has no attribute 'driver'

May want to guard against the case where servers is empty somehow someway.

demon marked 4 inline comments as done.Jul 12 2017, 3:34 PM
demon updated this revision to Diff 1872.Jul 12 2017, 7:46 PM
  • Only do proxy behavior if we have proxies
demon added inline comments.Jul 12 2017, 7:51 PM
scap/pooler.py
29

Wrapped everything in the caller with len() checks. Should probably also validate here and raise an exception for future callers

Joe added a comment.Jul 13 2017, 2:41 PM

I think you need to fix the logic inside the set_pooled_state function as per my comment. Apart from that, it seems correct.

scap/pooler.py
51

here you can simply do as follows:

try:
   node_data.update({'pooled': state})
except conftool.drivers.BackendError:
   # manage conftool errors however you want, re-raise them? keep a state machine?

as your previous selection in the Pooler already selected all nodes with the different services, and you don't need to explore anymore.

So I would make this a method of the class, and simplify it.

demon added inline comments.Jul 13 2017, 7:34 PM
scap/pooler.py
51

This all makes total sense. In my earlier iteration, I wasn't aware I was returning a Node object, I thought I was returning a dictionary that I needed to construct a Node from. You're right, since we've already got the node on hand we can just move this into the class and operate on the nodes themselves.

demon updated this revision to Diff 1900.Jul 14 2017, 8:22 PM
  • Reuse existing node objects rather than constructing new ones
  • Catch failures, and return the success/failure scenario to callers, it's on them to resolve as desired
demon updated this revision to Diff 1901.Jul 14 2017, 8:28 PM
  • Fix line too long error
thcipriani accepted this revision.Jul 14 2017, 8:43 PM
thcipriani added inline comments.
scap/main.py
152

Might be good to log the output of depool() here. Dunno if we want to have an actual warning and freak people out with red text or anything, but maybe adding it to debug so we could find it in logstash would be nice.

demon marked 4 inline comments as done.Jul 14 2017, 8:46 PM
demon added inline comments.
scap/main.py
152

Could easily log at the INFO level. It's worth including to the terminal. It's not bad, so we shouldn't scare people.

demon updated this revision to Diff 1902.Jul 14 2017, 8:51 PM
  • Make path configuration file for conftool configurable (also lets us disable the functionality)
demon marked 2 inline comments as done.Jul 18 2017, 5:33 PM
demon added inline comments.
scap/main.py
152

Per IRC, going to do this extra logging as a follow-up

demon marked 2 inline comments as done.Jul 18 2017, 5:33 PM
Joe accepted this revision.Jul 19 2017, 2:13 PM

As far as the conftool part is concerned, this seems correct.

Only one doubt about the runtime: scap is supposed to be ran as a normal user, not sure how easy it will be to set up credentials for conftool in that case, I guess we'll have to modify conftool's etcd driver a bit for that to work in production.

This revision is now accepted and ready to land.Jul 19 2017, 2:13 PM
This revision was automatically updated to reflect the committed changes.