Use scap3 to deploy eventlogging/eventlogging
Closed, Declined | Public | 21 Estimated Story Points

Description

Currently deployed with trebuchet. Judging from IRC discussion, this shouldn't be too hard: push some code, restart some services (currently upstart).

Event Timeline

demon raised the priority of this task from to Medium.
demon updated the task description.
demon added subscribers: demon, Ottomata.

eventlogging-service is using systemd, and we want to port all of eventlogging to Jessie and systemd sometime in the not too distant future.

It would be best to let us fix the upstart vs systemd stuff first. Currently eventlogging deployment requires a manual, global sudo pip install step, which is less than ideal.

I think we'll probably need an eventlogging/deploy repository for this.

Change 280471 had a related patch set uploaded (by Ottomata):
Factor out install_requires from setup.py into requirements.txt

https://gerrit.wikimedia.org/r/280471

Ottomata renamed this task from Move EventLogging service to scap3 to Move EventLogging to scap3. Mar 31 2016, 3:27 PM
Ottomata renamed this task from Move EventLogging to scap3 to Use scap3 to deploy eventlogging/eventlogging.
Ottomata claimed this task.
Ottomata added a project: Analytics-Kanban.
Ottomata moved this task from Next Up to In Progress on the Analytics-Kanban board.
Ottomata moved this task from In Progress to Next Up on the Analytics-Kanban board.

Change 280471 merged by Ottomata:
Factor out install_requires from setup.py into requirements.txt

https://gerrit.wikimedia.org/r/280471
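
For readers unfamiliar with the pattern named in this change, here is a minimal sketch of reading dependencies from a requirements.txt-style file so that setup.py and pip can share one list. It is not taken from the actual patch: the helper name `parse_requirements` and the package names in the example are assumptions made for illustration only.

```python
def parse_requirements(text):
    """Return requirement specifiers from requirements.txt-style text."""
    requirements = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith('#'):
            continue
        requirements.append(line)
    return requirements


# Illustrative content only; this is NOT the real EventLogging dependency list.
example = """
# runtime dependencies
jsonschema>=0.7
PyYAML
"""
install_requires = parse_requirements(example)
```

In a setup.py, the resulting list could be passed as `install_requires=` to `setup()`, while the same file can be installed directly with `pip install -r requirements.txt`.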

Change 280730 had a related patch set uploaded (by Ottomata):
Create eventlogging::deployment::target define that abstracts scap::target for eventlogging targets

https://gerrit.wikimedia.org/r/280730

Change 280771 had a related patch set uploaded (by Ottomata):
[WIP] Add new scap::source define to ease bootstrapping of repositories on deploy servers

https://gerrit.wikimedia.org/r/280771

Change 280771 abandoned by Ottomata:
[WIP] Add new scap::source define to ease bootstrapping of repositories on deploy servers

Reason:
I squashed this with a previous commit: https://gerrit.wikimedia.org/r/#/c/280730/

Abandoning.

https://gerrit.wikimedia.org/r/280771

Change 280730 merged by Ottomata:
Add new scap::source define to ease bootstrapping of repositories on deploy servers

https://gerrit.wikimedia.org/r/280730

Change 282166 had a related patch set uploaded (by Ottomata):
Create new eventlogging::analytics role in modules/role, use scap for deployment

https://gerrit.wikimedia.org/r/282166

Change 282166 merged by Ottomata:
Create new eventlogging::analytics role in modules/role, use scap for deployment

https://gerrit.wikimedia.org/r/282166

Nuria set the point value for this task to 21. Apr 7 2016, 4:56 PM

Change 282220 had a related patch set uploaded (by Ottomata):
eventlogging::service::* classes now depend but don't include eventlogging::server

https://gerrit.wikimedia.org/r/282220

Change 282220 merged by Ottomata:
eventlogging::service::* classes now depend but don't include eventlogging::server

https://gerrit.wikimedia.org/r/282220

Change 282224 had a related patch set uploaded (by Ottomata):
Remove notify from eventlogging::service:* classes to eventlogging/init Service.

https://gerrit.wikimedia.org/r/282224

Change 282224 merged by Ottomata:
Remove notify from eventlogging::service:* classes to eventlogging/init Service.

https://gerrit.wikimedia.org/r/282224

Change 282225 had a related patch set uploaded (by Ottomata):
Run eventlogging analytics daemons out of the scap deployment

https://gerrit.wikimedia.org/r/282225

Change 282225 merged by Ottomata:
Run eventlogging analytics daemons out of the scap deployment

https://gerrit.wikimedia.org/r/282225

DONE! (Well mostly)

The Analytics eventlogging role is now deployed to eventlog1001 using scap. eventbus is as well.

There are misc users of the eventlogging codebase that still rely on the trebuchet deployed codebase.

> There are misc users of the eventlogging codebase that still rely on the trebuchet deployed codebase.

I don't know what that means, can you help me figure it out? :)

Yes, it is mostly performance team for the deployment to hafnium. See T131977

This comment was removed by Ottomata.
Milimetric moved this task from Incoming to Event Platform on the Analytics board.
Milimetric moved this task from Event Platform to Radar on the Analytics board.

First create an eventlogging/scap/webperf repository in gerrit, and fill it with scap config information. Check out eventlogging/scap/analytics as an example.

Then, add an entry to deployment/server.yaml for eventlogging webperf. Something like:

eventlogging/webperf:
  repository: eventlogging
  scap_repository: eventlogging/scap/webperf

Finally, you'll want to use the puppet define [[ https://github.com/wikimedia/puppet/blob/production/modules/eventlogging/manifests/deployment/target.pp | eventlogging::deployment::target ]] on hafnium (or wherever you want your instance of eventlogging code deployed):

eventlogging::deployment::target { 'webperf': }

That should be it! Run puppet on tin and then on hafnium (or wherever else). To deploy changes, do the scap3 dance on tin in /srv/deployment/eventlogging/webperf.

> [..] create an eventlogging/scap/webperf repository [..]

Creating a dedicated deployment target for webperf would mean we can also use Scap to automatically restart Webperf services (Yay!). However, does this also mean we should deploy updates to eventlogging ourselves? I wouldn't particularly mind (and it might also help to avoid unexpected issues when we're not around), but it does come as a bit of a surprise to find out this way.

In T131977 we migrated from an unmaintained global install on Hafnium to a Trebuchet deployment from Tin where it would automatically get updates whenever Analytics deploys updates to EventLogging in general.

It seems that sometime between April and October 2016, EventLogging changed from Trebuchet to Scap. I vaguely recall hearing about it, but I didn't expect it to implicitly end support for Webperf (again). It seems Hafnium now once again runs an outdated version of EventLogging.

I see the benefit of having separate scap configs (which Analytics and EventBus do already), automated restarts, and us taking ownership of the install by scheduling and performing deployments ourselves. It just came somewhat unexpected that the install is again outdated (and thus holding back T110903).

> However, does this also mean we should deploy updates to eventlogging ourselves?

Yup!

> might also help to avoid unexpected issues

Exactly.

> It just came somewhat unexpected that the install is again outdated (and thus holding back T110903).

Hm, this most likely got lost in non-Phabricator communication, or I am misremembering, but when we talked long ago, I thought you knew this was the next step. There were two steps: T131977 was to stop using the global install, and then this T118772 was to start using scap instead of trebuchet. Getting deployments off trebuchet has been a releng project for a while.

Anyway, sorry this got lost, probably was my fault. I had assumed you knew. If you don't want this to hold back T110903, you can probably still deploy eventlogging/eventlogging with trebuchet. It might also deploy to eventlog1001, but /srv/deployment/eventlogging/eventlogging is not used there.

Discussed this today. Given we're only using the eventlogging library as a simple abstraction around python-kafka, it probably doesn't make sense to set it up as a "git-deployed service" from the deployment hosts with Scap3 and all. Especially because, as long as the code in Puppet is kept separate from this dependency, it will continue to be hard to maintain. Perhaps it would make more sense for us to just inline the relevant abstraction instead of pulling in the entire library just for a simple connect call.

@Ottomata I know I've expressed a preference for using eventlogging.connect() and I think you secretly think that's silly. I'm starting to agree. What does this abstraction offer besides setting up the python-kafka stream? E.g. if we follow the way statsv does it, would we be missing anything as far as you know?

Not really. You get a little bit of niceness in EventLogging because each event is an Event instance (a subclass of dict), rather than just a dict, which abstracts some schema specific details. But, since you are only dealing with a couple of schemas of which you know the format, using EventLogging for this doesn't buy you much more than just consuming with a Kafka client.
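
To make the trade-off concrete, here is a rough sketch of what inlining the abstraction might look like: decoding raw JSON messages into a small dict subclass, without the EventLogging library. This is not EventLogging's actual `Event` class or API. The `Event` stand-in, the `decode_events` helper, the assumed top-level `schema` key, and the sample payload are all hypothetical and chosen only for illustration.

```python
import json


class Event(dict):
    """Hypothetical stand-in: a dict subclass with a small convenience helper."""

    def schema_name(self):
        # Assumes the schema name is stored under a top-level 'schema' key.
        return self.get('schema')


def decode_events(raw_messages):
    """Yield one Event per raw JSON message (bytes), skipping undecodable ones."""
    for raw in raw_messages:
        try:
            parsed = json.loads(raw.decode('utf-8'))
        except (UnicodeDecodeError, ValueError):
            # Not valid UTF-8 or not valid JSON: drop the message.
            continue
        if isinstance(parsed, dict):
            yield Event(parsed)


# With a Kafka client, raw_messages would be the message payloads;
# plain byte strings stand in for them here.
sample = [
    b'{"schema": "NavigationTiming", "event": {"responseStart": 120}}',
    b'not json',
]
events = list(decode_events(sample))
```

Since each result is still a plain dict underneath, a consumer that only handles a couple of known schemas could skip the subclass entirely and call `json.loads` on each message.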