
Remove EL capsule from meta and add it to codebase
Closed, Resolved | Public | 8 Estimated Story Points

Description

Remove the EL capsule from Meta and add it to the codebase, as it really cannot evolve on its own.

Ideally, we would like EL events to be like MediaWiki events and not have capsules. The EL code is decoupled from event schemas (it processes events regardless of the schema definition), but the capsule is closely tied to the code: capsule fields are processed and expected to appear, and the capsule is not optional.

This is a bigger problem in MySQL storage, as the capsule is closely tied to the table structure.

Event Timeline

I don't quite understand the "cannot evolve on its own" argument; isn't that the case for any and all schema pages on Meta? (They are all tied to code, whether generic or instrumentation-specific.)

Users of EventLogging data do need up-to-date documentation of these fields that is not hidden somewhere in the code base, just like we have for basically all other Analytics tables (on Wikitech or Meta).

Users of EventLogging data do need up-to-date documentation of these fields

We don't even have this now, as we recently experienced. The eventlogging code base hard-codes the EventCapsule schema revision, which means that the wiki will not be up to date until the code has been changed and deployed.

that is not hidden somewhere in the code base, just like we have for basically all other Analytics tables (on Wikitech or Meta)

For sure! We'd love to get rid of EventCapsule altogether. The fact that an individual event schema is not complete when you look it up on Meta is a problem. EventCapsule is not a table, and it only works within the very specific Python eventlogging codebase. We don't have an API or GUI that gives you the full schema of any event. You have to use the eventlogging code directly, which knows how to glue EventCapsule together with other schemas.
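To illustrate the "glue" step mentioned above, here is a minimal sketch of a capsule schema that lives in code and has an event schema slotted in under its `event` property. This is not the actual eventlogging implementation: `CAPSULE_SCHEMA`, `encapsulate`, the trimmed-down field list, and the `buttonId` event schema are all hypothetical names chosen for illustration.

```python
import copy

# Hypothetical, trimmed-down capsule kept in the codebase.
# The real EventCapsule has more fields; these are illustrative only.
CAPSULE_SCHEMA = {
    "type": "object",
    "properties": {
        "schema": {"type": "string", "required": True},
        "revision": {"type": "integer", "required": True},
        "wiki": {"type": "string", "required": True},
        "event": {"type": "object", "required": True},
    },
    "additionalProperties": False,
}


def encapsulate(event_schema, capsule=CAPSULE_SCHEMA):
    """Return a full schema: a copy of the capsule with event_schema under 'event'."""
    full = copy.deepcopy(capsule)
    # Copy the event schema too, so neither input is mutated.
    full["properties"]["event"] = dict(copy.deepcopy(event_schema), required=True)
    return full


# Hypothetical event schema, standing in for a schema page on Meta.
button_click_schema = {
    "type": "object",
    "properties": {"buttonId": {"type": "string", "required": True}},
}

full_schema = encapsulate(button_click_schema)
print(sorted(full_schema["properties"]))
```

With the capsule defined next to the code that processes it, a change to capsule fields and the code that expects them can ship in the same commit, instead of depending on a separately edited wiki revision.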

I don't quite understand the "cannot evolve on its own" argument; isn't that the case for any and all schema pages on Meta? (They are all tied to code, whether generic or instrumentation-specific.)

The event schemas are tied to specific use cases, in that they are designed and instrumented together. Each event schema is decoupled from the others. EventCapsule is different, in that all events use it, and all of eventlogging will break if the EventCapsule schema doesn't match up perfectly with the code. The EventCapsule schema is already fully hardcoded into the eventlogging tests.

We want to be sure we know what we are affecting when we deploy, and tightly coupling the eventlogging processing code to a specific editable remote piece of data is fragile.

For example, in order to deploy EventCapsule schema changes in beta for testing, we have to create a new revision of EventCapsule. blegh! :)

Per in-person conversation: if and when we remove EventCapsule from Meta (so that it no longer exists in schema form), let's make sure we document the de facto capsule of EventLogging schemas the same way we document Hive tables: in Wikitech documentation.

Change 404303 had a related patch set uploaded (by Mforns; owner: Mforns):
[eventlogging@master] [WIP] Move EventCapsule into codebase

https://gerrit.wikimedia.org/r/404303

We still need to:

mforns set the point value for this task to 8. (Jan 19 2018, 5:04 PM)

Change 404303 merged by Ottomata:
[eventlogging@master] Move EventCapsule into codebase

https://gerrit.wikimedia.org/r/404303