
Decommission EventLogging backend components by migrating to MEP
Open, Medium, Public

Description

As discussed in T228175: Event Platform Client Libraries, we believe we can migrate the existing streams produced by the EventLogging extension to Modern Event Platform (MEP) components. This will finally allow us to decommission the EventLogging backend pieces.

To support existing EventLogging events in eventgate-analytics, we need the following:

  • meta.wikimedia.org schemas ported to draft 7 JSONSchema in a git schema repo with common schema included via $ref.
  • stream config entry for each (active) EventLogging schema/stream.
  • Schema revision extension attributes changed to use the new semver schema version.
  • EL client side code adapted to produce full event (with capsule fields) and to POST to eventgate.
  • Resolve capsule userAgent type issues (This is a string in JSONSchema, and a struct in Hive)
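
As a rough illustration of the client-side item above, the sketch below wraps a legacy payload in a full event, the shape that would then be POSTed to EventGate. The helper name, the exact field layout, and the lowercasing of the schema name in the $schema URI are assumptions made for illustration; this is not the EventLogging extension's actual code.

```javascript
// Hypothetical helper: wrap a legacy EventLogging payload in a full event
// that carries its own capsule-style fields.
function buildLegacyEvent(schemaName, schemaVersion, eventData, now = new Date()) {
  return {
    // Assumed convention: legacy schemas live under /analytics/legacy/<name>/<version>.
    $schema: `/analytics/legacy/${schemaName.toLowerCase()}/${schemaVersion}`,
    // Assumed convention: legacy stream names are prefixed with eventlogging_.
    meta: { stream: `eventlogging_${schemaName}` },
    schema: schemaName,
    // Client-side timestamp taken from the browser clock.
    client_dt: now.toISOString(),
    event: eventData,
    // Capsule fields that only a server can set are intentionally left out.
  };
}

const body = buildLegacyEvent(
  'Test',
  '1.1.0',
  { OtherMessage: 'test' },
  new Date('2020-05-07T20:10:07Z')
);
console.log(JSON.stringify(body, null, 2));
```

The timestamp is passed in explicitly only to keep the example deterministic; a real client would use the current time.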

Ideally, EventLogging will produce the full event including EventCapsule fields to eventgate-analytics-external, the same eventgate instance that new style schemas will use. The same Refine job we use for eventgate analytics events should be able to Refine the old EL style events. Not all fields from capsule will be set (e.g. seqId and recvFrom), but we can work with what we have on the client side. The main issue will be resolving the userAgent type discrepancy, as we will parse the user_agent during refinement.
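
To make the userAgent discrepancy concrete, here is a toy sketch of a refinement-style transform that takes the string form and adds a struct-like parsed form next to it. The regular expression, the output field name userAgentParsed, and the struct keys are all hypothetical; the real Refine job uses its own user agent parser, which this does not reproduce.

```javascript
// Toy parser, for illustration only: recognizes two browser families.
function parseUserAgent(ua) {
  const match = /(Firefox|Chrome)\/(\d+)/.exec(ua);
  return {
    browser_family: match ? match[1] : 'Other',
    browser_major: match ? match[2] : null,
  };
}

// Hypothetical transform: if userAgent is a string, add a parsed struct
// beside it; if it is anything else, return the event untouched.
function addParsedUserAgent(event) {
  if (typeof event.userAgent !== 'string') {
    return event;
  }
  return { ...event, userAgentParsed: parseUserAgent(event.userAgent) };
}

const refined = addParsedUserAgent({
  schema: 'Test',
  userAgent: 'Mozilla/5.0 (X11; Linux x86_64; rv:75.0) Gecko/20100101 Firefox/75.0',
});
console.log(refined.userAgentParsed);
```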

We'll start by migrating a single high-volume EventLogging stream to MEP: SearchSatisfaction - T249261: Vertical: Migrate SearchSatisfaction EventLogging event stream to Event Platform

Details

Project                               Branch      Lines +/-  Subject
operations/mediawiki-config           master      +0 -1
mediawiki/extensions/TemplateWizard   master      +1 -1
operations/puppet                     production  +107 -47
operations/mediawiki-config           master      +3 -0
schemas/event/secondary               master      +138 -0
operations/mediawiki-config           master      +1 -1
operations/mediawiki-config           master      +8 -2
operations/mediawiki-config           master      +1 -2
operations/puppet                     production  +2 -2
analytics/refinery/source             master      +81 -4
analytics/refinery/source             master      +280 -156
analytics/refinery/source             master      +273 -152
operations/mediawiki-config           master      +42 -2
mediawiki/extensions/EventLogging     master      +1 -1
operations/mediawiki-config           master      +2 -2
operations/deployment-charts          master      +9 -3
operations/mediawiki-config           master      +6 -1
operations/mediawiki-config           master      +13 -3
operations/mediawiki-config           master      +22 -0
mediawiki/extensions/WikimediaEvents  master      +2 -1
schemas/event/secondary               master      +327 -1 K
operations/puppet                     production  +1 -77
operations/puppet                     production  +129 -92
operations/puppet                     production  +2 -73
operations/puppet                     production  +99 -15
analytics/refinery/source             master      +6 -1
analytics/refinery/source             master      +559 -148
operations/mediawiki-config           master      +4 -1
operations/mediawiki-config           master      +10 -7
operations/deployment-charts          master      +3 -3
schemas/event/secondary               master      +231 -173
eventgate-wikimedia                   master      +25 -3
mediawiki/extensions/EventLogging     master      +142 -36
analytics/refinery/source             master      +187 -7

Related Objects

Status    Subtype    Assigned        Task
Open                 jlinehan
Open                 None
Open                 Ottomata
Resolved             Ottomata
Resolved             Ottomata
Resolved             Ottomata
Resolved             Ottomata
Resolved             Ottomata
Duplicate            Ottomata
Resolved             Ottomata
Resolved             mforns
Resolved             Gehel
Resolved             Milimetric
Resolved             Ottomata
Open                 Ottomata
Open                 Ottomata
Declined             Ottomata
Resolved             Ottomata
Open                 Ottomata
Resolved             Gilles
Open                 mforns
Open                 None
Declined             Ottomata
Resolved             Ottomata
Resolved             mforns
Resolved             Mholloway
Resolved             Ottomata
Duplicate            None
Duplicate            None
Duplicate            None
Duplicate            None
Resolved             Mholloway
Duplicate            None
Resolved             Ottomata
Open                 SBisson
Open                 SBisson
Open                 SBisson
Resolved             mforns
Resolved             Ottomata
Resolved             mforns
Resolved             Ottomata
Resolved             Ottomata
Declined             None
Declined             None
Open                 bmansurov
Resolved             JAllemandou
Resolved             mforns
Resolved             Ottomata
Resolved             Ottomata
Open                 Ottomata
Duplicate            None
Resolved             Ottomata
Open                 MNeisler
Open                 SBisson
Resolved             Ottomata
Resolved             Ottomata
Resolved             Ottomata
Resolved             Ottomata
Resolved             Ottomata
Resolved             Ottomata
Resolved             Ottomata
Resolved             Ottomata
Resolved             Ottomata
Open                 jlinehan
Resolved             Ottomata
Resolved             Ottomata
Resolved             Ottomata
Open                 None
Open                 jlinehan
Open                 jlinehan
Open                 jlinehan
Resolved             mpopov
Resolved             jlinehan
Resolved             jlinehan
Open                 None
Resolved             mpopov
Declined             mpopov
Declined             mpopov
Declined             SNowick_WMF
Resolved             SNowick_WMF
Declined             nettrom_WMF
Resolved             nettrom_WMF
Declined             nshahquinn-wmf
Resolved             nshahquinn-wmf
Declined             MNeisler
Declined             jwang
Resolved             kzimmerman
Resolved             mpopov
Resolved             mpopov
Resolved             Tsevener
Resolved             Tsevener
Resolved             mpopov
Resolved  BUG REPORT Mholloway
Duplicate            None
Resolved             mpopov
Resolved             SNowick_WMF
Resolved             jlinehan
Resolved             mpopov
Resolved             Ottomata
Resolved             mforns
Resolved             mpopov
Resolved             mforns
Open                 None
Open                 mpopov
Declined             None
Open                 None
Open                 None
Open                 None
Resolved             Mholloway
Open                 None
Resolved             jlinehan
Resolved             jlinehan
Resolved             mpopov
Resolved             sdkim
Resolved             Gilles
Declined             Ottomata
Open                 None
Open                 None
Resolved             Ottomata

Event Timeline

There are a very large number of changes, so older changes are hidden.

Change 589093 merged by Ottomata:
[eventgate-wikimedia@master] Never topic prefix legacy eventlogging_.* streams

https://gerrit.wikimedia.org/r/589093
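
The rule named in this change's subject can be sketched as a small function: ordinary streams get a topic prefix, while legacy eventlogging_ streams keep their bare name so existing consumers still find them. The function name, the 'eqiad.' prefix, and the 'example.stream' name are hypothetical values for illustration, not taken from the eventgate-wikimedia code.

```javascript
// Hypothetical sketch: map a stream name to the topic it is produced to.
function topicForStream(stream, topicPrefix) {
  // Legacy EventLogging streams are never prefixed.
  if (/^eventlogging_/.test(stream)) {
    return stream;
  }
  return `${topicPrefix}${stream}`;
}

console.log(topicForStream('eventlogging_SearchSatisfaction', 'eqiad.'));
console.log(topicForStream('example.stream', 'eqiad.'));
```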

Change 589074 merged by Ottomata:
[schemas/event/secondary@master] Preserve camelCase capitalization in analytics/legacy schemas

https://gerrit.wikimedia.org/r/589074

Change 592664 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/deployment-charts@master] eventgate-analytics-external - Support backwards compatible eventlogging_ topic prefixing

https://gerrit.wikimedia.org/r/592664

Change 592664 merged by Ottomata:
[operations/deployment-charts@master] eventgate-analytics-external - Support backwards compatible eventlogging_ topic prefixing

https://gerrit.wikimedia.org/r/592664

Change 586447 merged by jenkins-bot:
[analytics/refinery/source@master] Unify Refine transform functions and add user agent parser transform

https://gerrit.wikimedia.org/r/586447

Change 592726 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/mediawiki-config@master] wgEventStreams - Add SearchSatisfaction stream config and remove beta specific overrides

https://gerrit.wikimedia.org/r/592726

Change 592726 merged by Ottomata:
[operations/mediawiki-config@master] wgEventStreams - Add SearchSatisfaction stream config and remove beta specific overrides

https://gerrit.wikimedia.org/r/592726

Change 592735 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/mediawiki-config@master] wgEventStreams - properly prefix legacy eventlogging analytics stream names with eventlogging_

https://gerrit.wikimedia.org/r/592735

Change 592739 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[analytics/refinery/source@master] RefineTarget shouldRefine should consider both table whitelist and blacklist

https://gerrit.wikimedia.org/r/592739

Change 592735 merged by Ottomata:
[operations/mediawiki-config@master] wgEventStreams - properly prefix legacy eventlogging analytics stream names with eventlogging_

https://gerrit.wikimedia.org/r/592735

Change 592756 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] refine.pp - Slight refactor to use new unified refine tranform functions

https://gerrit.wikimedia.org/r/592756

Change 592739 merged by jenkins-bot:
[analytics/refinery/source@master] RefineTarget shouldRefine should consider both table whitelist and blacklist

https://gerrit.wikimedia.org/r/592739
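
One plausible reading of "consider both table whitelist and blacklist" is sketched below: a table is refined only if it matches the whitelist (when one is set) and does not match the blacklist (when one is set). This is an assumption about the intended rule, written in JavaScript for illustration; the real RefineTarget lives in analytics/refinery/source and may differ in detail.

```javascript
// Hypothetical sketch of a combined whitelist/blacklist check.
// Both arguments are optional RegExp objects without the global flag.
function shouldRefine(table, whitelistRegex = null, blacklistRegex = null) {
  if (whitelistRegex && !whitelistRegex.test(table)) {
    return false; // a whitelist is set and the table is not on it
  }
  if (blacklistRegex && blacklistRegex.test(table)) {
    return false; // the table is explicitly excluded
  }
  return true;
}

const whitelist = /^(searchsatisfaction|templatewizard)$/;
const blacklist = /^templatewizard$/;
console.log(shouldRefine('searchsatisfaction', whitelist, blacklist));
console.log(shouldRefine('templatewizard', whitelist, blacklist));
```

With this reading, the blacklist always wins over the whitelist when a table matches both.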

Change 592756 merged by Ottomata:
[operations/puppet@production] refine.pp - Slight refactor to use new unified refine tranform functions

https://gerrit.wikimedia.org/r/592756

Change 593573 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Fix refine_event table_blacklist_regex and remove absented mediawiki_events refine job

https://gerrit.wikimedia.org/r/593573

Change 593573 merged by Ottomata:
[operations/puppet@production] Refine - fix table_blacklist_regex and remove mediawiki_events refine job

https://gerrit.wikimedia.org/r/593573

Change 593594 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Factor out RefineFailuresChecker into the refine_job define

https://gerrit.wikimedia.org/r/593594

Change 593594 merged by Ottomata:
[operations/puppet@production] Factor out RefineFailuresChecker into the refine_job define

https://gerrit.wikimedia.org/r/593594

Change 593605 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Remove absented failed_flags_ refine::jobs

https://gerrit.wikimedia.org/r/593605

Change 593605 merged by Ottomata:
[operations/puppet@production] Remove absented failed_flags_ refine::jobs

https://gerrit.wikimedia.org/r/593605

Change 593610 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] [WIP] Add eventlogging_legacy job to refine EventLogging events from EventGate

https://gerrit.wikimedia.org/r/593610

Change 594981 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[schemas/event/secondary@master] Add EventLogging legacy Test schema

https://gerrit.wikimedia.org/r/594981

Change 594981 merged by Ottomata:
[schemas/event/secondary@master] Add EventLogging legacy Test schema

https://gerrit.wikimedia.org/r/594981

Change 595025 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/mediawiki-config@master] Add eventlogging_Test to wgEventStreams config

https://gerrit.wikimedia.org/r/595025

Change 595027 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[mediawiki/extensions/WikimediaEvents@master] Configure Test event stream to be sent via EventGate

https://gerrit.wikimedia.org/r/595027

Change 595025 merged by Ottomata:
[operations/mediawiki-config@master] Add eventlogging_Test to wgEventStreams config

https://gerrit.wikimedia.org/r/595025

Change 595027 merged by jenkins-bot:
[mediawiki/extensions/WikimediaEvents@master] Configure Test event stream to be sent via EventGate

https://gerrit.wikimedia.org/r/595027

Change 595032 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/mediawiki-config@master] Set wgEventLoggingStreamNames with initial streams EventLogging is allowed to produce

https://gerrit.wikimedia.org/r/595032

Change 595032 merged by jenkins-bot:
[operations/mediawiki-config@master] Set wgEventLoggingStreamNames with initial streams EventLogging is allowed to produce

https://gerrit.wikimedia.org/r/595032

Mentioned in SAL (#wikimedia-operations) [2020-05-07T20:10:07Z] <otto@deploy1001> Synchronized wmf-config/InitialiseSettings.php: wgEventLoggingStreamNames: set initial stream names, as yet unused - T238230 (duration: 01m 07s)

Change 595047 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/mediawiki-config@master] Set wgEventLoggingServiceUri in beta and production on group0 wikis

https://gerrit.wikimedia.org/r/595047

Change 595047 merged by Ottomata:
[operations/mediawiki-config@master] Set wgEventLoggingServiceUri in beta and production on group0 wikis

https://gerrit.wikimedia.org/r/595047

Woohoo! I just logged a Test event to eventgate in beta via mw.eventLog.logEvent("Test", {"OtherMessage": "test"}) and mw.track("event.Test", {"OtherMessage": "test"})! Both work just right!

Change 595634 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/mediawiki-config@master] Configure wgEventLoggingSchemas overrides in beta and testwiki

https://gerrit.wikimedia.org/r/595634

Change 595634 merged by Ottomata:
[operations/mediawiki-config@master] Configure wgEventLoggingSchemas overrides in beta and testwiki

https://gerrit.wikimedia.org/r/595634

Change 595969 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/deployment-charts@master] eventgate - Set NODE_EXTRA_CA_CERTS

https://gerrit.wikimedia.org/r/595969

Change 595969 merged by Ottomata:
[operations/deployment-charts@master] eventgate - Set NODE_EXTRA_CA_CERTS

https://gerrit.wikimedia.org/r/595969

Change 596034 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/mediawiki-config@master] wgEventStreams and wgEventLoggingStreamNames Use +deploymentwiki for beta

https://gerrit.wikimedia.org/r/596034

Change 596034 merged by jenkins-bot:
[operations/mediawiki-config@master] wgEventStreams and wgEventLoggingStreamNames Use +deploymentwiki for beta

https://gerrit.wikimedia.org/r/596034

Change 596049 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[mediawiki/extensions/EventLogging@master] Use array_merge instead of array + when merging wgEventLoggingSchemas

https://gerrit.wikimedia.org/r/596049

Change 596049 merged by Ottomata:
[mediawiki/extensions/EventLogging@master] wgEventLoggingSchemas should override extension attributes

https://gerrit.wikimedia.org/r/596049
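
This change turns on a PHP detail: for string keys, `$a + $b` keeps the left operand's value when both arrays define a key, whereas `array_merge( $a, $b )` lets the right operand win. The JavaScript stand-ins below mimic both behaviours so the difference is visible; the variable names and the revision number 12345 are made up for illustration and say nothing about the operand order in the actual extension code.

```javascript
// Stand-in for PHP's `$a + $b`: keys already present in `a` win.
function unionKeepLeft(a, b) {
  return { ...b, ...a };
}

// Stand-in for PHP's array_merge($a, $b) with string keys: keys in `b` win.
function mergeOverride(a, b) {
  return { ...a, ...b };
}

// Hypothetical values: a schema revision declared by an extension,
// and a config override pointing at a schema repository URI.
const extensionAttributes = { Test: 12345 };
const configOverrides = { Test: '/analytics/legacy/test/1.1.0' };

console.log(unionKeepLeft(extensionAttributes, configOverrides).Test);
console.log(mergeOverride(extensionAttributes, configOverrides).Test);
```

Only the second form gives the outcome described in the merged subject line, where the wgEventLoggingSchemas config overrides what extensions declare.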

And mw.eventLog.logEvent("Test", {"OtherMessage": "test"}) works from test.wikipedia.org too! It will also work from en.wikipedia.org after this week's MW train or after https://gerrit.wikimedia.org/r/c/mediawiki/extensions/EventLogging/+/596049 is deployed, whichever comes first :)

SearchSatisfaction has been migrated to EventGate on deployment-prep beta wiki. :)

Change 601749 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[analytics/refinery/source@master] Refactor JsonSchemaLoader into JsonLoader to allow for easy loading of remote JSON blobs

https://gerrit.wikimedia.org/r/601749

Change 603591 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[analytics/refinery/source@master] Refactor JsonSchemaLoader into JsonLoader to allow for easy loading of remote JSON blobs

https://gerrit.wikimedia.org/r/603591

Change 601749 abandoned by Ottomata:
Refactor JsonSchemaLoader into JsonLoader to allow for easy loading of remote JSON blobs

Reason:
in favor of https://gerrit.wikimedia.org/r/c/analytics/refinery/source/+/603591

https://gerrit.wikimedia.org/r/601749

Change 603591 merged by Ottomata:
[analytics/refinery/source@master] Refactor JsonSchemaLoader into JsonLoader to allow for easy loading of remote JSON blobs

https://gerrit.wikimedia.org/r/603591

Migration plan:

0. Switch all refine jobs to refinery 0.0.126 and make eventlogging_analytics use event_transforms.

For each EventLogging schema:

  1. Create the /analytics/legacy/<schema_name> schema.
  2. Evolve the eventlogging table to use the new schema, e.g.
schema_name=searchsatisfaction
table="event.${schema_name}"
schema_uri="/analytics/legacy/${schema_name}/latest"

echo "Evolving $table using schema at $schema_uri"
spark2-submit \
  --conf spark.driver.extraClassPath=/usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-common.jar:/srv/deployment/analytics/refinery/artifacts/hive-jdbc-1.1.0-cdh5.10.0.jar:/srv/deployment/analytics/refinery/artifacts/hive-service-1.1.0-cdh5.10.0.jar \
  --driver-java-options='-Dhttp.proxyHost=webproxy.eqiad.wmnet -Dhttp.proxyPort=8080 -Dhttps.proxyHost=webproxy.eqiad.wmnet -Dhttps.proxyPort=8080' \
  --class org.wikimedia.analytics.refinery.job.refine.tool.EvolveHiveTable \
  /srv/deployment/analytics/refinery/refinery-job.jar \
  --table="${table}" --schema_uri="${schema_uri}"
  3. Rolling deploy mediawiki-config changes (e.g. this one) to make EventLogging produce new schema data via EventGate.
  4. Once the schema's data is fully produced through EventGate, use a Refine job that uses the schema repo instead of meta.wm.org:
    • If this is the first EventLogging table migration, merge the patch to make the new Refine eventlogging_legacy job and add the table to it.
    • Otherwise, add the table to the Refine eventlogging_legacy job.
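
The per-schema naming used in the plan above can be sketched as a small helper that derives the Hive table and schema URI from a schema name. The function name and the lowercasing step are assumptions based on the searchsatisfaction example; the two flags are the ones passed to EvolveHiveTable in the command above.

```javascript
// Hypothetical helper: derive the EvolveHiveTable arguments for one schema.
function evolveHiveTableArgs(schemaName) {
  // Assumption: table and schema path use the lowercased schema name,
  // as in the searchsatisfaction example.
  const name = schemaName.toLowerCase();
  const table = `event.${name}`;
  const schemaUri = `/analytics/legacy/${name}/latest`;
  return [`--table=${table}`, `--schema_uri=${schemaUri}`];
}

console.log(evolveHiveTableArgs('SearchSatisfaction').join(' '));
```

A helper like this would only make it easier to loop over many schemas; it does not replace reviewing each evolved table.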

Change 605955 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] refine.pp - bump refinery jar version and make eventlogging_analytics use event_transforms

https://gerrit.wikimedia.org/r/605955

Change 605989 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[analytics/refinery/source@master] event_transforms - Set legacy eventlogging ip field if it exists

https://gerrit.wikimedia.org/r/605989

Change 605989 merged by Ottomata:
[analytics/refinery/source@master] event_transforms - Set legacy eventlogging ip field if it exists

https://gerrit.wikimedia.org/r/605989

Change 605955 merged by Ottomata:
[operations/puppet@production] refine.pp - bump version and make eventlogging_analytics use event_transforms

https://gerrit.wikimedia.org/r/605955

Mentioned in SAL (#wikimedia-analytics) [2020-06-16T19:41:43Z] <ottomata> bumping Refine refinery jar version to 0.0.127 - T238230

Mentioned in SAL (#wikimedia-operations) [2020-06-19T18:10:07Z] <otto@deploy1001> Synchronized wmf-config/InitialiseSettings.php: Bump eventlogging_Test schema version to 1.1.0 to pick up client_dt - T238230 (duration: 00m 59s)

Change 607017 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/mediawiki-config@master] Set wgEventLoggingServiceUri for all wikis

https://gerrit.wikimedia.org/r/607017

Change 607017 merged by Ottomata:
[operations/mediawiki-config@master] Set wgEventLoggingServiceUri for all wikis

https://gerrit.wikimedia.org/r/607017

Mentioned in SAL (#wikimedia-operations) [2020-06-22T13:19:27Z] <otto@deploy1001> Synchronized wmf-config/InitialiseSettings.php: Bump eventlogging_Test schema version to 1.1.0 to pick up client_dt and set wgEventLoggingServiceUri for all wikis - T238230 (duration: 00m 58s)

@Samwilson @Niharika Hello!

I'm looking for a candidate EventLogging schema stream to migrate to EventGate. The migration should be 100% backwards compatible. I was using SearchSatisfaction as my candidate schema, but on Friday I made a mistake and lost some data while doing the migration. This was user error on my part.

I'd like to try again, but before I do, I would like to prove that it works for a lower-volume data stream. TemplateWizard looks like a good candidate. Would you mind if I used it as a guinea pig? I don't expect any issues (but I didn't last week either). No worries if you do mind; I can keep looking for a different candidate.

Thank you!

I think it'd be fine to use TemplateWizard logging as a guinea pig. I don't think anyone's doing much with the data at the moment.

Change 607333 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/mediawiki-config@master] Migrate TemplateWizard from EventLogging to EventGate

https://gerrit.wikimedia.org/r/607333

Change 607333 merged by Ottomata:
[operations/mediawiki-config@master] Migrate TemplateWizard from EventLogging to EventGate on group0

https://gerrit.wikimedia.org/r/607333

Mentioned in SAL (#wikimedia-operations) [2020-06-23T18:53:36Z] <otto@deploy1001> Synchronized wmf-config/InitialiseSettings.php: Migrate TemplateWizard from EventLogging to EventGate on group0 - T238230 (duration: 01m 06s)

Change 607346 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/mediawiki-config@master] Migrate TemplateWizard from EventLogging to EventGate on all wikis

https://gerrit.wikimedia.org/r/607346

Change 607349 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[schemas/event/secondary@master] Add simple script to help converting EventLogging metawiki schemas

https://gerrit.wikimedia.org/r/607349

Change 607346 merged by Ottomata:
[operations/mediawiki-config@master] Migrate TemplateWizard from EventLogging to EventGate on all wikis

https://gerrit.wikimedia.org/r/607346

Mentioned in SAL (#wikimedia-operations) [2020-06-23T19:16:32Z] <otto@deploy1001> Synchronized wmf-config/InitialiseSettings.php: Migrate TemplateWizard from EventLogging to EventGate on all wikis - T238230 (duration: 01m 05s)

Change 607349 merged by Ottomata:
[schemas/event/secondary@master] Add simple script to help converting EventLogging metawiki schemas

https://gerrit.wikimedia.org/r/607349

Mentioned in SAL (#wikimedia-operations) [2020-06-23T20:31:22Z] <otto@deploy1001> Synchronized wmf-config/InitialiseSettings.php: Migrate TemplateWizard from EventLogging to EventGate on all wikis - take 2 - T238230 (duration: 01m 06s)

Change 607520 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/mediawiki-config@master] Migrate SearchSatisfaction from EventLogging to EventGate on group1

https://gerrit.wikimedia.org/r/607520

Change 607520 merged by Ottomata:
[operations/mediawiki-config@master] Migrate SearchSatisfaction from EventLogging to EventGate on group1

https://gerrit.wikimedia.org/r/607520

Something I've overlooked:

Camus's eventlogging job uses the dt field for hourly partitioning. As we move events to EventGate, dt will now be set by EventLogging on the client side, which means it will use the browser's time, which is untrustworthy. I don't know what can be done about this during the incremental rollout. E.g. right now SearchSatisfaction -> EventGate is deployed to only group0 wikis, so those have dt set by browsers, whereas all the others have dt set by eventlogging-processor. This could cause weird partitioning errors where data is written to Camus partitions long after (or before) the current time.

As long as the browser dt isn't too far off (within 28 hours should be ok I think), then the data will be noticed by Refine and re-ingested. Once a schema is fully migrated to EventGate, we can configure it to be ingested by a Camus job that uses meta.dt instead of dt.
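
To make the concern concrete, the sketch below computes an hourly partition key from a dt value and checks whether a browser-supplied dt falls within the roughly 28 hour window mentioned above. The partition string format and the function names are illustrative assumptions, not how Camus or Refine actually lay out or scan data.

```javascript
const HOUR_MS = 60 * 60 * 1000;

// Illustrative hourly partition key (UTC) for a timestamp.
function hourlyPartition(dt) {
  const d = new Date(dt);
  const pad = (n) => String(n).padStart(2, '0');
  return [
    `year=${d.getUTCFullYear()}`,
    `month=${pad(d.getUTCMonth() + 1)}`,
    `day=${pad(d.getUTCDate())}`,
    `hour=${pad(d.getUTCHours())}`,
  ].join('/');
}

// True if the client-supplied dt is within toleranceHours of the time
// the event was actually received, in either direction.
function withinTolerance(clientDt, receivedDt, toleranceHours = 28) {
  const skewMs = Math.abs(new Date(receivedDt) - new Date(clientDt));
  return skewMs <= toleranceHours * HOUR_MS;
}

const received = '2020-06-23T18:53:36Z';
console.log(hourlyPartition(received));
console.log(withinTolerance('2020-06-22T18:00:00Z', received));
console.log(withinTolerance('2020-06-20T00:00:00Z', received));
```

An event whose browser clock is several days off would land in a partition outside the window and would fail the check.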

Ooof, but you can easily have outliers with offline features and buffered events sent in batch. The way goblin deals with late arrivals is cool, no?

Ah, for the most part, we won't be using the client's time for partitioning; it's only during this incremental rollout that things are weird.

Change 593610 merged by Ottomata:
[operations/puppet@production] Add eventlogging_legacy Refine job for events migrated to EventGate

https://gerrit.wikimedia.org/r/c/operations/puppet/+/593610

scripts/eventlogging_legacy_schema_convert.js

Is this script just run via node in the repo where we store the schemas?

Change 649594 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/extensions/TemplateWizard@master] Switch event to use the new platform

https://gerrit.wikimedia.org/r/649594

Change 650093 had a related patch set uploaded (by Awight; owner: Awight):
[operations/mediawiki-config@master] Migrate TemplateWizard to full "new" events

https://gerrit.wikimedia.org/r/650093

Change 649594 merged by jenkins-bot:
[mediawiki/extensions/TemplateWizard@master] Switch event to explicitly use the new platform

https://gerrit.wikimedia.org/r/649594

Change 650093 abandoned by Awight:
[operations/mediawiki-config@master] Migrate TemplateWizard to full "new" events

Reason:

https://gerrit.wikimedia.org/r/650093