
Migrate legacy metawiki schemas to Event Platform
Open, High, Public

Description

This task will track the migration of EventLogging schemas & streams to Event Platform schemas.

Tracking and planning of what schemas to migrate is being done in the EventLogging Schema Migration Audit spreadsheet.

Explanation of what this means for legacy EventLogging schema owners:
https://wikitech.wikimedia.org/wiki/Event_Platform/EventLogging_legacy

We will do our best to migrate schemas in groups associated with teams.

• Schemas produced by the EventLogging extension

• Schemas produced by other software


Migration plan for a schema:

1. Pick a schema to migrate
Schemas to migrate are listed in the EventLogging Schema Migration Audit spreadsheet.

2. Create a new task to track this schema's migration

# This should work on macOS to open a new Phab Task form in a browser with some fields already filled out.
function new_el_migration_phab_task() {
    schema_name="$1"  
    open "https://phabricator.wikimedia.org/maniphest/task/edit/form/1/?title=$schema_name Event Platform Migration&description=See: https://wikitech.wikimedia.org/wiki/Event_Platform/EventLogging_legacy

Unless otherwise notified, client IP and consequently geocoded data will no longer be collected for this event data after this migration. Please let us know if this should continue to be captured.  See also T262626.& &parent=259163&tags=Event-Platform&subscribers=Ottomata,Mforns"
}

new_el_migration_phab_task SearchSatisfaction

Paste this checklist into the new task. This checklist is a summary of the instructions described here.

  • 1. Pick a schema to migrate
  • 2. Create a new task to track this schema's migration
  • 3. Create /analytics/legacy/ schema
  • 4. Edit-protect the metawiki Schema page at https://meta.wikimedia.org/wiki/Schema:<SchemaName>
  • 5. Manually evolve the Hive table to use new schema
  • 6. Add entry to wgEventStreams, wgEventLoggingStreamNames and wgEventLoggingSchemas in operations/mediawiki-config
  • 7. Once the legacy stream's data is fully produced through EventGate, switch to using Refine job that uses schema repo instead of meta.wm.org
  • 8. Edit the producer extension.json and set EventLoggingSchemas to the new schema URI
  • 9. Once the producer extension.json is fully deployed, edit wgEventLoggingSchemas in operations/mediawiki-config InitialiseSettings.php and remove the schema's entry.
  • 10. Mark the schema as migrated in the EventLogging Schema Migration Audit spreadsheet

Link this task in the EventLogging Schema Migration Audit spreadsheet.

On the task, contact the owner of the schema and ask if they need client IP and/or geocoded data in the Hive table.

3. Create /analytics/legacy/<schemaname>/current.yaml schema

Using the eventlogging_legacy_schema_convert script in the schemas/event/secondary repository:

old_schema_name=SearchSatisfaction
new_schema_name=$(echo $old_schema_name | tr '[:upper:]' '[:lower:]')
mkdir ./jsonschema/analytics/legacy/$new_schema_name
node ./scripts/eventlogging_legacy_schema_convert.js $old_schema_name > ./jsonschema/analytics/legacy/$new_schema_name/current.yaml

You'll need to edit at least the JSONSchema examples in current.yaml. The easiest thing to do is to get an event out of Kafka and use that as a starting point.

# Get the last event out of Kafka
kafkacat -C -b kafka-jumbo1001.eqiad.wmnet -o -1 -c 1 -t eventlogging_SearchSatisfaction

If the schema owner indicated that they need client IP and/or geocoded data in Hive, you'll need to add a $ref to the fragment/http/client_ip schema. Example here.
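
A minimal sketch of what that addition might look like in current.yaml (the fragment paths and versions below are assumptions; copy the exact $ref from the linked example):

# Sketch only: add the client_ip fragment next to the legacy eventcapsule fragment
# so that client IP / geocoded fields are retained for this schema.
allOf:
  - $ref: /fragment/analytics/legacy/eventcapsule/1.0.0
  - $ref: /fragment/http/client_ip/1.0.0
  - properties:
      event:
        type: object
        properties:
          # ... the schema's own event fields, as generated by the convert script ...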

(!) Please make sure that npm test is not throwing any errors other than snake_case inconsistencies (those are allowed for legacy schemas).
To do that, you need to first comment out L21 in test/jsonschema/repository.test.js, and then run npm test for your new schema to be checked.
When done, remember to revert test/jsonschema/repository.test.js to its original state before committing!
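
For example, a rough sketch of that workflow (the line number will drift as the test file changes):

# Temporarily comment out the snake_case consistency check, run the tests, then restore the file.
sed -i.bak '21s|^|// |' test/jsonschema/repository.test.js
npm test
mv test/jsonschema/repository.test.js.bak test/jsonschema/repository.test.js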

4. Edit-protect the metawiki Schema page at https://meta.wikimedia.org/wiki/Schema:$old_schema_name

Use this as the edit-protect log message:

This schema has been moved to https://schema.wikimedia.org/#!//secondary/jsonschema/analytics/legacy. See also https://wikitech.wikimedia.org/wiki/Event_Platform/EventLogging_legacy

5. Manually evolve the Hive table to use new schema
Once the above schema is merged (you might have to wait 10 minutes after merge for Spark to be able to retrieve it):

old_schema_name=SearchSatisfaction
new_schema_name=$(echo $old_schema_name | tr '[:upper:]' '[:lower:]')
table="event.${new_schema_name}"
schema_uri="/analytics/legacy/${new_schema_name}/latest"

# First run in dry-run mode (the default) to see what EvolveHiveTable will do.
spark2-submit --conf spark.driver.extraClassPath=/usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-common.jar:/srv/deployment/analytics/refinery/artifacts/hive-jdbc-1.1.0-cdh5.10.0.jar:/srv/deployment/analytics/refinery/artifacts/hive-service-1.1.0-cdh5.10.0.jar --class org.wikimedia.analytics.refinery.job.refine.tool.EvolveHiveTable  /srv/deployment/analytics/refinery/artifacts/refinery-job-shaded.jar --table="${table}" --schema_uri="${schema_uri}"

# If that looks good, evolve the table:
spark2-submit --conf spark.driver.extraClassPath=/usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-common.jar:/srv/deployment/analytics/refinery/artifacts/hive-jdbc-1.1.0-cdh5.10.0.jar:/srv/deployment/analytics/refinery/artifacts/hive-service-1.1.0-cdh5.10.0.jar --class org.wikimedia.analytics.refinery.job.refine.tool.EvolveHiveTable  /srv/deployment/analytics/refinery/artifacts/refinery-job-shaded.jar --table="${table}" --schema_uri="${schema_uri}" --dry_run=false
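
Optionally, sanity-check the evolved table afterwards (a sketch, assuming the Hive CLI is available on your analytics client; beeline works too):

# Confirm the new columns show up on the evolved table
hive -e "DESCRIBE ${table};"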

6. Add entry to wgEventStreams, wgEventLoggingStreamNames and wgEventLoggingSchemas in operations/mediawiki-config
Do a rolling deploy of the changes to make the EventLogging extension produce data to EventGate. Example: https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/607333/2/wmf-config/InitialiseSettings.php
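
The entries look roughly like the following sketch (key names like schema_title and destination_event_service are taken from existing stream configs, but the exact structure and placement should be copied from the linked Gerrit example):

// wmf-config/InitialiseSettings.php (sketch only)
'wgEventStreams' => [
    'default' => [
        // ...
        [
            'stream' => 'eventlogging_SearchSatisfaction',
            'schema_title' => 'analytics/legacy/searchsatisfaction',
            'destination_event_service' => 'eventgate-analytics-external',
        ],
    ],
],
'wgEventLoggingStreamNames' => [
    'default' => [
        // ...
        'eventlogging_SearchSatisfaction',
    ],
],
'wgEventLoggingSchemas' => [
    'default' => [
        // ...
        'SearchSatisfaction' => '/analytics/legacy/searchsatisfaction/latest',
    ],
],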

To test that events make it through the pipeline:
Check that events for your stream are still flowing through in this Grafana dashboard, or by consuming the eventlogging_$old_schema_name topic from Kafka.
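
For example, using the same kafkacat invocation as in step 3:

# Consume the most recent event to confirm the stream is still flowing
kafkacat -C -b kafka-jumbo1001.eqiad.wmnet -o -1 -c 1 -t eventlogging_SearchSatisfaction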

You can submit an event via a browser developer console:

old_schema_name = 'Test';
event = { "OtherMessage": "Hello from JS" }; // example event
mw.eventLog.logEvent(old_schema_name, event);

Or, for a server-side PHP EventLogging event, you can emit a test event with mwscript shell.php on deployment.eqiad.wmnet:

cd /srv/mediawiki-staging
mwscript shell.php --wiki testwiki

>>> $old_schema_name = 'Test';
>>> $event = [ "OtherMessage" => "Hello from PHP" ];
>>> EventLogging::logEvent( $old_schema_name, -1, $event );

7. Once the legacy stream's data is fully produced through EventGate, switch to using Refine job that uses schema repo instead of meta.wm.org

Also add the SchemaName to the eventlogging-processor disabled schemas list in puppet, in modules/eventlogging/files/plugins.py.

Example: https://gerrit.wikimedia.org/r/c/operations/puppet/+/644259

This will prevent eventlogging-processor from producing what are now invalid legacy events from clients that are running old code.

Restart eventlogging-processor on eventlog1003:

sudo puppet agent -t  # make sure the change has been applied
sudo service eventlogging-processor@client-side-* restart

8. Edit the producer extension.json and set EventLoggingSchemas to the new schema URI
Example: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/ContentTranslation/+/639578
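
The resulting attribute ends up looking roughly like this sketch (the schema name and URI are illustrative; see the linked ContentTranslation change for a real diff):

"attributes": {
    "EventLogging": {
        "Schemas": {
            "SearchSatisfaction": "/analytics/legacy/searchsatisfaction/latest"
        }
    }
}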

9. Once the producer extension.json is fully deployed, edit wgEventLoggingSchemas in operations/mediawiki-config InitialiseSettings.php and remove the schema's entry.
Example: https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/639579

10. Mark the schema as migrated in the EventLogging Schema Migration Audit spreadsheet

Details

Other Assignee: mforns
Related patches (Repo | Branch | Lines +/-):
operations/puppet | production | +4 -0
operations/mediawiki-config | master | +0 -2
operations/mediawiki-config | master | +0 -0
operations/mediawiki-config | master | +1 -4
mediawiki/extensions/EventLogging | wmf/1.36.0-wmf.18 | +7 -7
mediawiki/extensions/EventLogging | wmf/1.36.0-wmf.16 | +7 -7
mediawiki/extensions/EventLogging | wmf/1.36.0-wmf.16 | +7 -7
mediawiki/extensions/EventLogging | master | +7 -7
operations/deployment-charts | master | +231 -231
eventgate-wikimedia | master | +106 -5
operations/mediawiki-config | master | +11 -0
mediawiki/extensions/ContentTranslation | master | +1 -1
operations/puppet | production | +1 -0
schemas/event/secondary | master | +64 -51
schemas/event/secondary | master | +0 -3
schemas/event/secondary | master | +505 -0
schemas/event/secondary | master | +3 -3
operations/puppet | production | +1 -0
operations/mediawiki-config | master | +1 -3
operations/mediawiki-config | master | +11 -0
schemas/event/secondary | master | +487 -1
operations/mediawiki-config | master | +0 -1


Event Timeline

There are a very large number of changes, so older changes are hidden.
Ottomata updated Other Assignee, added: mforns.

Change 699786 had a related patch set uploaded (by Ottomata; author: Ottomata):

[operations/puppet@production] Finalize backend migration of CentralNotice EL schemas

https://gerrit.wikimedia.org/r/699786

Change 699786 merged by Ottomata:

[operations/puppet@production] Finalize backend migration of CentralNotice EL schemas

https://gerrit.wikimedia.org/r/699786

awight updated the task description.

Removing task assignee due to inactivity as this open task has been assigned for more than two years. See the email sent to the task assignee on August 22nd, 2022.
Please assign this task to yourself again if you still realistically [plan to] work on this task - it would be welcome!
If this task has been resolved in the meantime, or should not be worked on ("declined"), please update its task status via "Add Action… 🡒 Change Status".
Also see https://www.mediawiki.org/wiki/Bug_management/Assignee_cleanup for tips on how to best manage your individual work in Phabricator. Thanks!

Keeping assigned to me. We are hunting down remainders in T282131, and then creating subtasks for each one. Official tracking is happening in the audit spreadsheet (linked in task description).

Almost all old EventLogging schemas have been (or are being) migrated. The remaining ones are mostly MobileWikiApp* ones. Here's a grafana chart showing the remaining eventlogging_MobileWikiApp.* topics with throughput. The line chart is messages/second, the table sums the total number of messages seen over the selected time period. The streams with the most events are MobileWikiAppSessions and MobileWikiAppLinkPreview.

IIUC, we are still waiting for these to be migrated to Metrics Platform.

@Sharvaniharan @SNowick_WMF @WDoranWMF @VirginiaPoundstone is this correct? Is this data still actively used?

Can we set a deadline for this? We've been trying to turn off legacy EventLogging systems for over 3 years now.

@Ottomata I do not use any of those schemas, we have migrated all our data to Metrics Platform but there may have been some issues with one or two migrations that I need to close out on my end. I will give you a deadline when I have confirmed the roster of migration tickets - sorry to have left you waiting.

I'd like to help y'all get closer to the goal. Shay and I are going to check in on this and follow up with engineering (apps + Metrics Platform) if there's anything left to do.

There's a chance all the events are from users of old versions of the apps that they haven't updated yet (and maybe never will).

In previous conversations (August 2020) with the iOS and Android teams (back in BUOD days), the apps teams requested ~6 months of lead time:

~6 months of lead time before shutdown
iOS: Most iOS users upgrade within a few weeks, within 2 versions; after a quarter or so we won't care about legacy data, but would want some objective trigger (like a percentage of traffic). We don't see any urgency to turn this off.
Android: When we release a new version, uptake takes a week or so to reach 50%, but takes many months to reach 90%.

Depending on when the migration to Event Platform wrapped up, we can probably just forget about the legacy instruments still out there.

@Ottomata
Thank you for looping me in.
@SNowick_WMF, @mpopov Thank you for summarizing our update issues with the migration. Our migration was fully completed more than a year ago, so we are good on the timeline, and we can ignore any data coming into those legacy schemas. However, I was curious to learn: in the interest of users who were not able to update or did not want to update, will the legacy schemas be deleted, or just ignored and allowed to naturally trickle down?

@Sharvaniharan: once the legacy EL system is turned off, the tables associated with non-migrated legacy EL schemas will stop getting data. The data in any tables allowlisted in the event sanitization process will be copied & sanitized appropriately while the data is purged per the 90 day retention policy. Approximately 90 days after the system is turned off those tables will become empty.

@Ottomata: I'm curious: will empty tables for legacy EL schemas (that were not migrated) get dropped?

will empty tables for legacy EL schemas (that were not migrated) get dropped?

Only if we do so manually.

there may have been some issues with one or two migrations that I need to close out on my end. I will give you a deadline when I have confirmed the roster of migration tickets - sorry to have left you waiting.

Wow this is great news. @SNowick_WMF, I created T353014: Decommission all legacy EventLogging MobileWikiApp* schemas to track this. Please comment there with your deadline and confirmation, and we will proceed! Thank you!

Actually... I just updated the EventLogging Audit Spreadsheet, and it seems that if we decom the MobileWikiApp* schemas, there will be ONLY ONE remaining schema to migrate - T323828: Update Pingback to use the Event Platform.

I'm going to close the T353014 task I just created, and instead use the parent task to TOTALLY DECOMMISSION LEGACY EVENTLOGGING!!!!!!

@SNowick_WMF so um, just confirm here in this task and we will proceed!

Decommissioning EventLogging would be EPIC!

Confirming here, thank you for your work on this @Ottomata and congrats on the decom.

@SNowick_WMF, are the latest versions of the apps still sending the various MobileApp* events? I see a few events coming in, but maybe those are just from old versions?

@Ottomata: Removing task assignee as this open task has been assigned for more than two years - see the email sent to the task assignee on October 11th.
Please assign this task to yourself again if you still realistically [plan to] work on this task - it would be welcome! :)
If this task has been resolved in the meantime, or should not be worked on by anybody ("declined"), please update its task status via "Add Action… 🡒 Change Status".
Also see https://www.mediawiki.org/wiki/Bug_management/Assignee_cleanup for tips on how to best manage your individual work in Phabricator. Thanks!

I hope to find time for this again in the new year.