
Migrate legacy metawiki schemas to Event Platform
Open, Needs Triage, Public

Description

This task tracks the migration of EventLogging schemas and streams to Event Platform schemas.

Which schemas to migrate is being tracked and planned in the Schema Migration Audit spreadsheet.


Migration plan for a schema:
  • Manually evolve the Hive table to use the new schema:
schema_name=searchsatisfaction
table="event.${schema_name}"
schema_uri="/analytics/legacy/${schema_name}/latest"

echo "Evolving $table using schema at $schema_uri"
spark2-submit --conf spark.driver.extraClassPath=/usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-common.jar:/srv/deployment/analytics/refinery/artifacts/hive-jdbc-1.1.0-cdh5.10.0.jar:/srv/deployment/analytics/refinery/artifacts/hive-service-1.1.0-cdh5.10.0.jar --driver-java-options='-Dhttp.proxyHost=webproxy.eqiad.wmnet -Dhttp.proxyPort=8080 -Dhttps.proxyHost=webproxy.eqiad.wmnet -Dhttps.proxyPort=8080' --class org.wikimedia.analytics.refinery.job.refine.tool.EvolveHiveTable /srv/deployment/analytics/refinery/refinery-job.jar --table="${table}" --schema_uri="${schema_uri}"
  • Edit the producer extension.json and set EventLoggingSchemas to the new schema URI.
    • Once this change is fully deployed, edit wgEventLoggingSchemas in InitialiseSettings.php and remove the schema's entry.
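
When several schemas are queued for migration, it may help to derive the table name and schema URI for each one up front and review them before running the evolve step. The following is a minimal dry-run sketch, not part of the original plan. It assumes the legacy schema name is simply the lowercased EventLogging schema name (as with SearchSatisfaction and searchsatisfaction above); the helper function names are hypothetical, and the MediaViewer entry is only illustrative.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print the Hive table and schema URI that the evolve step
# would use for each schema. Nothing is submitted to Spark here.
# ASSUMPTION: the legacy name is the lowercased EventLogging schema name.

# Hypothetical helper: lowercase an EventLogging schema name.
legacy_name() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]'
}

# Hypothetical helper: Hive table name, following the event.${schema_name} pattern.
hive_table_for() {
  printf 'event.%s' "$(legacy_name "$1")"
}

# Hypothetical helper: schema URI, following the /analytics/legacy/${schema_name}/latest pattern.
schema_uri_for() {
  printf '/analytics/legacy/%s/latest' "$(legacy_name "$1")"
}

# Illustrative list only; the real list lives in the audit spreadsheet.
for schema in SearchSatisfaction MediaViewer; do
  echo "Would evolve $(hive_table_for "$schema") using schema at $(schema_uri_for "$schema")"
done
```

Once the printed values look right, each pair can be passed to the spark2-submit command above as --table and --schema_uri.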

Event Timeline

Ottomata created this task. Jul 29 2020, 5:15 PM
Ottomata updated the task description. Jul 29 2020, 6:01 PM
jlinehan moved this task from Goal Backlog to Goals on the Product-Infrastructure-Data board.
mforns moved this task from Incoming to Event Platform on the Analytics board. Aug 10 2020, 3:34 PM

Change 620051 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/mediawiki-config@master] Remove SearchSatisfaction from wgEventLoggingSchemas

https://gerrit.wikimedia.org/r/620051

Ottomata updated the task description. Aug 13 2020, 3:15 PM

Change 620051 merged by Ottomata:
[operations/mediawiki-config@master] Remove SearchSatisfaction from wgEventLoggingSchemas

https://gerrit.wikimedia.org/r/620051

Mentioned in SAL (#wikimedia-operations) [2020-08-17T16:23:06Z] <otto@deploy1001> Synchronized wmf-config/InitialiseSettings.php: wgEventLoggingSchemas - remove unneeded override for SearchSatisfaction - T259163 (duration: 00m 56s)

Ottomata updated the task description. Aug 17 2020, 5:03 PM
Ottomata updated the task description. Aug 17 2020, 5:47 PM
Ottomata updated the task description. Aug 17 2020, 5:52 PM

Thanks @Ottomata this is great! I'd like to set something up this week to go down the list and identify owner teams, creation/last edited date, etc. to help triage. Does that sound good or were you going to modify the table more?

Sounds great! I mostly did this so I could find one or two easy ones to start doing. I just created T260582: Migrate EventLogging MediaViewer data to Event Platform, maybe that one will be simple?

Hm @jlinehan, maybe we should start by identifying schemas that don't need migrated, and that we can totally deactivate ASAP. I betcha we could prune our list down quite a lot, eh?

Ottomata moved this task from Backlog to Next Up on the Event-Platform board. Aug 25 2020, 4:06 PM
Ottomata added a subscriber: sdkim. Aug 25 2020, 7:27 PM
Ottomata updated the task description. Thu, Sep 3, 2:54 PM