
Migrate Growth EventLogging schemas to Event Platform
Open, Needs Triage · Public

Description

Migrate these schemas following the process described in T259163:

  • NewcomerTask*
  • HomepageVisit
  • HomepageModule
  • HelpPanel
  • ServerSideAccountCreation

(*) Migration for NewcomerTask had already started on T259163 before this task was created. Some patches for it are assigned to that task.

See also: https://wikitech.wikimedia.org/wiki/Event_Platform/EventLogging_legacy


Patches for the creation of schemas:

Details

Project                                  Branch       Lines +/-   Subject
mediawiki/extensions/GrowthExperiments   master       +3 -3
operations/mediawiki-config              master       +10 -0
schemas/event/secondary                  master       +1 -4
schemas/event/secondary                  master       +436 -5
schemas/event/secondary                  master       +0 -6
schemas/event/secondary                  master       +518 -0
operations/mediawiki-config              master       +0 -1
operations/puppet                        production   +0 -2
operations/puppet                        production   +8 -0
operations/mediawiki-config              master       +3 -5
operations/mediawiki-config              master       +1 -3
operations/mediawiki-config              master       +27 -0
operations/mediawiki-config              master       +9 -0
operations/mediawiki-config              master       +9 -0
operations/mediawiki-config              master       +9 -0
operations/mediawiki-config              master       +10 -1
schemas/event/secondary                  master       +518 -0
schemas/event/secondary                  master       +1 K -0
schemas/event/secondary                  master       +615 -0

Event Timeline

mforns created this task. · Nov 5 2020, 3:05 PM
Restricted Application added a subscriber: Aklapper. · Nov 5 2020, 3:05 PM

Change 639539 had a related patch set uploaded (by Mforns; owner: Mforns):
[operations/mediawiki-config@master] Migrate EventLogging NewcomerTask to EventGate on all wikis

https://gerrit.wikimedia.org/r/639539

@nettrom_WMF Should EditorJourney and ServerSideAccountCreation also be grouped in with these?

Ottomata updated the task description. · Nov 5 2020, 5:19 PM

@Ottomata : It would be helpful to have ServerSideAccountCreation grouped with these, I've updated the task description to reflect that.

EditorJourney is currently inactive; we turned it off in T252391 (I've now updated the schema description on meta). I'm unsure whether we should migrate it as a legacy schema, or keep it inactive and instead spin it up as an Event Platform schema when we next need it. I suspect we'll be needing it in Q3 for T216668. Does that sound about right, @MMiller_WMF?

Maybe @Ottomata and @kostajh can discuss the engineering cost/benefits for the options of keeping it legacy or migrating to MEP and make a decision on that?

Maybe @Ottomata and @kostajh can discuss the engineering cost/benefits for the options of keeping it legacy or migrating to MEP and make a decision on that?

The costs are small; we just don't want to migrate things that are unused. Our main goal is to be able to turn off the deprecated eventlogging backend, and we can't do that until all data currently flowing that way is either disabled or moved to EventGate. We could choose not to migrate it now, and if you want to turn it back on we can always migrate it later! We'd just have to do so before you turn it back on.

mforns added a comment. · Nov 5 2020, 6:05 PM

Also, @nettrom_WMF, can you confirm whether we should migrate all those at the exact same time, or just migrate them close enough? This is in reference to @Ottomata's comment in the email thread.

The team has three other schemas in use that are related to NewcomerTask (HomepageVisit, HomepageModule, and HelpPanel). I think all of these should migrate together since they're frequently used together in reporting. It would be convenient to track all of them in a single phab task. I can create that as a subtask unless there are good reasons not to.

We can do that! I hadn't created any Phab tasks because (1) the data will not change for users and (2) I didn't want to make so many tasks! But we can indeed create one for these, and we can indeed switch them together. Do you mean you would like us to flip the switch for these all at the exact same time, or just that they should be migrated as a group close together? We can do either!

Thanks!

kzimmerman moved this task from Triage to Tracking on the Product-Analytics board. · Nov 5 2020, 6:38 PM

Change 639611 had a related patch set uploaded (by Mforns; owner: Mforns):
[schemas/event/secondary@master] Add HomepageVisit schema to analytics legacy

https://gerrit.wikimedia.org/r/639611

mforns updated the task description. · Nov 5 2020, 8:06 PM

Also, @nettrom_WMF, can you confirm whether we should migrate all those at the exact same time, or just migrate them close enough?

Thanks for looping in that part of the conversation, @mforns! The main usage of these schemas is @MMiller_WMF's weekly reporting, which, if I remember correctly, he does some time on Mondays (Pacific time). So I don't think it's necessary to migrate them all at the exact same time, only within the same week. That way, we can make one set of updates to Marshall's notebook (reusing T255517 for it) and get the next week's reports going.

mforns added a comment. · Edited · Nov 6 2020, 5:24 PM

Thanks for the clarification @nettrom_WMF, will work on that once we agree on T240460.

Ottomata moved this task from Incoming to Event Platform on the Analytics board. · Nov 16 2020, 4:42 PM
Ottomata updated the task description. · Nov 23 2020, 7:46 PM
Ottomata updated the task description.

Hiya @nettrom_WMF, we'd like to migrate these schemas during the week of Dec 7 - Dec 11. Let us know if you have questions or objections! :)

EditorJourney is currently inactive

Let's not migrate this then. We can always do so if you want to turn it back on in the future. I'm marking it as Deprecate in the EventLogging Schema Audit spreadsheet, FYI @kostajh.

@nettrom_WMF I may have already asked you this elsewhere, but I'll ask again here so we have an officially documented answer.

Do any of these event streams need client IP and/or geocoded data? If not, it will be removed as part of this migration.

@nettrom_WMF I may have already asked you this elsewhere, but I'll ask again here so we have an officially documented answer.

Do any of these event streams need client IP and/or geocoded data? If not, it will be removed as part of this migration.

It was mentioned on Slack, but it's good to have it documented here, so I appreciate you bringing it up again! None of these streams needs client IP or geocoded data, so removing it is good.

I'm following up on the proposed migration window with the Growth team. I don't expect any issues with it, but will return with concerns if there are any.
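The removal of client IP and geocoded data agreed on above can be pictured with a minimal Python sketch. It is only an illustration of the idea, not the Event Platform's actual implementation; the field names `client_ip` and `geocoded_data` and the function `strip_sensitive` are hypothetical.

```python
# Hypothetical field names, used only for illustration; the real
# pipeline may name or nest these fields differently.
SENSITIVE_FIELDS = ("client_ip", "geocoded_data")


def strip_sensitive(event: dict) -> dict:
    """Return a copy of the event without the top-level sensitive fields.

    The input event is left unmodified; nested values are not inspected.
    """
    return {key: value for key, value in event.items()
            if key not in SENSITIVE_FIELDS}
```

Streams that did need this data would have to opt in explicitly before the migration, which is why the question was asked on the task.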

mforns moved this task from Next Up to In Progress on the Analytics-Kanban board. · Nov 30 2020, 4:05 PM

Hiya @nettrom_WMF, we'd like to migrate these schemas during the week of Dec 7 - Dec 11. Let us know if you have questions or objections! :)

I've run that plan by the team and there have been no objections; you can go ahead with the scheduled migration!

Change 647374 had a related patch set uploaded (by Mforns; owner: Mforns):
[schemas/event/secondary@master] Add HelpPanel schema to analytics legacy

https://gerrit.wikimedia.org/r/647374

Change 647377 had a related patch set uploaded (by Mforns; owner: Mforns):
[schemas/event/secondary@master] Add ServerSideAccountCreation to analytics legacy

https://gerrit.wikimedia.org/r/647377

Change 647382 had a related patch set uploaded (by Mforns; owner: Mforns):
[operations/mediawiki-config@master] Migrate HelpPanel schema from EventLogging to EventGate

https://gerrit.wikimedia.org/r/647382

Change 647383 had a related patch set uploaded (by Mforns; owner: Mforns):
[operations/mediawiki-config@master] Migrate HomepageModule schema from EventLogging to EventGate

https://gerrit.wikimedia.org/r/647383

Change 647384 had a related patch set uploaded (by Mforns; owner: Mforns):
[operations/mediawiki-config@master] Migrate HomepageVisit schema from EventLogging to EventGate

https://gerrit.wikimedia.org/r/647384

Change 647386 had a related patch set uploaded (by Mforns; owner: Mforns):
[operations/mediawiki-config@master] Migrate ServerSideAccountCreation schema from EventLogging to EventGate

https://gerrit.wikimedia.org/r/647386

Change 639611 merged by Mforns:
[schemas/event/secondary@master] Add HomepageVisit schema to analytics legacy

https://gerrit.wikimedia.org/r/639611

Change 647374 merged by Mforns:
[schemas/event/secondary@master] Add HelpPanel schema to analytics legacy

https://gerrit.wikimedia.org/r/647374

Change 647377 abandoned by Mforns:
[schemas/event/secondary@master] Add ServerSideAccountCreation to analytics legacy

Reason:
Given we have to wait for changes in the PHP client, which will be paused for a bit, I'm abandoning this change to not cause any confusion. Let's recreate later on.

https://gerrit.wikimedia.org/r/647377

Change 647386 abandoned by Mforns:
[operations/mediawiki-config@master] Migrate ServerSideAccountCreation schema from EventLogging to EventGate

Reason:
Given that this schema depends on the PHP client, and its development will be paused for a bit, I will abandon to avoid confusion and open patches. Let's recreate later on.

https://gerrit.wikimedia.org/r/647386

Change 647382 abandoned by Mforns:
[operations/mediawiki-config@master] Migrate HelpPanel schema from EventLogging to EventGate

Reason:
Abandoning to merge this and other two patches into the same change.

https://gerrit.wikimedia.org/r/647382

Change 647383 abandoned by Mforns:
[operations/mediawiki-config@master] Migrate HomepageModule schema from EventLogging to EventGate

Reason:
Abandoning to merge this and other two patches into the same change.

https://gerrit.wikimedia.org/r/647383

Change 647384 abandoned by Mforns:
[operations/mediawiki-config@master] Migrate HomepageVisit schema from EventLogging to EventGate

Reason:
Abandoning to merge this and other two patches into the same change.

https://gerrit.wikimedia.org/r/647384

Change 647782 had a related patch set uploaded (by Mforns; owner: Mforns):
[operations/mediawiki-config@master] Migrate Growth schemas from EventLogging to EventGate on testwiki

https://gerrit.wikimedia.org/r/647782

Change 647782 merged by jenkins-bot:
[operations/mediawiki-config@master] Migrate Growth schemas from EventLogging to EventGate on testwiki

https://gerrit.wikimedia.org/r/647782

Mentioned in SAL (#wikimedia-operations) [2020-12-10T20:54:35Z] <otto@deploy1001> Synchronized wmf-config/InitialiseSettings.php: Migrate Growth EventLogging schemas to Event Platform on testwiki - T267333 (duration: 01m 03s)

Change 647811 had a related patch set uploaded (by Mforns; owner: Mforns):
[operations/mediawiki-config@master] Migrate EventLogging Growth schemas to EventGate on all wikis

https://gerrit.wikimedia.org/r/647811

Change 639539 abandoned by Mforns:
[operations/mediawiki-config@master] Migrate EventLogging NewcomerTask to EventGate on all wikis

Reason:
Abandoning in favor of https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/647811

https://gerrit.wikimedia.org/r/639539

Change 647811 merged by Ottomata:
[operations/mediawiki-config@master] Migrate EventLogging Growth schemas to EventGate on all wikis

https://gerrit.wikimedia.org/r/647811

Mentioned in SAL (#wikimedia-operations) [2020-12-10T23:01:25Z] <otto@deploy1001> Synchronized wmf-config/InitialiseSettings.php: Migrate Growth EventLogging schemas to Event Platform on all wikis - T267333 (duration: 01m 09s)

Change 647817 had a related patch set uploaded (by Mforns; owner: Mforns):
[operations/puppet@production] Refine Growth schemas using eventlogging_legacy job

https://gerrit.wikimedia.org/r/647817

Change 647817 merged by Ottomata:
[operations/puppet@production] Refine Growth schemas using eventlogging_legacy job

https://gerrit.wikimedia.org/r/647817

Mentioned in SAL (#wikimedia-analytics) [2020-12-11T19:30:27Z] <ottomata> now ingesting Growth EventLogging schemas using event platform refine job; they are exclude-listed from eventlogging-processor. - T267333

Change 648334 had a related patch set uploaded (by Mforns; owner: Mforns):
[operations/puppet@production] Do not refine HomepageVisit using eventlogging_legacy job

https://gerrit.wikimedia.org/r/648334

Change 648334 merged by Ottomata:
[operations/puppet@production] Do not refine HomepageVisit using eventlogging_legacy job

https://gerrit.wikimedia.org/r/648334

Change 648336 had a related patch set uploaded (by Mforns; owner: Mforns):
[operations/mediawiki-config@master] Remove HomepageVisit from wgEventLoggingSchemas

https://gerrit.wikimedia.org/r/648336

Change 648336 merged by Ottomata:
[operations/mediawiki-config@master] Remove HomepageVisit from wgEventLoggingSchemas

https://gerrit.wikimedia.org/r/648336

Mentioned in SAL (#wikimedia-operations) [2020-12-11T20:11:05Z] <otto@deploy1001> Synchronized wmf-config/InitialiseSettings.php: Un-migrtate Growth EventLogging schema HomepageVisit back to EventLogging-backend on all wikis (this is a server side event which is not yet ready to migrate) - T267333 (duration: 00m 58s)

Change 648337 had a related patch set uploaded (by Mforns; owner: Mforns):
[mediawiki/extensions/GrowthExperiments@master] Set EventLogging schemas to Event Platform schema URI

https://gerrit.wikimedia.org/r/648337

Hey all!

The following schemas have been migrated successfully.

  • NewcomerTask
  • HomepageModule
  • HelpPanel

The other schemas have not been migrated, because they depend on server-side client changes that are paused. We'll work on these whenever possible.

  • HomepageVisit
  • ServerSideAccountCreation
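As a rough illustration of where this leaves the task, the Python sketch below encodes the status of each schema as reported in this thread and applies the rule stated earlier: the legacy backend can only be turned off once every schema is either disabled or moved to EventGate. The `Status` enum and the `blockers` helper are hypothetical names made up for this example, and the sketch covers only the Growth schemas discussed here.

```python
from enum import Enum


class Status(Enum):
    MIGRATED = "migrated"   # already flowing through EventGate
    PENDING = "pending"     # still on the legacy EventLogging backend
    DISABLED = "disabled"   # inactive, produces no events


# Status of each schema as reported in this task at this point.
SCHEMAS = {
    "NewcomerTask": Status.MIGRATED,
    "HomepageModule": Status.MIGRATED,
    "HelpPanel": Status.MIGRATED,
    "HomepageVisit": Status.PENDING,
    "ServerSideAccountCreation": Status.PENDING,
    "EditorJourney": Status.DISABLED,
}


def blockers(schemas: dict) -> list:
    """Return the sorted names of schemas still on the legacy backend."""
    return sorted(name for name, status in schemas.items()
                  if status is Status.PENDING)


if __name__ == "__main__":
    print(blockers(SCHEMAS))
```

With the statuses above, the two server-side schemas are the only remaining blockers among the Growth schemas.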

The following schemas have been migrated successfully. [...]

Thank you!

The other schemas have not been migrated, because they depend on server-side client changes that are paused. We'll work on these whenever possible.

Is there a task or more information about the pause on the server-side client changes?

mforns moved this task from In Progress to Paused on the Analytics-Kanban board. · Tue, Jan 5, 3:33 PM

Hey all!
Given that T253121: MEP Client MediaWiki PHP is now resolved, I will go ahead and finish the migration of the 2 remaining schemas that depended on the last changes of the PHP client.
Will comment here once the migration is done.
Cheers!

Change 655715 had a related patch set uploaded (by Mforns; owner: Mforns):
[schemas/event/secondary@master] Modify analytics/legacy/homepagevisit to account for changes on metawiki

https://gerrit.wikimedia.org/r/655715

Change 655720 had a related patch set uploaded (by Mforns; owner: Mforns):
[schemas/event/secondary@master] Add ServerSideAccountCreation to analytics/legacy

https://gerrit.wikimedia.org/r/655720

Change 655720 merged by Mforns:
[schemas/event/secondary@master] Add ServerSideAccountCreation to analytics/legacy

https://gerrit.wikimedia.org/r/655720

Change 655715 merged by Mforns:
[schemas/event/secondary@master] Modify analytics/legacy/homepagevisit to account for changes on metawiki

https://gerrit.wikimedia.org/r/655715

Change 655723 had a related patch set uploaded (by Mforns; owner: Mforns):
[operations/mediawiki-config@master] Migrate HomepageVisit and ServerSideAccountCreation to Event Platform on testwiki

https://gerrit.wikimedia.org/r/655723

Mentioned in SAL (#wikimedia-operations) [2021-01-12T19:25:03Z] <ottomata> rolling restart of eventgate-analytics-external pods to clear schema caches - T267333

Change 655723 merged by jenkins-bot:
[operations/mediawiki-config@master] Migrate HomepageVisit and ServerSideAccountCreation to Event Platform on testwiki

https://gerrit.wikimedia.org/r/655723

Mentioned in SAL (#wikimedia-operations) [2021-01-12T19:48:59Z] <tgr_> synced Config: [[gerrit:655723|Migrate HomepageVisit and ServerSideAccountCreation to Event Platform on testwiki (T267333)]]

Change 655753 had a related patch set uploaded (by Mforns; owner: Mforns):
[schemas/event/secondary@master] Add maxLength to formatted field in analytics/legacy/serversideaccountcreation

https://gerrit.wikimedia.org/r/655753

Change 655753 abandoned by Mforns:
[schemas/event/secondary@master] Add maxLength to formatted field in analytics/legacy/serversideaccountcreation

Reason:
This idea did not work, trying something else.

https://gerrit.wikimedia.org/r/655753

Change 655756 had a related patch set uploaded (by Mforns; owner: Mforns):
[schemas/event/secondary@master] Add maxLength to formatted field in analytics/legacy/serversideaccountcreation

https://gerrit.wikimedia.org/r/655756

Change 655756 merged by Mforns:
[schemas/event/secondary@master] Add maxLength to formatted field in analytics/legacy/serversideaccountcreation

https://gerrit.wikimedia.org/r/655756

Ottomata added a comment. · Edited · Wed, Jan 13, 7:18 PM

FYI, I don't know if these are new, but there are a fair number of validation errors for HomepageModule:

  '.event.module' should be equal to one of the allowed values, '.event.state' should be equal to one of the allowed values
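
Errors of this shape typically come from an enum constraint in the schema: the event carries a value that is not in the field's list of allowed values. The Python sketch below reproduces the message format with a hand-written check. It is an assumption-laden illustration, not EventGate's validator; the allowed values are placeholders, since the real HomepageModule enums are not reproduced in this thread, and `lookup` and `enum_errors` are hypothetical helpers.

```python
from typing import Any

# Placeholder allowed values; the real enums live in the HomepageModule
# schema and are not listed in this task.
ALLOWED = {
    ".event.module": ("example-module-a", "example-module-b"),
    ".event.state": ("example-state",),
}

_MISSING = object()  # sentinel for "field not present in the event"


def lookup(event: dict, path: str) -> Any:
    """Resolve a dotted path such as '.event.module' in a nested dict."""
    value: Any = event
    for key in path.strip(".").split("."):
        if not isinstance(value, dict) or key not in value:
            return _MISSING
        value = value[key]
    return value


def enum_errors(event: dict) -> list:
    """Return one message per present field whose value is not allowed.

    Absent fields are skipped, since an enum constraint only applies to
    values that are actually present.
    """
    errors = []
    for path, allowed in ALLOWED.items():
        value = lookup(event, path)
        if value is not _MISSING and value not in allowed:
            errors.append(f"'{path}' should be equal to one of the allowed values")
    return errors
```

A common cause of such errors is client code sending a newly introduced value before the schema's enum has been updated, though the thread does not say whether that is what happened here.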
mforns moved this task from Paused to In Progress on the Analytics-Kanban board. · Mon, Jan 18, 3:14 PM
mforns moved this task from In Progress to Paused on the Analytics-Kanban board. · Thu, Jan 21, 5:53 PM