We should produce entity-based page state change events (schema design task: T308017) from MediaWiki EventBus, and use them as the input for T311084: [Shared Event Platform] Mediawiki Stream Enrichment should consume the consolidated page-change stream (with reordering via T309784). This will greatly simplify the streaming pipelines, and will be more generally useful than page-create, page-delete, and revision-create. Ideally we could one day even deprecate those streams, but that is a very long-term goal.
Event Timeline
Dunno why my WIP is not showing up attached to this task. Here it is: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/EventBus/+/821776
Change 821776 had a related patch set uploaded (by Aklapper; author: Ottomata):
[mediawiki/extensions/EventBus@master] WIP - Send page change events
@Ottomata: Because it did not follow https://www.mediawiki.org/wiki/Gerrit/Commit_message_guidelines :)
Change 849144 had a related patch set uploaded (by Ottomata; author: Ottomata):
[operations/mediawiki-config@master] Declare mediawiki.page_change stream in beta
Change 849144 merged by Ottomata:
[operations/mediawiki-config@master] Declare mediawiki.page_change stream in beta
Change 851117 had a related patch set uploaded (by Ottomata; author: Ottomata):
[operations/mediawiki-config@master] Declare rc0.mediawiki.page_change stream in beta metawiki
Change 851117 merged by jenkins-bot:
[operations/mediawiki-config@master] Declare rc0.mediawiki.page_change stream in beta metawiki
Change 851122 had a related patch set uploaded (by Ottomata; author: Ottomata):
[operations/mediawiki-config@master] beta - override mediawiki_page_change on metawiki
Change 851122 merged by Ottomata:
[operations/mediawiki-config@master] beta - override mediawiki_page_change on metawiki
Change 851127 had a related patch set uploaded (by Ottomata; author: Ottomata):
[operations/mediawiki-config@master] Declare rc0.mediawiki.page_change and enable it only in beta wikipedias
Change 851127 merged by jenkins-bot:
[operations/mediawiki-config@master] Declare rc0.mediawiki.page_change and enable it only in beta wikipedias
Mentioned in SAL (#wikimedia-operations) [2022-10-31T20:33:46Z] <otto@deploy1002> Synchronized wmf-config/InitialiseSettings.php: No-op sync of InitialiseSettings.php to declare stream rc0.mediawiki.page_change. This stream is disabled everywhere by default, and only enabled in beta for now. - T311129 (duration: 03m 42s)
Alright! We are live in beta, and ready to go on selected wikis in prod as the train rolls out this week!
If you'd like to see this in beta now, the easiest way to do this is to go to https://stream-beta.wmflabs.org/v2/ui/#/?streams=rc0.mediawiki.page_change, click the big green 'Stream' button, and then go create or edit pages on simple.wikipedia.beta.wmflabs.org. E.g. if I edit https://simple.wikipedia.beta.wmflabs.org/wiki/OttoTest0, then I get the following event in the rc0.mediawiki.page_change stream:
changelog_kind: update
page_change_kind: edit
dt: '2022-10-31T20:42:05Z'
wiki_id: simplewiki
page:
  page_id: 322189
  page_title: OttoTest0
  namespace_id: 0
  is_redirect: false
performer:
  user_text: X.X.X.X
  groups:
    - '*'
  is_bot: false
  is_registered: false
  is_system: false
  is_temp: false
revision:
  rev_id: 3270043
  rev_dt: '2022-10-31T20:42:05Z'
  is_minor_edit: false
  rev_sha1: 54r5binzok7ld5xtknl0g29cqre9bmh
  rev_size: 20
  rev_parent_id: 3270042
  comment: ""
  comment_html: ""
  editor:
    user_text: X.X.X.X
    groups:
      - '*'
    is_bot: false
    is_registered: false
    is_system: false
    is_temp: false
  is_content_visible: true
  is_editor_visible: true
  is_comment_visible: true
  content_slots:
    main:
      slot_role: main
      content_model: wikitext
      content_sha1: 54r5binzok7ld5xtknl0g29cqre9bmh
      content_size: 20
      content_format: text/x-wiki
      origin_rev_id: 3270043
prior_state:
  revision:
    rev_id: 3270042
    rev_dt: '2022-10-31T20:41:04Z'
    is_minor_edit: false
    rev_sha1: i6ae2pu5j0q7y0r11egae4wzje65kj1
    rev_size: 15
    rev_parent_id: 3270041
    comment: ""
    comment_html: ""
    editor:
      user_text: X.X.X.X
      groups:
        - '*'
      is_bot: false
      is_registered: false
      is_system: false
      is_temp: false
    is_content_visible: true
    is_editor_visible: true
    is_comment_visible: true
    content_slots:
      main:
        slot_role: main
        content_model: wikitext
        content_sha1: i6ae2pu5j0q7y0r11egae4wzje65kj1
        content_size: 15
        content_format: text/x-wiki
        origin_rev_id: 3270042
$schema: /development/mediawiki/page/change/1.0.0
meta:
  stream: rc0.mediawiki.page_change
  uri: 'https://simple.wikipedia.beta.wmflabs.org/wiki/OttoTest0'
  id: 358716de-d2bf-4ecf-8e9c-f27f666d17f1
  request_id: Y2AzHDwPBJcUQE6H4h15eAAAAFQ
  domain: simple.wikipedia.beta.wmflabs.org
  dt: '2022-10-31T20:42:05Z'
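For anyone poking at these events, here is a rough Python sketch of how a consumer might use the current revision together with prior_state, e.g. to get the size change of an edit. The field names come from the example event above; the helper names (size_delta, summarize) are made up for illustration, and the handling of events without prior_state is an assumption about the schema (only an edit event is shown here), not something confirmed in this task.

```python
from typing import Any, Optional

# Trimmed copy of the example event above; only the fields used below are kept.
EVENT: dict[str, Any] = {
    "changelog_kind": "update",
    "page_change_kind": "edit",
    "wiki_id": "simplewiki",
    "page": {"page_id": 322189, "page_title": "OttoTest0"},
    "revision": {"rev_id": 3270043, "rev_size": 20},
    "prior_state": {"revision": {"rev_id": 3270042, "rev_size": 15}},
}


def size_delta(event: dict[str, Any]) -> Optional[int]:
    """Return the byte difference between the new and the prior revision.

    Returns None when the event carries no prior revision. That case is an
    assumption (e.g. a newly created page); the example only shows an edit.
    """
    prior = event.get("prior_state", {}).get("revision")
    if prior is None:
        return None
    return event["revision"]["rev_size"] - prior["rev_size"]


def summarize(event: dict[str, Any]) -> str:
    """Build a one-line, human-readable summary of a page_change event."""
    delta = size_delta(event)
    delta_text = "n/a" if delta is None else f"{delta:+d}"
    return (
        f'{event["wiki_id"]} {event["page_change_kind"]} '
        f'{event["page"]["page_title"]} rev {event["revision"]["rev_id"]} '
        f"({delta_text} bytes)"
    )


if __name__ == "__main__":
    print(summarize(EVENT))  # simplewiki edit OttoTest0 rev 3270043 (+5 bytes)
```

The point of carrying prior_state in the event is that a consumer like this does not need a second lookup to compare against the previous revision.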
Change 851636 had a related patch set uploaded (by Ottomata; author: Ottomata):
[operations/mediawiki-config@master] Enable rc0.mediawiki.page_change stream on testwiki
Change 851636 merged by jenkins-bot:
[operations/mediawiki-config@master] Enable rc0.mediawiki.page_change stream on testwiki
Change 851637 had a related patch set uploaded (by Ottomata; author: Ottomata):
[operations/mediawiki-config@master] rc0.mediawiki.page_change stream - use eventgate-analytics-external
Change 851637 merged by jenkins-bot:
[operations/mediawiki-config@master] rc0.mediawiki.page_change stream - use eventgate-analytics-external
Mentioned in SAL (#wikimedia-operations) [2022-11-01T14:14:51Z] <otto@deploy1002> Synchronized wmf-config/InitialiseSettings.php: Use eventgate-analytics-external for rc0.mediawiki.page_change stream - T311129 (duration: 03m 42s)
Change 851705 had a related patch set uploaded (by Ottomata; author: Ottomata):
[operations/mediawiki-config@master] Enable rc0.mediawiki.page_change on group0 wikis
Change 851705 merged by jenkins-bot:
[operations/mediawiki-config@master] Enable rc0.mediawiki.page_change on group0 wikis
Mentioned in SAL (#wikimedia-operations) [2022-11-01T19:36:48Z] <otto@deploy1002> Synchronized wmf-config/InitialiseSettings.php: Enable rc0.mediawiki.page_change on group0 wikis - T311129 (duration: 03m 38s)
Change 855146 had a related patch set uploaded (by Ottomata; author: Ottomata):
[schemas/event/primary@master] development/ page change - Remove comment_html fields, bump to 2.0.0
We now use page_change as a drop-in replacement for revision-create in our search update pipeline (see T322186). So far (local tests with captured prod events) it works. Updates will follow once we run tests on production.
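To illustrate what "drop-in" could look like for a consumer written against revision-create, here is a hedged Python sketch that reshapes a page_change event into a flat, revision-create-style dict. This is not the search update pipeline's code. The source field names are taken from the rc0 example event in this task; the target field names (database, page_namespace, rev_timestamp, rev_len, and so on) are written from memory of the revision-create schema and should be treated as assumptions to check against the schema repository. The function name is hypothetical.

```python
from typing import Any

# Trimmed rc0 page_change event, using values from the example in this task.
PAGE_CHANGE: dict[str, Any] = {
    "page_change_kind": "edit",
    "wiki_id": "simplewiki",
    "page": {
        "page_id": 322189,
        "page_title": "OttoTest0",
        "namespace_id": 0,
        "is_redirect": False,
    },
    "revision": {
        "rev_id": 3270043,
        "rev_dt": "2022-10-31T20:42:05Z",
        "is_minor_edit": False,
        "rev_sha1": "54r5binzok7ld5xtknl0g29cqre9bmh",
        "rev_size": 20,
        "rev_parent_id": 3270042,
        "content_slots": {
            "main": {"content_model": "wikitext", "content_format": "text/x-wiki"}
        },
    },
}


def to_revision_create_like(event: dict[str, Any]) -> dict[str, Any]:
    """Flatten a page_change event into a revision-create-style dict.

    Target keys are assumed revision-create field names, not verified here.
    """
    page = event["page"]
    rev = event["revision"]
    main_slot = rev["content_slots"]["main"]
    return {
        "database": event["wiki_id"],
        "page_id": page["page_id"],
        "page_title": page["page_title"],
        "page_namespace": page["namespace_id"],
        "page_is_redirect": page["is_redirect"],
        "rev_id": rev["rev_id"],
        "rev_parent_id": rev["rev_parent_id"],
        "rev_timestamp": rev["rev_dt"],
        "rev_sha1": rev["rev_sha1"],
        "rev_len": rev["rev_size"],
        "rev_minor_edit": rev["is_minor_edit"],
        "rev_content_model": main_slot["content_model"],
        "rev_content_format": main_slot["content_format"],
    }


if __name__ == "__main__":
    for key, value in to_revision_create_like(PAGE_CHANGE).items():
        print(f"{key}: {value}")
```

A thin adapter like this keeps existing downstream logic untouched while the schema is still an RC; if the structure gets flattened in a later RC, only the adapter needs to change.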
Nice, @pfischer, please keep your eye on T308017: Design Schema for page state and page state with content (enriched) streams, there are some structural changes we may make to the schema (flattening?) in the next RC.
Change 855146 abandoned by Ottomata:
[schemas/event/primary@master] development/ page change - Remove comment_html fields, bump to 2.0.0
Reason:
Done elsewhere.