[Shared Event Platform] Produce new mediawiki.page-change stream from MediaWiki EventBus
Closed, Resolved (Public)

Description

We should produce entity-based page state changes (schema design task: T308017) from MediaWiki EventBus, and use them as the input for T311084: [Shared Event Platform] Mediawiki Stream Enrichment should consume the consolidated page-change stream, and for reordering (via T309784). This will greatly simplify the streaming pipelines and be more generally useful than page-create, page-delete, and revision-create. Ideally we could one day even deprecate those streams, but that is a very long-term goal.

Event Timeline

gmodena renamed this task from {Shared Event Platform] Produce new mediawii.page-change stream from MediaWiki EventBus to [Shared Event Platform] Produce new mediawii.page-change stream from MediaWiki EventBus. Jun 29 2022, 8:59 AM
Ottomata renamed this task from [Shared Event Platform] Produce new mediawii.page-change stream from MediaWiki EventBus to [Shared Event Platform] Produce new mediawiki.page-change stream from MediaWiki EventBus. Jun 29 2022, 12:05 PM

Change 821776 had a related patch set uploaded (by Aklapper; author: Ottomata):

[mediawiki/extensions/EventBus@master] WIP - Send page change events

https://gerrit.wikimedia.org/r/821776

Change 849144 had a related patch set uploaded (by Ottomata; author: Ottomata):

[operations/mediawiki-config@master] Declare mediawiki.page_change stream in beta

https://gerrit.wikimedia.org/r/849144

Change 849144 merged by Ottomata:

[operations/mediawiki-config@master] Declare mediawiki.page_change stream in beta

https://gerrit.wikimedia.org/r/849144

Change 851117 had a related patch set uploaded (by Ottomata; author: Ottomata):

[operations/mediawiki-config@master] Declare rc0.mediawiki.page_change stream in beta metawiki

https://gerrit.wikimedia.org/r/851117

Change 851117 merged by jenkins-bot:

[operations/mediawiki-config@master] Declare rc0.mediawiki.page_change stream in beta metawiki

https://gerrit.wikimedia.org/r/851117

Change 851122 had a related patch set uploaded (by Ottomata; author: Ottomata):

[operations/mediawiki-config@master] beta - override mediawiki_page_change on metawiki

https://gerrit.wikimedia.org/r/851122

Change 851122 merged by Ottomata:

[operations/mediawiki-config@master] beta - override mediawiki_page_change on metawiki

https://gerrit.wikimedia.org/r/851122

Change 851127 had a related patch set uploaded (by Ottomata; author: Ottomata):

[operations/mediawiki-config@master] Declare rc0.mediawiki.page_change and enable it only in beta wikipedias

https://gerrit.wikimedia.org/r/851127

Change 851127 merged by jenkins-bot:

[operations/mediawiki-config@master] Declare rc0.mediawiki.page_change and enable it only in beta wikipedias

https://gerrit.wikimedia.org/r/851127

Mentioned in SAL (#wikimedia-operations) [2022-10-31T20:33:46Z] <otto@deploy1002> Synchronized wmf-config/InitialiseSettings.php: No-op sync of InitialiseSettings.php to declare stream rc0.mediawiki.page_change. This stream is disabled everywhere by default, and only enabled in beta for now. - T311129 (duration: 03m 42s)

Alright! We are live in beta, and ready to go in selective wikis in prod as the train rolls out this week!

If you'd like to see this in beta now, the easiest way to do this is to go to https://stream-beta.wmflabs.org/v2/ui/#/?streams=rc0.mediawiki.page_change, click the big green 'Stream' button, and then go create or edit pages on simple.wikipedia.beta.wmflabs.org. E.g. if I edit https://simple.wikipedia.beta.wmflabs.org/wiki/OttoTest0, then I get the following event in the rc0.mediawiki.page_change stream:

changelog_kind: update
page_change_kind: edit
dt: '2022-10-31T20:42:05Z'
wiki_id: simplewiki
page:
  page_id: 322189
  page_title: OttoTest0
  namespace_id: 0
  is_redirect: false
performer:
  user_text: X.X.X.X
  groups:
    - '*'
  is_bot: false
  is_registered: false
  is_system: false
  is_temp: false
revision:
  rev_id: 3270043
  rev_dt: '2022-10-31T20:42:05Z'
  is_minor_edit: false
  rev_sha1: 54r5binzok7ld5xtknl0g29cqre9bmh
  rev_size: 20
  rev_parent_id: 3270042
  comment: ""
  comment_html: ""
  editor:
    user_text: X.X.X.X
    groups:
      - '*'
    is_bot: false
    is_registered: false
    is_system: false
    is_temp: false
  is_content_visible: true
  is_editor_visible: true
  is_comment_visible: true
  content_slots:
    main:
      slot_role: main
      content_model: wikitext
      content_sha1: 54r5binzok7ld5xtknl0g29cqre9bmh
      content_size: 20
      content_format: text/x-wiki
      origin_rev_id: 3270043
prior_state:
  revision:
    rev_id: 3270042
    rev_dt: '2022-10-31T20:41:04Z'
    is_minor_edit: false
    rev_sha1: i6ae2pu5j0q7y0r11egae4wzje65kj1
    rev_size: 15
    rev_parent_id: 3270041
    comment: ""
    comment_html: ""
    editor:
      user_text: X.X.X.X
      groups:
        - '*'
      is_bot: false
      is_registered: false
      is_system: false
      is_temp: false
    is_content_visible: true
    is_editor_visible: true
    is_comment_visible: true
    content_slots:
      main:
        slot_role: main
        content_model: wikitext
        content_sha1: i6ae2pu5j0q7y0r11egae4wzje65kj1
        content_size: 15
        content_format: text/x-wiki
        origin_rev_id: 3270042
$schema: /development/mediawiki/page/change/1.0.0
meta:
  stream: rc0.mediawiki.page_change
  uri: 'https://simple.wikipedia.beta.wmflabs.org/wiki/OttoTest0'
  id: 358716de-d2bf-4ecf-8e9c-f27f666d17f1
  request_id: Y2AzHDwPBJcUQE6H4h15eAAAAFQ
  domain: simple.wikipedia.beta.wmflabs.org
  dt: '2022-10-31T20:42:05Z'
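
For anyone wanting to poke at these events programmatically, here is a minimal Python sketch that summarizes one page_change event by comparing the current revision with prior_state. It assumes the event has already been decoded into a dict. The field names are taken from the example event above; the helper name summarize_page_change and the keys of the returned summary are made up for illustration and are not part of EventBus or the schema.

```python
from typing import Any, Dict, Optional


def summarize_page_change(event: Dict[str, Any]) -> Dict[str, Any]:
    """Summarize a page_change event (hypothetical helper, not an EventBus API)."""
    revision = event.get("revision") or {}
    # Assumption: prior_state may be missing for some kinds of change,
    # so fall back to an empty dict instead of failing.
    prior_revision = (event.get("prior_state") or {}).get("revision") or {}

    # Only compute a size delta when both revisions report a size.
    size_delta: Optional[int] = None
    if "rev_size" in revision and "rev_size" in prior_revision:
        size_delta = revision["rev_size"] - prior_revision["rev_size"]

    return {
        "wiki_id": event.get("wiki_id"),
        "page_title": (event.get("page") or {}).get("page_title"),
        "kind": event.get("page_change_kind"),
        "rev_id": revision.get("rev_id"),
        "prior_rev_id": prior_revision.get("rev_id"),
        "size_delta": size_delta,
    }


# Trimmed copy of the example event above (only the fields used here).
example_event = {
    "changelog_kind": "update",
    "page_change_kind": "edit",
    "wiki_id": "simplewiki",
    "page": {"page_id": 322189, "page_title": "OttoTest0"},
    "revision": {"rev_id": 3270043, "rev_size": 20, "rev_parent_id": 3270042},
    "prior_state": {"revision": {"rev_id": 3270042, "rev_size": 15}},
}

summary = summarize_page_change(example_event)
print(summary)
```

For the example edit, the summary reports a size_delta of 5 (rev_size 20 versus 15 in prior_state), and prior_rev_id matches the rev_parent_id of the new revision.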

Change 851636 had a related patch set uploaded (by Ottomata; author: Ottomata):

[operations/mediawiki-config@master] Enable rc0.mediawiki.page_change stream on testwiki

https://gerrit.wikimedia.org/r/851636

Change 851636 merged by jenkins-bot:

[operations/mediawiki-config@master] Enable rc0.mediawiki.page_change stream on testwiki

https://gerrit.wikimedia.org/r/851636

Change 851637 had a related patch set uploaded (by Ottomata; author: Ottomata):

[operations/mediawiki-config@master] rc0.mediawiki.page_change stream - use eventgate-analytics-external

https://gerrit.wikimedia.org/r/851637

Change 851637 merged by jenkins-bot:

[operations/mediawiki-config@master] rc0.mediawiki.page_change stream - use eventgate-analytics-external

https://gerrit.wikimedia.org/r/851637

Mentioned in SAL (#wikimedia-operations) [2022-11-01T14:14:51Z] <otto@deploy1002> Synchronized wmf-config/InitialiseSettings.php: Use eventgate-analytics-external for rc0.mediawiki.page_change stream - T311129 (duration: 03m 42s)

Change 851705 had a related patch set uploaded (by Ottomata; author: Ottomata):

[operations/mediawiki-config@master] Enable rc0.mediawiki.page_change on group0 wikis

https://gerrit.wikimedia.org/r/851705

Change 851705 merged by jenkins-bot:

[operations/mediawiki-config@master] Enable rc0.mediawiki.page_change on group0 wikis

https://gerrit.wikimedia.org/r/851705

Mentioned in SAL (#wikimedia-operations) [2022-11-01T19:36:48Z] <otto@deploy1002> Synchronized wmf-config/InitialiseSettings.php: Enable rc0.mediawiki.page_change on group0 wikis - T311129 (duration: 03m 38s)

We are live in testwiki!

@Ottomata in which Hive table can I see the page state change data?

event.rc0_mediawiki_page_change

BTW we are live on group0 wikis now.
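
The table name given above looks like the stream name with the dots replaced by underscores, inside the event database. A small Python sketch of that mapping follows. The rule (replace every character that is not a letter, digit, or underscore) is an assumption inferred from this single example, and the function name hive_table_for_stream is hypothetical; check the actual table listing before relying on it.

```python
import re


def hive_table_for_stream(stream: str, database: str = "event") -> str:
    """Guess the Hive table for a stream name (assumed naming rule, see above)."""
    # Assumption: any character that is not alphanumeric or "_" becomes "_".
    table = re.sub(r"[^A-Za-z0-9_]", "_", stream)
    return f"{database}.{table}"


print(hive_table_for_stream("rc0.mediawiki.page_change"))
```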

Change 855146 had a related patch set uploaded (by Ottomata; author: Ottomata):

[schemas/event/primary@master] development/ page change - Remove comment_html fields, bump to 2.0.0

https://gerrit.wikimedia.org/r/855146

We now use page_change as a drop-in replacement for revision-create in our search update pipeline (see T322186). So far it works in local tests with captured prod events. Updates will follow once we run tests on production.

Nice, @pfischer, please keep your eye on T308017: Design Schema for page state and page state with content (enriched) streams, there are some structural changes we may make to the schema (flattening?) in the next RC.

Change 855146 abandoned by Ottomata:

[schemas/event/primary@master] development/ page change - Remove comment_html fields, bump to 2.0.0

Reason:

Done elsewhere.

https://gerrit.wikimedia.org/r/855146