
Create stream listener handler for Breaking News PoC
Closed, Resolved | Public | 13 Estimated Story Points

Description

Create a small application that listens to the event stream and queries the Actions API to update the data store and collect the dataset, so that the dataset can be exposed through the API endpoint.

We will be tracking 3 kinds of events:

  1. New article/page being created: for this, we will consume from the mediawiki.page-create stream.
  2. Page namespace being moved: for this, we will consume from the mediawiki.page-move stream. At the moment, the only kind of namespace move that we want to track is a move from any other namespace (that is not 0) to the main namespace (0); see the first example event below. This stream also contains events that are page/article title changes, and for those we will update our table with the new article/page title; see the second example event below.
  3. Revision being created: for this, we will consume from the mediawiki.revision-create stream.
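
The page-move rules above can be sketched in Go as a small classifier. This is only an illustration: the type and function names (PageMove, MoveKind, Classify, ParsePageMove) are hypothetical, only the JSON field names come from the example events in this task, and limiting title changes to pages that stay in the main namespace is an assumption based on the second example event.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PageMove holds the subset of mediawiki.page-move fields needed to
// classify an event. Field names follow the example events in this task.
type PageMove struct {
	Database      string `json:"database"`
	PageID        int64  `json:"page_id"`
	PageTitle     string `json:"page_title"`
	PageNamespace int    `json:"page_namespace"`
	PriorState    struct {
		PageTitle     string `json:"page_title"`
		PageNamespace int    `json:"page_namespace"`
	} `json:"prior_state"`
}

// MoveKind is a hypothetical label for what the handler should do.
type MoveKind int

const (
	MoveIgnored  MoveKind = iota // not relevant to the table
	MoveToMain                   // non-main namespace -> main namespace (0)
	MoveRetitled                 // title change within the main namespace
)

// Classify applies the two rules from the task description.
func Classify(ev PageMove) MoveKind {
	switch {
	case ev.PageNamespace == 0 && ev.PriorState.PageNamespace != 0:
		return MoveToMain
	case ev.PageNamespace == 0 && ev.PriorState.PageNamespace == 0 &&
		ev.PageTitle != ev.PriorState.PageTitle:
		// Assumption: only title changes that stay in namespace 0 matter.
		return MoveRetitled
	default:
		return MoveIgnored
	}
}

// ParsePageMove decodes one raw JSON event.
func ParsePageMove(raw []byte) (PageMove, error) {
	var ev PageMove
	err := json.Unmarshal(raw, &ev)
	return ev, err
}

func main() {
	raw := []byte(`{"database":"enwiki","page_id":39295033,
		"page_title":"Carnarvonia","page_namespace":0,
		"prior_state":{"page_title":"Draft:Move/Carnarvonia","page_namespace":118}}`)
	ev, err := ParsePageMove(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println("moved to main namespace:", Classify(ev) == MoveToMain)
}
```

Keeping the classification separate from the database update makes the rule easy to unit-test against the example events and easy to extend if other kinds of moves need tracking later.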

Acceptance criteria
The application that updates the data store from the event stream is functional and dockerized.

To-Do

  • create a database table with the following fields
    • name of the article (string)
    • project identifier (database name)
    • url of the article (string)
    • editor names (array of strings)
    • number of edits (number)
    • date created (date)
    • date modified (date)
    • templates list (array of strings)
    • date namespace moved (date)
  • connect to the event stream to update this database table in real time
  • dockerize the application

Notes
Please use GORM as the ORM here so that we can replace the database if needed for better performance.

Event with page namespace move from other namespace to main namespace (0):

{
  "$schema": "/mediawiki/page/move/1.0.0",
  "meta": {
    "uri": "https://en.wikipedia.org/wiki/Carnarvonia",
    "request_id": "fd5d173c-42eb-4fa2-96ea-f038595e77c8",
    "id": "e225ef13-8f6f-4bbe-8382-908262d32b61",
    "dt": "2022-11-02T20:44:44Z",
    "domain": "en.wikipedia.org",
    "stream": "mediawiki.page-move",
    "topic": "eqiad.mediawiki.page-move",
    "partition": 0,
    "offset": 12340431
  },
  "database": "enwiki",
  "performer": {
    "user_text": "YorkshireExpat",
    "user_groups": [
      "extendedconfirmed",
      "extendedmover",
      "*",
      "user",
      "autoconfirmed"
    ],
    "user_is_bot": false,
    "user_id": 40577542,
    "user_registration_dt": "2020-11-15T16:33:58Z",
    "user_edit_count": 12752
  },
  "page_id": 39295033,
  "page_title": "Carnarvonia",
  "page_namespace": 0,
  "page_is_redirect": true,
  "rev_id": 1119675437,
  "prior_state": {
    "page_title": "Draft:Move/Carnarvonia",
    "page_namespace": 118,
    "rev_id": 1119675429
  },
  "comment": "[[WP:PMRC#4|Round-robin history swap]] step 3 using [[:en:User:Ahecht/Scripts/pageswap|pageswap]]",
  "parsedcomment": "<a href=\"/wiki/Wikipedia:PMRC#4\" class=\"mw-redirect\" title=\"Wikipedia:PMRC\">Round-robin history swap</a> step 3 using <a href=\"/wiki/User:Ahecht/Scripts/pageswap\" title=\"User:Ahecht/Scripts/pageswap\">pageswap</a>"
}

Event with page/article title change:

{
  "$schema": "/mediawiki/page/move/1.0.0",
  "meta": {
    "uri": "https://li.wiktionary.org/wiki/%E7%A5%9B",
    "request_id": "56b74f2d-f087-4b1a-823d-adc3ee76543c",
    "id": "daedcac0-f5a4-40e2-8531-3332f81f2650",
    "dt": "2022-11-03T13:50:44Z",
    "domain": "li.wiktionary.org",
    "stream": "mediawiki.page-move",
    "topic": "eqiad.mediawiki.page-move",
    "partition": 0,
    "offset": 12344661
  },
  "database": "liwiktionary",
  "performer": {
    "user_text": "Ooswesthoesbes",
    "user_groups": [
      "bureaucrat",
      "interface-admin",
      "sysop",
      "*",
      "user",
      "autoconfirmed"
    ],
    "user_is_bot": false,
    "user_id": 46,
    "user_registration_dt": "2007-08-19T11:30:47Z",
    "user_edit_count": 257245
  },
  "page_id": 95343,
  "page_title": "祛",
  "page_namespace": 0,
  "page_is_redirect": false,
  "rev_id": 743576,
  "prior_state": {
    "page_title": "abercueramus",
    "page_namespace": 0,
    "rev_id": 314956
  }
}

Event Timeline

Protsack.stephan updated the task description.
Daria_Kevana changed the task status from Open to In Progress. Nov 6 2022, 11:18 PM
Daria_Kevana changed the task status from In Progress to Open. Nov 17 2022, 1:41 PM
Daria_Kevana changed the status of subtask T322259: Create an API endpoint for Breaking News PoC from In Progress to Open.