Create a small application that listens to the event stream and queries the Actions API to update the data store, collecting the dataset so that it can be exposed through the API endpoint.
We will be tracking 3 kinds of events:
- New article/page created: for this, we will consume from the mediawiki.page-create stream.
- Page namespace moved: for this, we will consume from the mediawiki.page-move stream. At the moment, the only kind of namespace move that we want to track is a move from any other namespace (not 0) to the main namespace (0). This stream also contains events that are page/article title changes; for those, we will update our table with the new article/page title. Example events of both kinds are included at the end of this task.
- Revision created: for this, we will consume from the mediawiki.revision-create stream.
Acceptance criteria
The application that updates the data store from the event stream is functional and dockerized.
To-Do
- create a database table with the following fields
- name of the article (string)
- project identifier (database name)
- url of the article (string)
- editor names (array of strings)
- number of edits (number)
- date created (date)
- date modified (date)
- templates list (array of strings)
- date namespace moved (date)
- connect to the event stream to update this database table in real time
- dockerize the application
Notes
Please use gorm as an ORM here so that we can replace the database if needed for better performance.
Event with page namespace move from other namespace to main namespace (0):
{ "$schema": "/mediawiki/page/move/1.0.0", "meta": { "uri": "https://en.wikipedia.org/wiki/Carnarvonia", "request_id": "fd5d173c-42eb-4fa2-96ea-f038595e77c8", "id": "e225ef13-8f6f-4bbe-8382-908262d32b61", "dt": "2022-11-02T20:44:44Z", "domain": "en.wikipedia.org", "stream": "mediawiki.page-move", "topic": "eqiad.mediawiki.page-move", "partition": 0, "offset": 12340431 }, "database": "enwiki", "performer": { "user_text": "YorkshireExpat", "user_groups": [ "extendedconfirmed", "extendedmover", "*", "user", "autoconfirmed" ], "user_is_bot": false, "user_id": 40577542, "user_registration_dt": "2020-11-15T16:33:58Z", "user_edit_count": 12752 }, "page_id": 39295033, "page_title": "Carnarvonia", "page_namespace": 0, "page_is_redirect": true, "rev_id": 1119675437, "prior_state": { "page_title": "Draft:Move/Carnarvonia", "page_namespace": 118, "rev_id": 1119675429 }, "comment": "[[WP:PMRC#4|Round-robin history swap]] step 3 using [[:en:User:Ahecht/Scripts/pageswap|pageswap]]", "parsedcomment": "<a href=\"/wiki/Wikipedia:PMRC#4\" class=\"mw-redirect\" title=\"Wikipedia:PMRC\">Round-robin history swap</a> step 3 using <a href=\"/wiki/User:Ahecht/Scripts/pageswap\" title=\"User:Ahecht/Scripts/pageswap\">pageswap</a>" }
Event with page/article title change:
{ "$schema": "/mediawiki/page/move/1.0.0", "meta": { "uri": "https://li.wiktionary.org/wiki/%E7%A5%9B", "request_id": "56b74f2d-f087-4b1a-823d-adc3ee76543c", "id": "daedcac0-f5a4-40e2-8531-3332f81f2650", "dt": "2022-11-03T13:50:44Z", "domain": "li.wiktionary.org", "stream": "mediawiki.page-move", "topic": "eqiad.mediawiki.page-move", "partition": 0, "offset": 12344661 }, "database": "liwiktionary", "performer": { "user_text": "Ooswesthoesbes", "user_groups": [ "bureaucrat", "interface-admin", "sysop", "*", "user", "autoconfirmed" ], "user_is_bot": false, "user_id": 46, "user_registration_dt": "2007-08-19T11:30:47Z", "user_edit_count": 257245 }, "page_id": 95343, "page_title": "祛", "page_namespace": 0, "page_is_redirect": false, "rev_id": 743576, "prior_state": { "page_title": "abercueramus", "page_namespace": 0, "rev_id": 314956 } }