The scope of the project probably deserves a Phabricator tag, but we can start collecting ideas and suggestions from other people and discuss what road to take.
Some useful links:
https://www.mediawiki.org/wiki/ORES/Components
https://www.mediawiki.org/wiki/ORES/Applications
https://toolhub.wikimedia.org/search?q=vandalism
Clients
The clients can be divided into multiple macro areas:
- ChangeProp and revision-score events: when a revision-create event is generated, ChangeProp calls the ORES precache endpoint to get a score for all the models associated with the revid's wiki. For example, if the revision-create event carries rev-id 123456 for enwiki, then ORES is contacted to score that revision for all models compatible with enwiki (the list is part of the ORES configuration). ChangeProp then uses the following code to generate a mediawiki.revision-score event, which is sent to EventGate. We opened T301878 to address this use case in Lift Wing in a more modern way.
- External clients hitting ores.wikimedia.org: mostly bots that want to score rev-ids in batches to help the community fight vandalism. See "Counter Vandalism" in the above /Applications link for an initial list (which may be outdated and inaccurate).
- MediaWiki extension ORES: This frontend displays revscoring data on the Special:Contributions and Special:RecentChanges pages. A FetchScoreJob event is created in response to the RecentChange_save event, which fetches scores from the ORES API and caches them in the local MediaWiki database for efficient access. It looks like the PHP code is highly configurable, but it will need to be adapted to work with the Lift Wing API.
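The ChangeProp flow in the first bullet is essentially a fan-out from one revision-create event to one scoring job per model configured for that wiki. A minimal sketch of that step follows; the `MODELS_BY_WIKI` table and the `scoring_jobs` helper are hypothetical names used only for illustration, and the event is assumed to expose the wiki as `database` and the revision as `rev_id`. It is not the actual ChangeProp or ORES code.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for the per-wiki model list kept in the ORES configuration.
MODELS_BY_WIKI: Dict[str, List[str]] = {
    "enwiki": ["damaging", "goodfaith"],
}


def scoring_jobs(event: dict) -> List[Tuple[str, int, str]]:
    """Return one (wiki, rev_id, model) job per model configured for the event's wiki.

    Assumes the event carries the wiki name in "database" and the revision
    id in "rev_id"; wikis without configured models yield no jobs.
    """
    wiki = event["database"]
    rev_id = event["rev_id"]
    return [(wiki, rev_id, model) for model in MODELS_BY_WIKI.get(wiki, [])]


if __name__ == "__main__":
    # The example from the text: rev-id 123456 on enwiki.
    for job in scoring_jobs({"database": "enwiki", "rev_id": 123456}):
        print(job)
```

Whatever replaces this in Lift Wing (see T301878) would need an equivalent lookup from wiki to models, wherever that list ends up living.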
Migration strategies
The Lift Wing API is still to be decided due to T288789, but we'll likely have two main entry points:
- api.wikimedia.org for the external clients
- an internal discovery endpoint for internal clients.
By "internal clients" we mean ChangeProp, MediaWiki, and researchers, data engineers, etc. who need to contact Lift Wing.
Migrating external users from ores.wikimedia.org to api.wikimedia.org will surely be a problem, since:
- We don't know the exact list of bots/tools and their codebases, together with owners and their point of contact.
- Moving from ores.wikimedia.org to api.wikimedia.org is not only a change of endpoint, but also a change in API calls. Given the point above, this may be difficult since some tools/bots don't have a clear owner.
- Some bots/tools are vital for the community, but their codebase may not be owned by somebody available to change the code.
We have essentially two options:
- Create a thin rewrite/transition layer behind an endpoint like ores-legacy.wikimedia.org, which simply accepts ORES-like API calls (maybe a limited set) and "translates" them to Lift Wing ones.
- Follow up with all bot owners, asking them to migrate their tools to Lift Wing, and keep ORES running for the time being.
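To make the first option more concrete, here is a rough sketch of what the "translation" step of such a layer could look like. It parses an ORES v3 batch URL (wiki in the path, `models` and `revids` as pipe-separated query parameters) and expands it into one request per revision and model. Since the Lift Wing API is still undecided, the target path format `/hypothetical/{wiki}-{model}:predict` and the `{"rev_id": ...}` body are placeholders invented for this example, as is the `translate` function name.

```python
from typing import Dict, List
from urllib.parse import parse_qs, urlsplit


def translate(ores_url: str) -> List[Dict]:
    """Expand an ORES v3 batch scoring URL into per-revision, per-model requests.

    The returned "path" and "body" use a placeholder format, because the real
    Lift Wing API has not been decided yet.
    """
    parts = urlsplit(ores_url)
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) != 3 or segments[:2] != ["v3", "scores"]:
        raise ValueError(f"unsupported ORES path: {parts.path}")
    wiki = segments[2]

    query = parse_qs(parts.query)
    # ORES separates multiple models and revision ids with "|".
    models = [m for m in query.get("models", [""])[0].split("|") if m]
    revids = [r for r in query.get("revids", [""])[0].split("|") if r]
    if not models or not revids:
        raise ValueError("both 'models' and 'revids' are required")

    return [
        {
            "path": f"/hypothetical/{wiki}-{model}:predict",  # placeholder path
            "body": {"rev_id": int(revid)},  # placeholder payload
        }
        for revid in revids
        for model in models
    ]


if __name__ == "__main__":
    url = "https://ores.wikimedia.org/v3/scores/enwiki/?models=damaging|goodfaith&revids=123|456"
    for request in translate(url):
        print(request)
```

A real layer would also have to map the responses back into the ORES response format, which is where most of the compatibility work is likely to be; restricting the supported calls to a limited set, as suggested above, keeps that mapping small.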