Article topics are predicted after each page edit and stored in the search index to support the articletopic: search keyword. Predictions currently come from the revscoring model, but could eventually come from the outlink model. The process for getting the article topics into the search index is described here. However, there is currently no mechanism to retrieve which topics have been predicted for a given article.
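To illustrate the gap, here is a minimal sketch of the direction that works today: finding articles for a topic through the standard MediaWiki search API (`action=query&list=search`) with the articletopic: keyword. The helper name `build_articletopic_search_url`, the host, and the topic value are illustrative assumptions. The sketch only builds the request URL and does not send it.

```python
from urllib.parse import urlencode


def build_articletopic_search_url(wiki_host: str, topic: str, limit: int = 10) -> str:
    """Build a search API URL that lists articles predicted to match a topic.

    Hypothetical helper for illustration: it goes from topic to articles,
    which is the only direction the search keyword supports.
    """
    params = {
        "action": "query",
        "list": "search",
        "srsearch": f"articletopic:{topic}",
        "srlimit": limit,
        "format": "json",
    }
    return f"https://{wiki_host}/w/api.php?{urlencode(params)}"


# Example values are assumptions, not taken from the task description.
url = build_articletopic_search_url("en.wikipedia.org", "biography")
```

The reverse lookup (article to topics) has no equivalent request, which is what this task is about.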
In the context of content translation recommendations, and T369268: Custom translation suggestions: Multiple selection specifically, the Language and Product Localization team would like to be able to get the topics for a small number of articles that are part of community-defined page collections.
In the past, bringing ML prediction results (mostly scores from the ORES damaging and goodfaith models) into MediaWiki was handled by the ORES MW extension for various purposes (RC, WL, Contribs, API, etc.). Unlike the new article topic models, the damaging and goodfaith models required extensive language-specific training and, as a result, the extension is only deployed to a handful of wikis.
So what are the options for querying the topics of an article? Here are some:
- Get the topics from the doc in elastic and expose them in a new MW API, or by extending an existing MW API such as query/info or query/revisions
- Add caching to the LW API and query it directly
- tbd
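
As a rough illustration of the second option, the sketch below puts a small time-based cache in front of a topic lookup, keyed by wiki and revision ID. Everything here is hypothetical: `CachedTopicClient`, the `fetch` callable, and the TTL value are assumptions made for illustration. A real deployment would more likely cache on the service side, and no actual LW endpoint is called.

```python
import time
from typing import Callable, Dict, List, Tuple

# Hypothetical signature: (wiki, rev_id) -> list of predicted topic names.
TopicFetcher = Callable[[str, int], List[str]]


class CachedTopicClient:
    """Wrap a topic fetcher with an in-memory TTL cache (illustrative only)."""

    def __init__(
        self,
        fetch: TopicFetcher,
        ttl_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[Tuple[str, int], Tuple[float, List[str]]] = {}

    def get_topics(self, wiki: str, rev_id: int) -> List[str]:
        key = (wiki, rev_id)
        now = self._clock()
        entry = self._cache.get(key)
        # Serve from cache while the entry is younger than the TTL.
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        topics = self._fetch(wiki, rev_id)
        self._cache[key] = (now, topics)
        return topics


# Usage with a stand-in fetcher that records how often it is called.
calls: List[Tuple[str, int]] = []


def fake_fetch(wiki: str, rev_id: int) -> List[str]:
    calls.append((wiki, rev_id))
    return ["example-topic"]


client = CachedTopicClient(fake_fetch, ttl_seconds=60.0)
first = client.get_topics("enwiki", 12345)
second = client.get_topics("enwiki", 12345)
```

Keying on the revision ID means a new edit naturally misses the cache, so stale predictions are only served for unchanged revisions.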