Slack thread: https://wikimedia.slack.com/archives/C05F8ERE2CV/p1769524602443549
Use Case
Query revertrisk-multilingual scores for a month of revisions with a single SQL query instead of many API calls. This is needed for the analysis in T374698.
Currently, the event.mediawiki_page_revert_risk_prediction_change_v1 table contains only RRLA (revertrisk-language-agnostic) predictions, not revertrisk-multilingual predictions.
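For illustration, here is a rough sketch of the kind of query this use case is after, written as a small Python helper that builds the SQL string. The table name comes from this task. Everything else is an assumption to check against the actual table schema before use: the column names (`wiki_id`, `revision.rev_id`, `predicted_classification.model_name`, `predicted_classification.probabilities`), the `year`/`month` partition columns, and the helper name `build_revert_risk_query`. Today a query like this would only return rows for RRLA.

```python
def build_revert_risk_query(
    model_name: str,
    year: int,
    month: int,
    table: str = "event.mediawiki_page_revert_risk_prediction_change_v1",
) -> str:
    """Build a SQL string selecting one month of predictions for one model.

    Column and partition names below are assumptions, not confirmed
    against the real table schema.
    """
    if not 1 <= month <= 12:
        raise ValueError(f"month must be in 1..12, got {month}")
    # Only allow simple model names so the value is safe inside a SQL literal.
    if not model_name.replace("-", "").replace("_", "").isalnum():
        raise ValueError(f"unexpected characters in model name: {model_name!r}")

    return f"""
SELECT
    wiki_id,                                              -- assumed column
    revision.rev_id AS rev_id,                            -- assumed column
    predicted_classification.probabilities AS probabilities  -- assumed column
FROM {table}
WHERE year = {year}                                       -- assumed partition
  AND month = {month}                                     -- assumed partition
  AND predicted_classification.model_name = '{model_name}'
""".strip()


if __name__ == "__main__":
    # Hypothetical usage: one month of multilingual scores.
    print(build_revert_risk_query("revertrisk-multilingual", 2025, 1))
```

The resulting string could be passed to whichever SQL client is used for the analysis (for example Spark or Presto). Filtering on the partition columns keeps the scan limited to the requested month.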
Proposed Solution
Produce Multilingual RevertRisk predictions to the mediawiki.page_revert_risk_prediction_change stream, following the RRLA implementation approach. The Multilingual model is slower and more resource-intensive than RRLA, so model optimization may be needed first.
Related Work
- T326179 - Previous work implementing the RRLA event stream
- T405358 - Add LiftWing streams data to event_sanitized (increase data retention)
- Better access paths for LiftWing data in MediaWiki+