We would like to run the same analysis that was run in T392148: Run analysis to retrieve thresholds for high impact wikis to deploy recent changes revert risk language agnostic filters, to extract the threshold corresponding to a false positive rate of less than 15% for English Wikipedia.
As part of this investigation we want to determine whether the current way of calculating thresholds (a Jupyter notebook, or the converted Python script) scales well enough for larger wikis.
If we find that the current approach doesn't scale, we should identify potential solutions that would allow us to run this analysis on all wikis.
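For context, the core of this kind of threshold computation can be sketched as follows. This is a minimal sketch with hypothetical function and variable names; the actual logic in threshold-analysis.py may differ:

```python
import numpy as np

def threshold_for_fpr(negative_scores, max_fpr=0.15):
    """Pick a score threshold for flagging edits as revert-risky.

    negative_scores: revert-risk scores of edits that were NOT reverted
    (the "negatives"). If every edit scoring above the threshold is
    flagged, the false positive rate is the share of negatives above
    it, so the (1 - max_fpr) quantile of the negative scores keeps the
    FPR at or below max_fpr.
    """
    return float(np.quantile(np.asarray(negative_scores), 1.0 - max_fpr))
```

For example, with a target FPR of 15% the threshold lands at the 85th percentile of the non-reverted edits' scores.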
Description
Details
| Subject | Repo | Branch | Lines +/- |
|---|---|---|---|
| ores-extension: add threshold for revertrisk in enwiki | operations/mediawiki-config | master | +3 -0 |
| Status | Subtype | Assigned | Task |
|---|---|---|---|
| Open | None | | T398291 AI/ML Infrastructure Request: Expand ORES-enabled RevertRisk filters deployment to all wikis, excluding Commons and Wikidata |
| Resolved | | gkyziridis | T400590 Investigate revertrisk threshold generation for enwiki |
Event Timeline
Initial attempt to run threshold-analysis.py for enwiki. The query it runs retrieves zero rows for enwiki, even though it is the exact same query that ran in the T392148: Run analysis to retrieve thresholds for high impact wikis to deploy recent changes revert risk language agnostic filters ticket.
Query and Results:
After further experimentation I observed that this query returns zero rows for every wiki, even the ones I ran it for in the initial ticket.
We should find people to review this query and advise us on where the data currently live.
Update
I managed to run the threshold analysis for enwiki successfully.
The obstacle I faced was that I could not run the query over a window larger than one month: English Wikipedia has the highest volume of data, which causes the machine to crash when querying more than one month at a time.
The threshold analysis for the wikis in T392148: Run analysis to retrieve thresholds for high impact wikis to deploy recent changes revert risk language agnostic filters ran on a 12-month window from 2024-01-01 to 2025-01-01.
This experiment instead ran on the latest data in batches (one month per batch), so the threshold analysis ran multiple times (once per batch).
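The batching above can be sketched as a generator of month-long query windows. This is a hypothetical helper; the real script's date slicing may differ:

```python
from datetime import date

def month_windows(start, end):
    """Yield half-open (window_start, window_end) pairs, one calendar
    month at a time, covering [start, end).

    Each window can be queried and fed to the threshold analysis
    independently, keeping memory bounded for high-volume wikis
    like enwiki.
    """
    current = start
    while current < end:
        # First day of the next month.
        if current.month == 12:
            nxt = date(current.year + 1, 1, 1)
        else:
            nxt = date(current.year, current.month + 1, 1)
        yield current, min(nxt, end)
        current = nxt
```

For a 12-month run, `month_windows(date(2024, 1, 1), date(2025, 1, 1))` yields twelve windows, one per month.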
You can find the results (per month) in the following paste:
Since the variance of the threshold distribution (one threshold per month) is quite small, we can reasonably use the average threshold as the final one:
Avg Optimal Threshold for 15% FPR: {'enwiki': 0.83092}
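Combining the per-batch results then reduces to checking the spread and averaging. The monthly values below are hypothetical placeholders for illustration only; the real per-month thresholds are in the paste linked above:

```python
from statistics import mean, pstdev

# Hypothetical per-month thresholds, for illustration only.
monthly_thresholds = [0.82, 0.84, 0.83, 0.83, 0.84, 0.82]

# Only fall back to a single averaged threshold if the months agree
# closely; a large spread would mean the threshold drifts over time.
spread = pstdev(monthly_thresholds)
assert spread < 0.05, "high variance: a single average threshold is unsafe"

avg_threshold = round(mean(monthly_thresholds), 5)
```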
Change #1177446 had a related patch set uploaded (by Ilias Sarantopoulos; author: Ilias Sarantopoulos):
[operations/mediawiki-config@master] ores-extension: add threshold for revertrisk in enwiki
Change #1177446 merged by jenkins-bot:
[operations/mediawiki-config@master] ores-extension: add threshold for revertrisk in enwiki
Mentioned in SAL (#wikimedia-operations) [2025-08-14T07:07:24Z] <gkyziridis@deploy1003> Started scap sync-world: Backport for [[gerrit:1177446|ores-extension: add threshold for revertrisk in enwiki (T400590)]]
Mentioned in SAL (#wikimedia-operations) [2025-08-14T07:09:46Z] <gkyziridis@deploy1003> gkyziridis, isaranto: Backport for [[gerrit:1177446|ores-extension: add threshold for revertrisk in enwiki (T400590)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
Mentioned in SAL (#wikimedia-operations) [2025-08-14T07:19:31Z] <gkyziridis@deploy1003> Finished scap sync-world: Backport for [[gerrit:1177446|ores-extension: add threshold for revertrisk in enwiki (T400590)]] (duration: 12m 07s)