Develop an ML-based service to detect vandalism on Wikidata
Closed, Resolved, Public

Description

The Research team, in collaboration with the ML-Platform team, is creating a new service to help Wikidata patrollers detect revisions that are likely to be reverted.

Requirements:

  • The model should be able to run on Lift Wing
  • The model should improve on the performance of existing models (see baselines comparison)

Event Timeline

Updates

  • We are working on manually evaluating reverts to identify the right data to train the model.

Update

  • Still working on the data evaluation. I'm currently studying the use of tags and user groups and their relation to reverts.

Update

  • I'm currently working on feature engineering. The current model has around 72% accuracy on balanced data.

Update

  • The new features have slightly improved the accuracy (now 75%). I'm still working on improving the model.

Update

  • I'm testing a deep learning approach to see if it offers relevant advantages over the current XGBoost model.

@Trokhymovych has addressed the comments and submitted the merge request. The model binary can be found here.
I'm going to coordinate with the research engineers to decide on next steps.

@Trokhymovych, please post the models' performance results here.

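Model Performance on Historical Holdout Testset*

| Model | AUC | PR@R0.99 | PR@R0.90 | PR@R0.50 |
| ----- | ----- | ----- | ----- | ----- |
| Rule-based | 0.767755 | 0.075708 | 0.075708 | 0.482298 |
| ORES | 0.867058 | 0.083206 | 0.128595 | 0.567549 |
| Graph2vec model | 0.922219 | 0.102188 | 0.225654 | 0.759679 |

Model Performance on Human Labeled Testset**

| Model | AUC | PR@R0.99 | PR@R0.90 | PR@R0.50 |
| ----- | ----- | ----- | ----- | ----- |
| Rule-based | 0.894360 | 0.737920 | 0.957547 | 0.957547 |
| ORES | 0.926412 | 0.802880 | 0.949649 | 0.963675 |
| Graph2vec model | 0.935937 | 0.838346 | 0.959763 | 0.967811 |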

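where PR@R is the precision at the given recall level; for example, PR@R0.90 is the precision at the decision threshold that reaches 90% recall.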
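For readers unfamiliar with the metric, here is a minimal sketch of how a precision-at-recall value can be computed from labels and model scores. It is not the evaluation code used for this task. The function name `precision_at_recall` and the toy data are made up for illustration, and the convention used (best precision among all thresholds whose recall is at least the target) is an assumption; the numbers in the tables may have been computed with a slightly different convention.

```python
def precision_at_recall(labels, scores, target_recall):
    """Best precision among score thresholds whose recall >= target_recall.

    labels: iterable of 0/1 (1 = reverted), scores: model scores where
    higher means "more likely to be reverted". Hypothetical helper.
    """
    labels = list(labels)
    scores = list(scores)
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")

    # Rank revisions from highest to lowest score, then sweep the threshold.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    best = 0.0
    tp = fp = 0
    i, n = 0, len(ranked)
    while i < n:
        # Consume every revision tied at this score so that the threshold
        # "score >= current" is well defined.
        current = ranked[i][0]
        while i < n and ranked[i][0] == current:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        recall = tp / total_pos
        precision = tp / (tp + fp)
        if recall >= target_recall:
            best = max(best, precision)
    return best


# Toy data, for illustration only (not taken from the Wikidata testsets).
toy_labels = [1, 1, 0, 1, 0, 0, 1, 0]
toy_scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.1]
for r in (0.99, 0.90, 0.50):
    print(f"PR@R{r:.2f} = {precision_at_recall(toy_labels, toy_scores, r):.4f}")
```

Higher recall targets force a lower threshold, which lets in more false positives; that is why PR@R0.99 is the lowest column in both tables.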

*Holdout dataset details: 127,489 revisions between 2023-05-01 and 2023-08-01. Only revisions with ORES predictions are included; self-reverts are filtered out. The revert rate is ~7.5%, and the anonymous rate is ~9.2%.

**Labeled testset details: 1,221 revisions between 2022-04-26 and 2023-07-31. Only revisions with ORES predictions are included; self-reverts are filtered out. The revert rate is ~73.8%, and the anonymous rate is ~69.5%.
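Note that the two testsets have very different revert rates (~7.5% vs. ~73.8%), so precision values are not directly comparable across the two tables. The sketch below illustrates why, using the standard identity precision = TPR·p / (TPR·p + FPR·(1 − p)), where p is the revert rate. The function name and the example TPR/FPR values are hypothetical and are not measurements from any of the models above.

```python
def precision_from_rates(tpr, fpr, prevalence):
    """Precision implied by a true-positive rate, a false-positive rate,
    and the share of positives (here: the revert rate). Hypothetical helper."""
    tp = tpr * prevalence           # expected fraction of true positives
    fp = fpr * (1.0 - prevalence)   # expected fraction of false positives
    if tp + fp == 0:
        raise ValueError("classifier flags nothing; precision is undefined")
    return tp / (tp + fp)


HOLDOUT_REVERT_RATE = 0.075   # from the holdout dataset details
LABELED_REVERT_RATE = 0.738   # from the labeled testset details

# A threshold that flags every revision (TPR = FPR = 1) has precision
# equal to the revert rate of the dataset.
print(precision_from_rates(1.0, 1.0, HOLDOUT_REVERT_RATE))
print(precision_from_rates(1.0, 1.0, LABELED_REVERT_RATE))

# The same made-up operating point (TPR = 0.9, FPR = 0.05) yields very
# different precision on the two revert rates.
print(precision_from_rates(0.9, 0.05, HOLDOUT_REVERT_RATE))
print(precision_from_rates(0.9, 0.05, LABELED_REVERT_RATE))
```

This is consistent with the rule-based PR@R0.99 values (0.0757 and 0.7379), which sit close to the respective revert rates, as one would expect when the threshold has to flag nearly everything to reach 99% recall.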