
Develop an ML-based service to detect vandalism on Wikidata
Open, Needs Triage, Public

Description

The Research team, in collaboration with the ML-Platform team, is creating a new service to help Wikidata patrollers detect revisions that are likely to be reverted.

Requirements:

  • The model should be able to run on Lift Wing

Event Timeline

Update

  • We are working on manually evaluating reverts to identify the right data to train the model.

Update

  • Still working on the data evaluation. Currently I'm studying the use of tags and user groups and their relation with reverts.
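To illustrate the kind of analysis described here, the following is a minimal sketch and not the code used for the evaluation. It assumes each revision is a dictionary with hypothetical fields `tags` and `user_groups` (lists of strings) and `reverted` (a boolean); the real data schema may differ.

```python
from collections import defaultdict


def revert_rate_by(revisions, key):
    """Return the share of reverted revisions for each value found under `key`.

    `revisions` is an iterable of dicts; `key` names a list-valued field
    such as "tags" or "user_groups" (hypothetical field names).
    """
    counts = defaultdict(lambda: [0, 0])  # value -> [reverted, total]
    for rev in revisions:
        for value in rev[key]:
            counts[value][1] += 1
            if rev["reverted"]:
                counts[value][0] += 1
    return {value: reverted / total for value, (reverted, total) in counts.items()}


# Toy data, made up purely for illustration.
sample = [
    {"tags": ["mobile edit"], "user_groups": ["*"], "reverted": True},
    {"tags": ["mobile edit"], "user_groups": ["bot"], "reverted": False},
    {"tags": [], "user_groups": ["bot"], "reverted": False},
]
rates_by_tag = revert_rate_by(sample, "tags")
rates_by_group = revert_rate_by(sample, "user_groups")
```

Values with a revert rate far from the overall rate would be candidates for model features, though rare values need a minimum-count threshold before their rates mean much.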

Update

  • Currently I'm working on feature engineering. The current model has around 72% accuracy on balanced data.
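For context on what "accuracy on balanced data" measures, here is a minimal sketch under stated assumptions: the balanced set is built by downsampling the majority class, and each row carries a hypothetical boolean field `reverted`. The task does not say how the data was actually balanced. On a balanced binary set, always predicting one class scores 50%, so that is the baseline to compare against.

```python
import random


def balance_by_downsampling(rows, label_key="reverted", seed=0):
    """Downsample the larger class so both classes have the same size.

    `label_key` is a hypothetical field name holding the boolean label.
    """
    positives = [r for r in rows if r[label_key]]
    negatives = [r for r in rows if not r[label_key]]
    n = min(len(positives), len(negatives))
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    balanced = rng.sample(positives, n) + rng.sample(negatives, n)
    rng.shuffle(balanced)
    return balanced


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy data: 10 reverted and 90 non-reverted revisions.
rows = [{"id": i, "reverted": i < 10} for i in range(100)]
balanced = balance_by_downsampling(rows)
labels = [r["reverted"] for r in balanced]

# Trivial baseline: predict "not reverted" for every revision.
baseline_accuracy = accuracy(labels, [False] * len(labels))
```

Downsampling is only one way to balance a dataset; class weighting or oversampling would give different numbers.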

Update

  • New features have slightly improved the accuracy (now 75%). I'm still working on improving the model.

Update

  • I'm testing a deep learning approach to see if it offers relevant advantages over the current XGBoost model.