Status update (September 22nd, 2016)

(This post was copied from


This is the 22nd weekly update from the Revision Scoring team that we have sent to this mailing list.

UI work:

  • We configured the default threshold for the ORES review tool on Wikidata to be more strict (higher recall, lower precision)[1]
  • We fixed a display issue on Special:Contributions where the filters would not wrap[2]
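
To make the recall/precision trade-off behind the threshold change concrete, here is a minimal sketch in plain Python. The scores, labels, and threshold values are made up for illustration, and `precision_recall` is a hypothetical helper, not part of ORES or the review tool. It only shows that flagging edits at a lower score cutoff catches more damaging edits (higher recall) at the cost of more false alarms (lower precision).

```python
def precision_recall(scores, labels, threshold):
    """Flag every edit whose score is >= threshold, then compare with labels."""
    flagged = [s >= threshold for s in scores]
    tp = sum(1 for f, y in zip(flagged, labels) if f and y)
    fp = sum(1 for f, y in zip(flagged, labels) if f and not y)
    fn = sum(1 for f, y in zip(flagged, labels) if not f and y)
    # Fall back to 1.0 when nothing is flagged / nothing is damaging.
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall


# Made-up "damaging" probabilities and ground-truth labels (True = damaging).
scores = [0.95, 0.80, 0.60, 0.40, 0.30, 0.10]
labels = [True, True, False, True, False, False]

# A high cutoff flags few edits; a low cutoff flags many.
high_precision = precision_recall(scores, labels, threshold=0.7)
high_recall = precision_recall(scores, labels, threshold=0.35)

print("cutoff 0.70 -> precision, recall:", high_precision)
print("cutoff 0.35 -> precision, recall:", high_recall)
```

With this toy data, the lower cutoff catches every damaging edit but also flags one good edit, which is the direction the Wikidata default was moved in.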

Increasing model fitness:

  • We finished demonstrating model fitness gains using hash-vector features[3]. Next, we'll be working to get the hash-vector features implemented in revscoring/ORES[4].
  • We implemented a new strategy for training and testing on all data using cross-validation[5]. This will both increase the fitness of the models and make the statistics reported more robust.
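
The two ideas above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the revscoring implementation: `hash_features` and `k_fold_splits` are hypothetical helper names, MD5 stands in for whatever hash function a real vectorizer uses, and the bucket count and fold count are arbitrary. The first function maps tokens into a fixed-length count vector (the "hashing trick"); the second yields train/test index splits so that every observation is used for testing exactly once.

```python
import hashlib


def hash_features(tokens, n_buckets=16):
    """Map tokens to a fixed-length count vector via hashing (hypothetical helper)."""
    vector = [0] * n_buckets
    for token in tokens:
        # MD5 is used only because it is deterministic across runs;
        # it is a stand-in, not the hash a production vectorizer would use.
        digest = hashlib.md5(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % n_buckets
        vector[index] += 1
    return vector


def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices); each sample is tested exactly once."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


vec = hash_features("the quick brown fox the".split())
splits = list(k_fold_splits(10, 5))

print(vec)
for train, test in splits:
    print("train:", train, "test:", test)
```

In a cross-validated setup like this, a model would be trained on each `train` split and scored on the matching `test` split, with the reported statistics aggregated over all folds; the final model can then be trained on all of the data.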

Maintenance and robustness:

  • We fixed an indexing issue in ores_model that prevented the deployment of updated models[6].
  • We briefly investigated a short period of degraded service quality on WMF Labs[7].

  1. Change default threshold for Wikidata to high
  2. Filter on user contribs has nowrap, causing issues
  3. [Spike] Investigate HashingVectorizer
  4. Implement ~100 most important hash vector features in editquality models
  5. Train on all data, Report test statistics on cross-validation
  6. oresm_model index should not be unique
  7. Investigate short period of ores-web-03 insanity

Aaron from the Revision Scoring team

Written by Halfak on Jun 3 2017, 5:03 PM.
Principal Research Scientist