Status update (November 10th, 2016)

(This post was copied from


This is the 29th weekly update from the Revision Scoring team that we have sent
to this mailing list.


  • We deployed logging changes to ORES that will reduce log verbosity[1]
  • We also deployed revscoring 1.3.0 and new models built with it to wmflabs[2]. This won't change anything important from a user's perspective, but it paves the way for developing new modeling strategies.

Maintenance and robustness:

  • We fixed puppet so that log file directories are also created on the celery worker nodes (affects wmflabs)[3]
  • We fixed an issue with our recall_at_fpr metric, which was incorrectly defined, and implemented a recall_at_precision metric to take its place[4]
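
For readers unfamiliar with the idea, a recall-at-precision metric reports the highest recall a classifier can reach while keeping precision at or above a chosen floor. The following is a minimal sketch of that idea in plain Python. It is not the revscoring implementation: the function name, signature, and tie handling are assumptions made for illustration only.

```python
def recall_at_precision(labels, scores, min_precision):
    """Illustrative only (hypothetical helper, not revscoring's API).

    Returns (recall, threshold): the best recall over all score
    thresholds whose precision is >= min_precision, and the threshold
    that achieves it. Returns (0.0, None) if no threshold qualifies.
    """
    total_positives = sum(1 for label in labels if label)
    if total_positives == 0:
        return 0.0, None

    # Rank observations from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    best_recall, best_threshold = 0.0, None
    true_positives = 0
    for i, (score, label) in enumerate(ranked):
        if label:
            true_positives += 1
        # Tied scores fall on the same side of any threshold, so only
        # evaluate once the whole group of ties has been counted.
        if i + 1 < len(ranked) and ranked[i + 1][0] == score:
            continue
        predicted_positives = i + 1
        precision = true_positives / predicted_positives
        recall = true_positives / total_positives
        if precision >= min_precision and recall > best_recall:
            best_recall, best_threshold = recall, score
    return best_recall, best_threshold
```

Calling `recall_at_precision(labels, scores, 0.9)` with a list of true labels and the matching model scores answers the question "how much of the damage can we catch if at least 90% of what we flag must be correct?"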

New development:

  • We've made a lot of progress on modeling sentences and have just started experimenting with a sentence model from featured articles[5]
  • We're reviewing a dataset of spam/vandalism/attack new page creations for public release[6]. This dataset will help our collaborators work with us on modeling the quality of drafts and supporting new page triage.

  1. Deploy logging changes to ORES
  2. Deploy revscoring 1.3.0 and updated editquality and wikiclass to wmflabs
  3. /srv/log/ores/ not created on worker nodes
  4. Implement recall at precision (and fix FPR metrics)
  5. Implement sentences datasources & experiment with normalization
  6. Create manually vetted dataset of spam/vandalism/attack pages

Aaron from the Revision Scoring team

Written by Halfak on Jun 3 2017, 5:48 PM.
Principal Research Scientist