Status update (September 6th, 2016)
September 6th, 2016

(This post was copied from https://lists.wikimedia.org/pipermail/ai/2016-September/000087.html)


This is the 20th weekly update from the Revision Scoring team that we have sent
to this mailing list.

New development:

  • We implemented the basic functionality for handling bag of words and other types of abstract feature vectors in revscoring. [1] This required some changes to some dependencies as well. [2]
  • We extended the user-group related features to include more of the dominant groups outside of English Wikipedia [3] and incremented the models that changed substantially. [4]
  • We extended the documentation at mw:Extension:ORES to make it easier for new developers to work with us. [5]
  • We discussed the team's resourcing needs (hardware, engineering, and community liaison support) with Wes Moran. [6]
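
To illustrate the bag-of-words item above, here is a minimal sketch of turning text into a sparse feature vector. It is not revscoring's actual API: the function names `bag_of_words` and `to_sparse_vector` and the toy vocabulary are hypothetical, and tokenization is simplified to whitespace splitting.

```python
from collections import Counter


def bag_of_words(text):
    """Count lowercase, whitespace-separated tokens (simplified tokenizer)."""
    return Counter(text.lower().split())


def to_sparse_vector(bag, vocabulary):
    """Map token counts to {column_index: count}, dropping unknown tokens.

    Only non-zero columns are stored, which is what makes the vector sparse.
    """
    return {vocabulary[token]: count
            for token, count in bag.items()
            if token in vocabulary}


# Hypothetical vocabulary mapping each known token to a column index.
vocabulary = {"the": 0, "quick": 1, "fox": 2}
vector = to_sparse_vector(bag_of_words("The quick fox saw the dog"), vocabulary)
```

Storing only the non-zero columns keeps the vector small even when the vocabulary is very large, which is the usual motivation for a sparse representation.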

Maintenance and robustness:

  • We addressed a variety of issues around caching and how the ORES extension loads new data.
  • ORES now returns headers that will disable secondary caching. [7]
  • Our maintenance scripts will circumvent caches that do not listen to no-cache headers. [8, 9]
  • We fixed an issue where the ORES review tool would duplicate items in Special:RecentChanges. [10]
  • We standardized the extraction pattern for the enwiktionary model so that it looks similar to other models. [11]

  1. https://phabricator.wikimedia.org/T132580 -- Implement abstraction for Sparse Feature Vectors
  2. https://phabricator.wikimedia.org/T144430 -- Update yamlconf so that import_path can handle deep attributes
  3. https://phabricator.wikimedia.org/T143909 -- Extend user group features
  4. https://phabricator.wikimedia.org/T144855 -- Increment ruwiki editquality models
  5. https://phabricator.wikimedia.org/T144676 -- Improve technical documentation in Extension:ORES in mediawiki.ore
  6. https://phabricator.wikimedia.org/T144517 -- ORES and Product: resourcing discussion
  7. https://phabricator.wikimedia.org/T144193 -- Set max-age header to 0 seconds for ORES to quiet secondary caches
  8. https://phabricator.wikimedia.org/T144196 -- Get model version needs to invalidate cache
  9. https://phabricator.wikimedia.org/T144195 -- Check model version replaces every time it runs.
  10. https://phabricator.wikimedia.org/T144233 -- Redundant results in ORES review tool
  11. https://phabricator.wikimedia.org/T144605 -- Fix makefile entry for enwiktionary.rev_reverted.20k_2016.tsv
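
The two caching items under "Maintenance and robustness" can be sketched roughly as follows. This is an illustration only, not the code from the tasks above: `no_cache_headers` and `cache_busting_url` are hypothetical helpers, the `_` query parameter is an assumed convention, and adding a unique query parameter is just one common way to get past a cache that ignores no-cache headers.

```python
import time
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def no_cache_headers():
    """Response headers telling secondary caches not to reuse a response."""
    return {"Cache-Control": "max-age=0"}


def cache_busting_url(url, token=None):
    """Append a unique query parameter so a cache sees a new URL each time.

    `token` defaults to the current time in nanoseconds; the parameter name
    `_` is an assumption made for this sketch.
    """
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("_", str(token if token is not None else time.time_ns())))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

A maintenance script could call `cache_busting_url` on each request it makes, so that even a cache that ignores `max-age=0` has no stored entry for the exact URL.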

Aaron from the Revision Scoring team

Written by Halfak on Jun 3 2017, 4:59 PM.
Principal Research Scientist
