Status update (October 6, 2017)

New language support for Bengali, Greek, and Tamil. New advanced edit quality support for Albanian and Romanian. We cleaned up the old 'reverted' models where better support is available. We're working on moving to a new dedicated cluster. We improved some models by exploring new sources of signal and cleaning datasets. We started work on JADE and presented on The Keilana Effect at Wikimania.

See more details below.

New language support

We deployed basic edit quality support for Bengali, Greek, and Tamil, and advanced edit quality support for Albanian and Romanian. We made progress towards new models for Latvian, Croatian, Bosnian, and Spanish, but these aren't deployed yet.

T166049: Deploy reverted model for elwiki
T156357: Deploy edit quality campaign for Romanian Wikipedia
T163009: Train/test damaging & goodfaith models for Albanian Wikipedia
T162031: Add language support for Latvian (lv)
T166048: Deploy reverted model for tawiki
T170490: Train reverted model for Bengali Wikipedia
T170491: Train reverted model for Greek Wikipedia
T174572: Reverted model for hrwiki
T173087: Add language support for Bosnian
T175628: Add LV dictionary to install.
T172046: Add language support for Croatian (
T131963: Complete eswiki edit quality campaign
T174687: Add language support for Serbian

See this full table for reference.

Moving to the new, dedicated cluster

Until now, we've been running ORES on a shared Services cluster. We're happy to announce that the ORES API will be served from a dedicated cluster, probably in a matter of weeks. Stress tests showed some issues that we're still resolving.
T117560: New Service Request: ORES
T169246: Stress/capacity test new ores* cluster

Cleaning up Wikilabels data

@Natalia found some systematic errors in our training data, and corrected several. We also improved the structure of the labeling form to make it more difficult to make cognitive mistakes while labeling.
T171491: Unlabeled goodfaith observations are assumed "false" -- should be "true"
T171497: Review training set to check strange examples of labels
T171493: Change "yes/no" in damaging_goodfaith form to "damaging/good" and "good-faith/bad-faith"

Maintenance and documentation

We've been working with Releng on git-lfs (Large File Storage) so that our repositories won't be so big but we'll still be able to maintain historical model versions.
T171619: ORES should use a git large file plugin for storing serialized binaries

We were able to begin work with @srodlund to improve our technical and user documentation.


Remove "reverted" model where advanced editquality models are available

This was a noteworthy cleanup: on any wiki where the "damaging" and "goodfaith" models are available, these should be used instead of the "reverted" model. To that end, we're removing the reverted model from these wikis. We held an RfC and no concerns were raised.
T171059: [RfC] Should we remove all reverted models when there is a damaging one?
T172370: Remove reverted models from editquality repo

More, better signal

We experimented with adding Flagged Revs data to our training set.
T166235: Flagged revs approve model to fiwiki

@Sumit ran several experiments to see if word sentiment analysis could improve the fitness of our classifiers. The benefits were marginal but positive, so we implemented the strategy.
T167305: Experiment with Sentiment score feature for draftquality
T170177: Test draftquality sentiment feature on Editquality

@Natalia ran some experiments with including image removals in the edit quality models, but that didn't seem to affect performance.
T172049: [Investigate] Get signal from adding/removing images

@Nettrom cleaned up the article quality data for English Wikipedia, which allowed us to boost fitness in strange cases (e.g. redirect pages).
T170434: Improve cleaning of article quality assessment datasets

@Ladsgroup added strategies for scanning labels and descriptions for badwords.
T162617: Use 'informals', 'badwords', etc. in Wikidata feature set
T170834: Add basic bad word check to Wikidata feature set
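
As a rough illustration of what a basic bad word check over labels and descriptions might look like, here is a minimal sketch. It is not the actual feature code from the Wikidata feature set: the word list, the function names, and the feature names are all hypothetical placeholders, and a real word list would be language-specific and much longer.

```python
import re

# Hypothetical placeholder word list, for illustration only.
BADWORDS = ["stupid", "idiot"]

# One case-insensitive pattern that matches any listed word on word boundaries.
BADWORD_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(word) for word in BADWORDS) + r")\b",
    re.IGNORECASE,
)


def count_badwords(texts):
    """Count bad word matches across an iterable of strings."""
    return sum(len(BADWORD_RE.findall(text)) for text in texts)


def badword_features(labels, descriptions):
    """Build a small feature dict from {language: text} mappings.

    `labels` and `descriptions` mimic the per-language structure of a
    Wikidata item; the feature names are assumptions made for this sketch.
    """
    return {
        "label_badwords": count_badwords(labels.values()),
        "description_badwords": count_badwords(descriptions.values()),
    }


features = badword_features(
    labels={"en": "Douglas Adams"},
    descriptions={"en": "a stupid idiot", "de": "Schriftsteller"},
)
```

Counts like these could then be passed to a classifier alongside the other edit features.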

New model proposal: Draft topic prediction

We're working on better ways for routing new page drafts to subject matter experts for review. See our documentation pages. We'll have datasets and some modeling experiments completed soon.

At Wikimania 2017
T170015: [Workshop] How can I get ORES for my wiki?

JADE schema and design

We've spent some time planning how we'll implement the JADE system, which enables ORES users to give us feedback and have that feedback integrated into score results.

T175192: Design JADE scoring schema

For more info, see the project's home page and sub-pages.

We're actively recruiting ORES stakeholders to be part of our working group.


We're in the process of rolling out a major refactor of the core revscoring library. One of the most exciting new features is the ability for ORES API consumers to fine-tune the thresholds used to define prediction intervals, e.g. "Very likely damaging". These thresholds will be different on every wiki, and the new interface allows us to query statistics built into the model, and satisfy criteria like "get me the threshold with the maximum filter rate, with a recall of at least 90%".
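
To make the kind of criterion quoted above more concrete, here is a minimal sketch of the selection logic: keep every candidate threshold whose recall meets the minimum, then pick the one with the highest filter rate. This is not the ORES or revscoring interface. The `ThresholdStat` type, the `best_threshold` function, and the statistics table are all made up for illustration.

```python
from typing import Iterable, NamedTuple, Optional


class ThresholdStat(NamedTuple):
    """Statistics for one candidate threshold (hypothetical structure)."""
    threshold: float
    recall: float
    filter_rate: float


def best_threshold(
    stats: Iterable[ThresholdStat], min_recall: float = 0.9
) -> Optional[ThresholdStat]:
    """Return the stat with the maximum filter rate among those whose
    recall is at least `min_recall`, or None if no threshold qualifies."""
    candidates = [s for s in stats if s.recall >= min_recall]
    if not candidates:
        return None
    return max(candidates, key=lambda s: s.filter_rate)


# Invented numbers: raising the threshold filters out more edits
# but catches fewer of the damaging ones.
STATS = [
    ThresholdStat(threshold=0.1, recall=0.99, filter_rate=0.40),
    ThresholdStat(threshold=0.3, recall=0.95, filter_rate=0.70),
    ThresholdStat(threshold=0.5, recall=0.91, filter_rate=0.85),
    ThresholdStat(threshold=0.7, recall=0.80, filter_rate=0.93),
    ThresholdStat(threshold=0.9, recall=0.55, filter_rate=0.98),
]

best = best_threshold(STATS, min_recall=0.9)
```

With these invented numbers, the selected threshold is 0.5: it has the highest filter rate among the rows that still reach 90% recall. Because the underlying statistics differ per wiki, the same criterion would generally select a different threshold on each one.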

For more details, see the blog post: More/better model information and "threshold optimizations"

Written by awight on Oct 18 2017, 5:56 PM.