June 3rd, 2017

Updates now coming to the phame blog! We made presentations and gathered new collaborators at the Wikimedia Hackathon 2017 in Vienna. ORES is back in api.php. Wiki labels has stats. ORES in CODFW fell over for a while, but it's back.

Hey folks,

I'll be posting updates here on the phame blog from now on, but if you'd prefer to be notified via the mailing lists we used to post to, that's OK. I'll make sure that the highlights and a link to each post get pushed there too.

We had a big presence at the Wikimedia Hackathon 2017 in Vienna. We kicked off a lot of new language-focused collaborations and we deployed a new Item Quality model for Wikidata.

French and Finnish Wikipedias now have advanced edit quality prediction support!

ORES is available through api.php again via rvprop=oresscores and rcprop=oresscores.

Wiki labels now has a new stats reporting interface. Check out

We had a major hiccup when failing over to CODFW, but we worked it out and ORES is very happy again.

See the sections below for details.

Labeling campaigns

We deployed a new edit quality labeling campaign to English Wiktionary(T165876) and we're looking for someone who can work as a liaison for this task. We've also deployed secondary labeling campaigns to Finnish Wikipedia(T166558) and Turkish Wikipedia(T164672). These secondary campaigns help us improve ORES accuracy.

Outreach & comms

We hosted a session at the Wikimedia Hackathon to tell people about ORES and show how to work with us to get support for your local wiki(T165397). We also worked with the Collaboration Team to announce that ORES Review Tool would not be enabled by default and the New Filters would be deployed as a beta feature(T163153).

New development

Lots of things here. In our modeling library, we implemented the basics of Greek and Bengali language assets so that we can start working on prediction models(T166793, T162620). After talking to people at the Wikimedia Hackathon about peculiar language overlap, we implemented a regex exclusions strategy(T166793) that will allow us to clearly state that "ha" is not laughing in Hungarian or Italian, but it is in a lot of other contexts.
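To make the exclusions idea concrete, here is a minimal sketch in Python. It is not the code from our modeling library; the names `build_matcher`, `SHARED_INFORMALS`, `generic` and `hungarian` are hypothetical and only illustrate how a shared pattern list can be reused while a language opts out of specific matches.

```python
import re

def build_matcher(patterns, exclusions=()):
    """Return a function that lists pattern matches, skipping excluded words.

    Hypothetical helper for illustration only.
    """
    regex = re.compile(r"\b(?:" + "|".join(patterns) + r")\b", re.IGNORECASE)
    excluded = {word.lower() for word in exclusions}

    def matches(text):
        return [
            m.group(0)
            for m in regex.finditer(text)
            if m.group(0).lower() not in excluded
        ]

    return matches

# Patterns shared across many languages: "ha", "haha", "lol", "loll", ...
SHARED_INFORMALS = [r"(?:ha)+", r"lol+"]

# Most contexts treat a bare "ha" as laughing.
generic = build_matcher(SHARED_INFORMALS)

# "ha" is an ordinary word in Hungarian, so it is excluded there,
# while "haha" still counts.
hungarian = build_matcher(SHARED_INFORMALS, exclusions=["ha"])
```

With this sketch, `generic("ha ha")` reports both words, while `hungarian("ha esik, haha")` reports only "haha".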

We also spent some time exploring the overlap of the "damaging" and "goodfaith" models on Wikipedia(T163995). We were able to show that there's useful overlap that will allow editors working on newcomer socialization to find goodfaith newcomers who are running into trouble. The Collaboration Team adjusted the thresholds in New Filters in response to our analysis(T164621).
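As a rough illustration of what "useful overlap" means, the sketch below picks out edits that score high on both models: probably damaging, but probably made in good faith. The function name, the record layout and the threshold values are all assumptions made up for this example; they are not the thresholds used in New Filters.

```python
def goodfaith_but_damaging(scored_edits, damaging_min=0.6, goodfaith_min=0.7):
    """Return rev IDs of edits that look damaging but well-intentioned.

    Each edit is a dict with hypothetical keys "rev_id", "damaging" and
    "goodfaith", where the last two are probabilities between 0 and 1.
    """
    return [
        edit["rev_id"]
        for edit in scored_edits
        if edit["damaging"] >= damaging_min and edit["goodfaith"] >= goodfaith_min
    ]

# Made-up scores, for illustration only.
sample = [
    {"rev_id": 1, "damaging": 0.9, "goodfaith": 0.1},  # likely vandalism
    {"rev_id": 2, "damaging": 0.8, "goodfaith": 0.9},  # newcomer in trouble
    {"rev_id": 3, "damaging": 0.1, "goodfaith": 0.95}, # fine edit
]
```

Here only revision 2 falls in the overlap, which is the kind of edit someone working on newcomer socialization would want to look at.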

Using data from Wiki labels(T157495), we trained a basic item quality model for Wikidata(T164862) and demonstrated it at the Wikimedia Hackathon(T166054). We used data from Wiki labels(T130261, T163012) to build advanced edit quality models for French and Finnish Wikipedia(T130282, T163013) and those are now deployed in ORES(T166047).

We implemented a new stats reporting interface in Wiki labels(T139956) and announced it(T166529). This interface makes it easier for people managing campaigns in Wiki labels to track progress. It's been a long time coming. Props to @Ladsgroup for doing a bunch of work to make it happen.

Finally, we implemented a new "score_revisions" utility that makes it quick and easy to generate scores for a set of revisions using the ORES service(T164547). This is really useful for researchers who want lots of scores and would like to avoid taking down ORES. Personally, I've been using it to audit ORES.
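The reason a utility like this is kind to ORES is batching: many revisions go into each request instead of one request per revision. The sketch below shows that general idea only. It is not the "score_revisions" implementation, and the host, the URL layout and the batch size are assumptions for illustration.

```python
from urllib.parse import urlencode

ORES_HOST = "https://ores.wikimedia.org"  # assumption: public ORES host

def batches(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def score_request_urls(context, models, rev_ids, batch_size=50):
    """Build one request URL per batch of revision IDs.

    The URL layout is an assumed one; check the ORES documentation
    for the paths and parameters your deployment actually supports.
    """
    urls = []
    for batch in batches(list(rev_ids), batch_size):
        query = urlencode({
            "models": "|".join(models),
            "revids": "|".join(str(rev_id) for rev_id in batch),
        })
        urls.append(f"{ORES_HOST}/v3/scores/{context}/?{query}")
    return urls
```

For example, 120 revision IDs with the default batch size turn into three requests rather than 120.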

Maintenance and robustness

We did a major deployment of ORES in mid-April(T162892) that had some serious problems in CODFW but not in EQIAD, which was super confusing(T163950), so we re-routed traffic to EQIAD(350487). While investigating, we found out that some timeouts(T163944) and server errors(T163171, T163764, T163798) were due to the same problem: there were two servers in CODFW that we didn't know existed, so they weren't getting new deployments and were poisoning our worker queue with old code!

We also fixed a couple of regressions that popped up in the ORES Review Tool while new work was being done on New Filters (T165011, T164984). We fixed some weird tokenization issues due to diacritics in Bengali not being handled correctly(T164767).

We re-enabled ORES in api.php(T163687). Props to @Tgr for making this happen.
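If you want to try this out, here is a small sketch that builds the query URLs. It assumes English Wikipedia as the target wiki and only constructs the URLs; fetching them and reading the JSON is left to your HTTP client of choice. The function names are made up for this example.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"  # assumption: a wiki with ORES enabled

def recent_changes_with_scores_url(limit=5):
    """URL that lists recent changes along with their ORES scores."""
    params = {
        "action": "query",
        "list": "recentchanges",
        "rcprop": "ids|title|oresscores",
        "rclimit": limit,
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)

def revision_scores_url(title):
    """URL that fetches the latest revision of a page with its ORES scores."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "ids|oresscores",
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)
```

Calling `recent_changes_with_scores_url()` gives a URL you can paste into a browser to see scores next to each recent change.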

We fixed some issues with ORES swagger documentation(T162184) and some UI issues in Wiki labels related to button colors(T163222) and confusing error messages(T138563).


We finished off some data-flow diagrams for ORES(T154441). As part of transitioning to a Wikimedia Foundation team (Scoring Platform! Woot!), we've moved all the documentation for ORES and our team to Also, as part of the Tech Ops experimentation with failovers across datacenters, we updated our grafana metrics tracking to split metrics by datacenter(T163212). This helped us quite a bit with diagnosing the deployment issues we discussed in the last section.

That's all folks. I hope you enjoyed the new format!

Written by Halfak on Jun 3 2017, 8:24 PM.
Principal Research Scientist