Score all the things
Blog of the Scoring Platform team

Status Update (May 2, 2018)

Written by awight on May 2 2018, 8:05 PM.


  • We've started work on JADE in earnest, and the prototype is deployed to the beta cluster where it's available for testing and tool development.
  • Draft topic prerequisites are mostly falling into place, so we should be able to get the initial model deployed this month.
  • New, dynamic ORES support table shows up-to-date information about our progress for each wiki: https://tools.wmflabs.org/ores-support-checklist/
  • ORES is now served from its own cluster, which has given us a tremendous benefit in both performance and stability.
  • More ORES support for Arabic, Bengali, Catalan, Hungarian, Latvian, and Swedish Wikipedia.

Status Update (January 30, 2018)

Written by awight on Jan 30 2018, 7:24 PM.


  • Deployed Revscoring 2.0. Each scoring model includes statistics that can be used to query and choose an appropriate threshold depending on the use case.
  • Rewrote ORES extension, improving code quality and test coverage. Failures will cause graceful degradation rather than breaking pages that rely on ORES.
  • Google Code-in (GCI) happened, and some work has been done on Wikilabels.
  • The ORES labs cluster has been migrated to Debian Stretch, and we're ready to migrate production clusters.
  • The "draft topic" model is trained and working. Support for the model in ORES is in progress.
  • New languages, new campaigns, new models. We've deployed advanced edit quality models to Simple English, Spanish, and Swedish Wikipedia, Spanish Wikibooks, and basic edit quality to Icelandic Wikipedia and Spanish Wikiquote. Preliminary edit quality campaigns are finished for Hungarian and Serbian Wikipedia.
  • JADE (auditing system) work is continuing. We have designed a database schema, written some code for the backend service, and planned an event-based architecture plus content-handled Jade and Jade_talk namespaces within MediaWiki.
  • Draftquality data is cached in the ORES extension and is made available to other extensions.

Status update (October 6, 2017)

Written by awight on Oct 18 2017, 5:56 PM.

New language support for Bengali, Greek, and Tamil. New advanced edit quality support for Albanian and Romanian. We cleaned up the old 'reverted' models where better support is available. We're working on moving to a new dedicated cluster. We improved some models by exploring new sources of signal and cleaning datasets. We started work on JADE and presented on The Keilana Effect at Wikimania.


Wikilabels incident: Reversed diffs!

Written by Halfak on Aug 31 2017, 2:02 PM.

Today, we discovered a major regression in Wikilabels. We've patched the issue and made an emergency deployment. We also deleted some labels that were saved while the system was compromised. In this post, we'll describe what happened.


More/better model information and "threshold optimizations"

Written by Halfak on Aug 29 2017, 10:41 PM.

Today, I'm writing to announce a breaking change in ORES that will come out about a month from now. It will only change how information about prediction models is stored and reported. This information is used by some tools to set thresholds at specified levels of confidence (e.g. "give me the threshold that gives 90% recall"). In this blog post, I'll explain how this is currently done and how it will be done once we deploy the change.
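
To illustrate what a request like "give me the threshold that gives 90% recall" means, here is a minimal sketch. It is not the ORES or Revscoring API: the function names and the sample data are hypothetical, and it assumes each item has a model score and a true/false label, with an item counted as predicted positive when its score is at or above the threshold.

```python
from typing import List, Tuple

# Hypothetical (score, is_damaging) pairs, made up for illustration only.
sample: List[Tuple[float, bool]] = [
    (0.95, True), (0.90, True), (0.80, False), (0.70, True),
    (0.60, True), (0.40, False), (0.30, True), (0.10, False),
]


def recall_at(scored: List[Tuple[float, bool]], threshold: float) -> float:
    """Fraction of true positives whose score is at or above the threshold."""
    positives = [score for score, label in scored if label]
    if not positives:
        raise ValueError("need at least one positive example")
    caught = sum(1 for score in positives if score >= threshold)
    return caught / len(positives)


def threshold_for_recall(scored: List[Tuple[float, bool]],
                         target_recall: float) -> float:
    """Return the highest threshold whose recall is at least target_recall."""
    if not 0 < target_recall <= 1:
        raise ValueError("target_recall must be in (0, 1]")
    positives = sorted((score for score, label in scored if label),
                       reverse=True)
    if not positives:
        raise ValueError("need at least one positive example")
    # Walk down from the highest-scoring positive; stop as soon as enough
    # positives sit at or above the candidate threshold.
    for count, score in enumerate(positives, start=1):
        if count / len(positives) >= target_recall:
            return score
    return positives[-1]


if __name__ == "__main__":
    t = threshold_for_recall(sample, 0.9)
    print(f"threshold={t}, recall={recall_at(sample, t)}")
```

A lower threshold catches more of the true positives (higher recall) at the cost of flagging more items overall, which is why a tool would pick the threshold to suit its use case.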


Laughing ORES to death with regular expressions and fake threads

Written by Halfak on Aug 17 2017, 9:29 PM.

At 1100 UTC on June 23rd, ORES started to struggle. Within a half hour, it had fully choked and could no longer respond to any requests. It took us 10 hours to diagnose the problem, solve it, and consider it solved. We learned some valuable lessons when studying and addressing this issue.


Announcing the Scoring Platform team

Written by Halfak on Jul 21 2017, 4:46 PM.

The Wikimedia Foundation’s new Scoring Platform team, led by Aaron Halfaker, will be working on democratizing access to AI, developing new types of AI predictions, and pushing the state of the art with regard to the ethical practice of AI development.


Status update (July 11th, 2017)

Written by Halfak on Jul 12 2017, 10:44 PM.

Two outages with documentation. Revscoring 2.0 coming with better model information and "thresholds". New support for Romanian, Albanian, Tamil, Greek, and Bengali. We're officially welcoming @awight to the team!


Status update (June 3rd, 2017)

Written by Halfak on Jun 3 2017, 8:24 PM.

Updates now coming to the phame blog! We made presentations and gathered new collaborators at the Wikimedia Hackathon 2017 in Vienna. ORES is back in api.php. Wikilabels has stats. ORES in CODFW fell over for a while, but it's back.


Join my Reddit AMA about Wikipedia and ethical, transparent AI

Written by Halfak on Jun 3 2017, 6:35 PM.

I wanted to let you know about an upcoming experimental Reddit AMA ("ask me anything") chat we have planned. It will focus on artificial intelligence on Wikipedia and how we're working to counteract vandalism while also making life better for newcomers.


Status update (April 14th, 2017)

Written by Halfak on Jun 3 2017, 6:30 PM.

In this update, I'm going to change some things up to try and make this update easier for you to consume. The biggest change you'll notice is that I've broken up the [#] references in each section. I hope that saves you some scrolling and confusion. You'll also notice that I have changed the subject line from "Revision scoring" to "Scoring Platform" because it's now clear that, come July, I'll be leading a new team with that name at the Wikimedia Foundation. There'll be an announcement about that coming once our budget is finalized. I'll try to keep this subject consistent for the foreseeable future so that your email clients will continue to group the updates into one big thread.


AI Wishlist initialized and a new Phab Tag (January 31st, 2017)

Written by Halfak on Jun 3 2017, 6:21 PM.

I hosted the AI Wishlist session at the Developer Summit (T147710). At that session, we brainstormed a set of AIs that we think would be interesting to implement. Generally, I asked people to do their best to follow a template that would help us remember why the AI was important, what it would help with, and what resources might help get it implemented. See artificial-intelligence


Deployment of ORES review tool in English Wikipedia as a beta feature (August 23rd, 2016)

Written by Halfak on Jun 3 2017, 4:51 PM.

We, the Revision Scoring Team, are happy to announce the deployment of the ORES review tool as a beta feature on *English Wikipedia*. Once enabled, ORES highlights edits that are likely to be damaging in Special:RecentChanges, Special:Watchlist and Special:Contributions to help you prioritize your patrolling work. ORES detects damaging edits using a basic prediction model based on past damage.
