Recent Posts

Measuring Wikipedia page load times

Written by Krinkle on Tue, Jan 9, 6:25 PM in The speed of thought.

This post shows how we measure and interpret load times on Wikipedia. It also explains what real-user metrics are, and how percentiles work.
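
As a rough illustration of how percentiles summarize load times, here is a minimal sketch using the nearest-rank method. The sample values and the `percentile` helper are made up for this example and are not taken from the post or from Wikimedia's tooling.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical page load times in milliseconds (illustrative only).
load_times_ms = [420, 510, 530, 610, 640, 700, 880, 950, 1200, 4800]

median = percentile(load_times_ms, 50)  # typical experience
p95 = percentile(load_times_ms, 95)     # slow tail
mean = sum(load_times_ms) / len(load_times_ms)

print(median, p95, mean)
```

In this made-up sample a single slow page view pulls the mean well above the median, which is one reason percentiles are often preferred over averages for real-user data.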


Ubuntu Trusty now deprecated for new WMCS instances

Written by Andrew on Nov 20 2017, 6:32 PM in Clouds & Unicorns.

Long ago, the Wikimedia Operations team made the decision to phase out use of Ubuntu servers in favor of Debian. It's a long, slow process that is still ongoing, but in production Trusty is running on an ever-shrinking minority of our servers.


The journey to Thumbor, part 3: development and deployment strategy

Written by Gilles on Nov 20 2017, 12:42 PM in The speed of thought.

In the last blog post I described where Thumbor fits in our media thumbnailing stack. Introducing Thumbor replaces an existing service, and as such it's important that it doesn't perform worse than its predecessor. We came up with a strategy to reach feature parity and ensure a launch that would be invisible to end users.


The journey to Thumbor, part 2: thumbnailing architecture

Written by Gilles on Nov 17 2017, 3:17 PM in The speed of thought.

Thumbor has now been serving all public thumbnail traffic for Wikimedia production since late June 2017.


Selenium Ruby framework deprecated

Written by zeljkofilipin on Oct 30 2017, 1:44 PM in Doing the needful.

This is your friendly but final warning that we are replacing Selenium tests written in Ruby with tests in Node.js. There will be no more reminders. The Ruby stack will no longer be maintained. For more information see T139740 and T173488.


Tech talk: Selenium tests in Node.js

Written by zeljkofilipin on Oct 27 2017, 12:04 PM in Doing the needful.

Željko Filipin, Engineer (Contractor) from Release Engineering team. That's me! 👋


Status update (October 6, 2017)

Written by awight on Oct 18 2017, 5:56 PM in Score all the things.

New language support for Bengali, Greek, and Tamil. New advanced edit quality support for Albanian and Romanian. We cleaned up the old 'reverted' models where better support is available. We're working on moving to a new dedicated cluster. We improved some models by exploring new sources of signal and cleaning datasets. We started work on JADE and presented on The Keilana Effect at Wikimania.


Automated OpenStack Testing, now with charts and graphs

Written by Andrew on Sep 29 2017, 9:26 PM in Clouds & Unicorns.

One of our quarterly goals was "Define a metric to track OpenStack system availability". Despite the weak phrasing, we elected to not only pick something to measure but also to actually measure it.


New Wiki Replica servers ready for use

Written by bd808 on Sep 25 2017, 11:43 PM in Clouds & Unicorns.

The current physical servers for the <wiki>_p Wiki Replica databases are at the end of their useful life. Work started over a year ago on a project involving the DBA team and cloud-services-team to replace these aging servers (T140788). Besides being five years old, the current servers have other issues that the DBA team took this opportunity to fix:

  • Data drift from production (T138967)
  • No way to give different levels of service for realtime applications vs analytics queries
  • No automatic failover to another server when one failed

Selenium Ruby framework deprecation (September)

Written by zeljkofilipin on Sep 25 2017, 3:27 PM in Doing the needful.

Originally an email sent on September 25 2017 to qa, engineering and wikitech-l mailing lists.


Selenium Ruby framework deprecation

Written by zeljkofilipin on Sep 25 2017, 3:14 PM in Doing the needful.

Originally an email sent on August 23 2017 to qa, engineering and wikitech-l mailing lists.


Selenium tests in Node.js

Written by zeljkofilipin on Sep 25 2017, 2:57 PM in Doing the needful.

Originally an email sent on April 3 2017 to qa, engineering and wikitech-l mailing lists.


Introducing the Cloud Services Team: What we do, and how we can help you

Written by bd808 on Sep 13 2017, 6:44 PM in Clouds & Unicorns.

24% of Wikipedia edits over a three-month period in 2016 were completed by software hosted in Cloud Services projects. In the same time period, 3.8 billion Action API requests were made from Cloud Services. We are the newly formed Cloud Services team at the Foundation, which maintains a stable and efficient public cloud hosting platform for technical projects relevant to the Wikimedia movement.


Wikilabels incident: Reversed diffs!

Written by Halfak on Aug 31 2017, 2:02 PM in Score all the things.

Today, we discovered a major regression in Wikilabels. We've patched the issue and made an emergency deployment. We also deleted some labels that were saved while the system was compromised. In this post, we'll describe what happened.


More/better model information and "threshold optimizations"

Written by Halfak on Aug 29 2017, 10:41 PM in Score all the things.

Today, I'm writing to announce a breaking change in ORES that will come out about a month from now. It will only change how information about prediction models is stored and reported. This information is used by some tools to set thresholds at specified levels of confidence (e.g. "give me the threshold that gives 90% recall"). In this blog post, I'll explain how this is currently done and how it will be done once we deploy the change.
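
To make the idea of "the threshold that gives 90% recall" concrete, here is a minimal sketch. It is not ORES's API or its actual method; the `threshold_for_recall` helper and the scored examples are hypothetical and exist only for illustration.

```python
def threshold_for_recall(scored, target_recall):
    """Return the highest score threshold whose recall meets the target.

    scored: list of (score, is_positive) pairs from a labeled test set.
    An item counts as flagged when its score is >= the threshold.
    """
    positives = sum(1 for _, is_pos in scored if is_pos)
    if positives == 0:
        raise ValueError("need at least one positive example")
    # Try candidate thresholds from strictest to most permissive.
    for threshold in sorted({score for score, _ in scored}, reverse=True):
        caught = sum(1 for score, is_pos in scored
                     if is_pos and score >= threshold)
        if caught / positives >= target_recall:
            return threshold
    return None  # only reachable if target_recall > 1


# Hypothetical model scores: 10 damaging edits (True), 5 good edits (False).
examples = [
    (0.99, True), (0.95, True), (0.90, True), (0.85, True), (0.80, True),
    (0.70, True), (0.65, False), (0.60, True), (0.50, True), (0.40, False),
    (0.35, True), (0.30, False), (0.20, False), (0.10, True), (0.05, False),
]

threshold = threshold_for_recall(examples, 0.90)
print(threshold)
```

Choosing the highest qualifying threshold keeps the number of flagged items as small as possible while still meeting the recall target.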


Tool creation added to

Written by bd808 on Aug 29 2017, 3:41 AM in Clouds & Unicorns.

is a management interface for Toolforge users. On 2017-08-24, a new major update to the application was deployed which added support for creating new tool accounts and managing metadata associated with all tool accounts.


New dedicated puppetmasters for cloud instances

Written by Andrew on Aug 22 2017, 10:29 PM in Clouds & Unicorns.

Back in year zero of Wikimedia Labs, shockingly many services were confined to a single box. A server named 'virt0' hosted the Wikitech website, Keystone, Glance, LDAP, and RabbitMQ, ran a puppetmaster, and did a bunch of other things.


Laughing ORES to death with regular expressions and fake threads

Written by Halfak on Aug 17 2017, 9:29 PM in Score all the things.

At 1100 UTC on June 23rd, ORES started to struggle. Within a half hour, it had fully choked and could no longer respond to any requests. It took us 10 hours to diagnose the problem, solve it, and consider it solved. We learned some valuable lessons when studying and addressing this issue.



Written by phuedx on Aug 16 2017, 12:58 PM in Leave it to the prose.

The Reading Web team recently discovered a bug in Firefox wherein a load event is fired when Firefox loads certain pages from its Back-Forward Cache (BFCache). To JavaScript on those pages, this event is a second load event (the first having been fired before the user navigated away from the page). This proved to be problematic for the cornerstone of our instrumentation, the EventLogging extension, and delayed the deployment of Page Previews for approximately three months.


Toolforge provides proxied mirrors of cdnjs and now fontcdn, for your usage and user-privacy

Written by Quiddity on Aug 2 2017, 1:55 AM in Clouds & Unicorns.

Tool owners want to create accessible and pleasing tools. The choice of fonts has previously been difficult, because directly accessing Google's large collection of open source and freely licensed fonts required sharing personally identifiable information (PII) such as IPs, referrer headers, etc. with a third party (Google). Embedding external resources (fonts, CSS, JavaScript, images, etc.) from any third party into webpages hosted on Toolforge or other Cloud VPS projects causes a potential conflict with the Wikimedia Privacy Policy. Web browsers will attempt to load the resources automatically, and this will in turn expose the user's IP address, User-Agent, and other information that is by default included in an HTTP request to the third party. This sharing of data with a third party is a violation of the default Privacy Policy. With explicit consent Toolforge and Cloud VPS projects can collect and share some information, but it is difficult to secure that consent with respect to embedded resources.


Announcing the Scoring Platform team

Written by Halfak on Jul 21 2017, 4:46 PM in Score all the things.

The Wikimedia Foundation’s new Scoring Platform team, led by Aaron Halfaker, will be working on democratizing access to AI, developing new types of AI predictions, and pushing the state of the art with regard to ethical practice of AI development.


Toolforge Elasticsearch upgraded to 5.3.2

Written by bd808 on Jul 14 2017, 12:49 AM in Clouds & Unicorns.

The shared Elasticsearch cluster hosted in Toolforge was upgraded from 2.3.5 to 5.3.2 today (T164842). This upgrade comes with a lot of breaking API changes for clients and indexes, and should have been announced in advance. @bd808 apologizes for that oversight.


Status update (July 11th, 2017)

Written by Halfak on Jul 12 2017, 10:44 PM in Score all the things.

Two outages with documentation. Revscoring 2.0 coming with better model information and "thresholds". New support for Romanian, Albanian, Tamil, Greek, and Bengali. We're officially welcoming @awight to the team!


Official Debian Stretch image now available

Written by Andrew on Jun 20 2017, 4:00 PM in Clouds & Unicorns.

Debian Stretch was officially released on Saturday[1], and I've built a new Stretch base image for VPS use in the WMF cloud. All projects should now see an image type of 'debian-9.0-stretch' available when creating new instances.


The journey to Thumbor, part 1: rationale

Written by Gilles on Jun 20 2017, 3:33 PM in The speed of thought.

We are currently in the final stages of deploying Thumbor to Wikimedia production, where it will generate media thumbnails for all our public wikis. Up until now, MediaWiki was responsible for generating thumbnails.


Watroles returns! (In a different place and with a different name and totally different code.)

Written by Andrew on Jun 20 2017, 3:26 AM in Clouds & Unicorns.

Back in the dark ages of Labs, all instance puppet configuration was handled using the puppet LDAP backend. Each instance had a big record in LDAP that handled DNS, puppet classes, puppet variables, etc. It was a bit clunky, but this monolithic setup allowed @yuvipanda to throw together a simple but very useful tool, 'watroles'. Watroles answered two questions:


Looking back: improvements to edit save time

Written by Gilles on Jun 12 2017, 9:32 AM in The speed of thought.

The WMF's financial year and its annual plan are coming to an end, and one of the Performance team's goals this past year was to reduce the amount of time it takes to save an edit on a wiki.


Improving time-to-logo performance with preload links

Written by Gilles on Jun 7 2017, 7:38 AM in The speed of thought.

One of the goals of the Wikimedia Performance Team is to improve the performance of MediaWiki and the broader software stack used on Wikimedia wikis. In this article we’ll describe a small performance improvement we’ve implemented for MediaWiki and recently deployed to production for Wikimedia. It highlights some of the unique problems we encounter on Wikimedia sites and how new web standards can be leveraged to improve performance.


#wikimedia-labs irc channel renamed to #wikimedia-cloud

Written by bd808 on Jun 5 2017, 3:11 PM in Clouds & Unicorns.

The first very visible step in the plan to rename things away from the term 'labs' happened around 2017-06-05 15:00Z when IRC admins made the #wikimedia-labs IRC channel on Freenode invite-only and set up an automatic redirect to the new #wikimedia-cloud channel.


Status update (June 3rd, 2017)

Written by Halfak on Jun 3 2017, 8:24 PM in Score all the things.

Updates now coming to the phame blog! We made presentations and gathered new collaborators at the Wikimedia Hackathon 2017 in Vienna. ORES is back in api.php. Wikilabels has stats. ORES in CODFW fell over for a while, but it's back.


Join my Reddit AMA about Wikipedia and ethical, transparent AI

Written by Halfak on Jun 3 2017, 6:35 PM in Score all the things.

I wanted to let you know about an upcoming experimental Reddit AMA ("ask me anything") chat we have planned. It will focus on artificial intelligence on Wikipedia and how we're working to counteract vandalism while also making life better for newcomers.


Status update (April 14th, 2017)

Written by Halfak on Jun 3 2017, 6:30 PM in Score all the things.

In this update, I'm going to change some things up to try and make this update easier for you to consume. The biggest change you'll notice is that I've broken up the [#] references in each section. I hope that saves you some scrolling and confusion. You'll also notice that I have changed the subject line from "Revision scoring" to "Scoring Platform" because it's now clear that, come July, I'll be leading a new team with that name at the Wikimedia Foundation. There'll be an announcement about that coming once our budget is finalized. I'll try to keep this subject consistent for the foreseeable future so that your email clients will continue to group the updates into one big thread.


AI Wishlist initialized and a new Phab Tag (January 31st, 2017)

Written by Halfak on Jun 3 2017, 6:21 PM in Score all the things.

I hosted the AI Wishlist session at the Developer Summit (T147710). At that session, we brainstormed a set of AIs that we think would be interesting to implement. Generally I asked people to do their best to follow a template that would help us remember why the AI was important, what it would help with, and what resources might help get it implemented. See artificial-intelligence


Deployment of ORES review tool in English Wikipedia as a beta feature (August 23rd, 2016)

Written by Halfak on Jun 3 2017, 4:51 PM in Score all the things.

We, the Revision Scoring Team, are happy to announce the deployment of the ORES review tool as a beta feature on *English Wikipedia*. Once enabled, ORES highlights edits that are likely to be damaging in Special:RecentChanges, Special:Watchlist and Special:Contributions to help you prioritize your patrolling work. ORES detects damaging edits using a basic prediction model based on past damage.


Investigating a performance improvement

Written by Gilles on Jun 2 2017, 10:02 AM in The speed of thought.

Last week @Jdlrobson pinged me by email about a performance improvement his team noticed for large wiki articles on the mobile site in our synthetic tests run on WebPageTest. The improvement looked like this, a sudden drop in SpeedIndex (where lower is better):


New feature: Embed videos from Commons into Phabricator markup

Written by mmodell on Jun 1 2017, 11:49 PM in Doing the needful.

I just finished deploying an update to Phabricator which includes a simple but rather useful feature:


Updated `webservice` command deployed

Written by bd808 on May 31 2017, 8:18 PM in Clouds & Unicorns.

The v0.37 build of rOSTW operations-software-tools-webservice has been deployed to Toolforge hosts and Tools-Kubernetes Docker images.


Project-wide sudo policies in Horizon

Written by Andrew on May 30 2017, 8:02 PM in Clouds & Unicorns.

When @Ryan_Lane first built OpenStackManager and Wikitech, one of the first features he added was an interface to set up project-wide sudo policies via LDAP.


Manage Instance on Horizon (only)

Written by Andrew on May 26 2017, 7:40 PM in Clouds & Unicorns.

For nearly a year, Horizon has supported instance management. It is altogether a better tool than the Special:NovaInstance page on Wikitech -- Horizon provides more useful status information for VMs, and has much better configuration management (for example, changing security groups for already-running instances).


Experimental Debian Stretch image now available

Written by Andrew on May 25 2017, 5:47 PM in Clouds & Unicorns.

I've just installed a new public base image, 'debian-9.0-stretch (experimental)', and made it available for all projects. It should appear in the standard 'Source' UI in Horizon any time you create a new VM.


Labs Openstack upgrade on Tuesday, 2016-08-02, 16:00 UTC

Written by Andrew on Aug 1 2016, 5:01 PM in Clouds & Unicorns.

Andrew will be upgrading our Openstack install from version 'Kilo' to version 'Liberty' on Tuesday the 2nd. The upgrade is scheduled to take up to three hours. Here's what to expect:


Sponsored Phabricator Improvements

Written by mmodell on Jul 27 2016, 10:44 AM in Doing the needful.

In T135327, the WMF Technical Collaboration team collected a list of Phabricator bugs and feature requests from the Wikimedia Developer Community. After identifying the most promising requests from the community, these were presented to Phacility (the organization that builds and maintains Phabricator) for sponsored prioritization.


Labs is auditing and removing inactive projects.

Written by chasemp on Jul 8 2016, 4:20 PM in Clouds & Unicorns.

If you are exclusively a user of tool labs, you can ignore this post. If you use or administer another labs project, this REQUIRES ACTION ON YOUR PART.


Kubernetes Webservice Backend Available for PHP webservices

Written by chasemp on Jul 8 2016, 4:19 PM in Clouds & Unicorns.

The Kubernetes ('k8s') backend for Tool Labs webservices is open to beta testers from the community as a replacement for Grid Engine


Community Consultation on Labs Terms of Use: Round 1

Written by chasemp on May 26 2016, 6:55 PM in Clouds & Unicorns.

The Wikimedia Legal team is interested in revising, updating, and clarifying the existing Labs Terms of Use governing developers and their projects on labs.


Code Review Office Hours

Written by mmodell on May 9 2016, 9:50 PM in Doing the needful.

Starting Thursday May 12th, 13:00 PDT (20:00 GMT) we will be having the first weekly Code Review office hours on freenode IRC in the #wikimedia-codereview channel.


Horizon is now the best UI for Labs/Tools

Written by chasemp on Apr 4 2016, 9:35 PM in Clouds & Unicorns.

Horizon (OpenStack Dashboard) is the canonical implementation of OpenStack’s Dashboard, which provides a web-based user interface to OpenStack services.


New bastion at

Written by chasemp on Apr 4 2016, 7:51 PM in Clouds & Unicorns.

is on a new bastion host with twice the RAM and CPU of the old one. This should hopefully provide a better bandaid against it getting overloaded. More discussion about a longer term solution at


Kubernetes to 1.2 on Tuesday, 2016-04-05

Written by chasemp on Apr 1 2016, 5:30 PM in Clouds & Unicorns.

On Tuesday, 2016-04-05, we'll be upgrading Kubernetes to 1.2 and using a different deployment method as well. While this should have no user-facing impact (ideally!) the following things might be flaky for a period of time on that day:


What's new: Lots of improvements on

Written by mmodell on Feb 23 2016, 12:23 AM in Doing the needful.

Not a lot has changed for Wikimedia's instance of Phabricator over the past few months. That's because a lot has been happening behind the scenes, as well as upstream at Phacility. Members of the Release-Engineering-Team and Team-Practices group have been working since December 2015 to integrate various upstream changes; however, nothing was released to our production instance because there were so many important features that were in progress and not yet fully usable. Additionally, we had to figure out exactly how these features would fit with the specific needs of our project and test a lot of functionality to be sure that we would not break anyone's workflows.