Production Excellence #10: April 2019
Monthly update on our pursuit of operational excellence.

How’d we do in our pursuit of operational excellence last month? Read on to find out!

  • Month in numbers.
  • Highlighted stories.
  • Current problems.

📊 Month in numbers
  • 8 documented incidents. [1]
  • 30 new Wikimedia-prod-error tasks created. [2]
  • 31 Wikimedia-prod-error tasks closed. [3]

The number of incidents in April was relatively high at 8, both compared to earlier this year (4 in January, 7 in February, 8 in March) and compared to last year (4 in April 2018).

To read more about these incidents, their investigations, and conclusions, see wikitech.wikimedia.org/wiki/Incident_documentation#2019.

As of writing, there are 186 open Wikimedia-prod-error issues (up from 177 last month). [4]

📖 Rehabilitation of MediaWiki-DateFormatter

Following the report of a PHP error that happened when saving edits to certain pages, Tim Starling investigated. The investigation motivated a big commit that brings this class into the modern era. I think this change serves as a good overview of what’s changed in MediaWiki over the last 10 years, and demonstrates our current best practices.

Take a look at Gerrit change 502678 / T220563.

📉 Current problems

Take a look at the workboard and look for tasks that might need your help. The workboard lists known issues, grouped by the week in which they were first observed.


Or help someone who has already started on a patch:
Open prod-error tasks with a Patch-For-Review

Breakdown of recent months (past two weeks not included):

  • November: 2 issues left (unchanged).
  • December: 4 issues left (unchanged).
  • January: 1 issue got fixed. One last issue remaining (down from 2).
  • February: 2 issues were fixed. Another 3 issues remaining (down from 5).
  • March: 5 issues were fixed. Another 5 issues remaining (down from 10).
  • April: 14 new issues were found last month that remain unresolved.

By steward and software component, issues left from March and April:

  • Anti-Harassment / User blocking: T222170
  • CPT / Revision-backend (Save redirect pages): T220353
  • CPT / Revision-backend (Import a page): T219702
  • CPT / Revision-backend (Export pages for dumps): T220160
  • Growth / Watchlist: T220245
  • Growth / Page deletion (Restore an archived page): T219816
  • Growth / Page deletion (File pages): T222691
  • Growth / Echo (Job execution): T217079
  • Multimedia / File management (Upload mime error): T223728
  • Performance / Deferred-Updates: T221577
  • Search Platform / CirrusSearch (Job execution): T222921
  • (Unstewarded) / Page renaming: T223175, T221763, T221595

🎉 Thanks!

Thank you to everyone who has helped by reporting, investigating, or resolving problems in Wikimedia production, including: @aaron, @ArielGlenn, @Daimona, @dcausse, @EBernhardson, @Jdforrester-WMF, @Joe, @KartikMistry, @Ladsgroup, @Lucas_Werkmeister_WMDE, @MaxSem, @MusikAnimal, @Mvolz, @Niharika, @Nikerabbit, @Pchelolo, @pmiazga, @Reedy, @SBisson, @tstarling, and @Umherirrender.


Until next time,

– Timo Tijhof

🏴‍☠️ “One good deed is not enough to save a man.” “Though it seems enough to condemn him?” “Indeed…


[1] Incident reports by month and year.

[2] Tasks created.

[3] Tasks closed.

[4] Open tasks.

Written by Krinkle on May 31 2019, 7:21 PM.
Principal Engineer (Wikimedia Performance)