Production Excellence #9: March 2019
Monthly update on our striving for operational excellence.

How’d we do in striving for operational excellence last month? Read on to find out!

📊 Month in numbers
  • 8 documented incidents. [1]
  • 31 new Wikimedia-prod-error issues reported. [2]
  • 28 Wikimedia-prod-error issues closed. [3]

The number of incidents this month was slightly higher than in earlier months this year (7 in February, 4 in January) and than around this time last year (4 in March 2018, 7 in February 2018).

To read more about these incidents, their investigations, and conclusions, check wikitech.wikimedia.org/wiki/Incident_documentation#2019-03.

There are currently 177 open Wikimedia-prod-error issues, similar to last month. [4]

💡 Ideas: To suggest an investigation to highlight in a future edition, feel free to contact me by e-mail or by private message on IRC.

📉 Current problems

Take a look at the workboard and look for tasks that might need your help. The workboard lists known issues, grouped by the week in which they were first observed.


Or help someone that’s already started with their patch:
Open prod-error tasks with a Patch-For-Review

Breakdown of recent months (past two weeks not included):

  • September: Done! The last two issues were resolved.
  • October: Done! The last issue was resolved.
  • November: 2 issues left (from 1.33-wmf.2). 1 issue was fixed.
  • December: 4 issues left (from 1.33-wmf.9). 1 issue was fixed.
  • January: 2 issues left (1.33-wmf.13 – 14). 1 issue was fixed.
  • February: 5 issues (1.33-wmf.16 – 19).
  • March: 10 new issues (1.33-wmf.20 – 23).

By steward and software component, for issues remaining from February and March:

🎉 Thanks!

Thanks to @aaron, @Anomie, @Arlolra, @Daimona, @hashar, @Jdforrester-WMF, @kostajh, @matmarex, @MaxSem, @Niedzielski, @Nikerabbit, @Petar.petkovic, @santhosh, @ssastry, @Umherirrender, @WMDE-leszek, @zeljkofilipin, and everyone else who helped last month by reporting, investigating, or patching errors found in production!

Until next time,

– Timo Tijhof

🦅 “This isn’t flying. This is falling… with style!”


[1] Incidents. – wikitech.wikimedia.org/wiki/Special:PrefixIndex/Incident_documentation/201903 …

[2] Tasks created. – phabricator.wikimedia.org/maniphest/query …

[3] Tasks closed. – phabricator.wikimedia.org/maniphest/query …

[4] Open tasks. – phabricator.wikimedia.org/maniphest/query …

Written by Krinkle on Apr 21 2019, 6:51 PM.
Principal Engineer (Wikimedia Performance)