Production Excellence #11: May 2019
Monthly update on our striving for operational excellence.

How’d we do in our striving for operational excellence last month? Read on to find out!

📊 Month in numbers
  • 6 documented incidents. [1]
  • 41 new Wikimedia-prod-error tasks created. [2]
  • 36 Wikimedia-prod-error tasks closed. [3]

The number of incidents in May of this year was comparable to previous years (6 in May 2019, 2 in May 2018, 5 in May 2017), and previous months (6 in May, 8 in April, 8 in March) – comparisons at CodePen.

To read more about these incidents, their investigations, and pending actionables, check wikitech.wikimedia.org/wiki/Incident_documentation#2019.

As of writing, there are 201 open Wikimedia-prod-error tasks (up from 186 last month). [4]


📉 Current problems

Take a look at the workboard and find tasks that might need your help. The workboard lists known issues, grouped by the month in which they were first observed.

https://phabricator.wikimedia.org/tag/wikimedia-production-error

Or help someone who has already started on a patch:
Open prod-error tasks with a Patch-For-Review

Breakdown of recent months (past two weeks not included):

  • November: 2 issues left (unchanged).
  • December: 1 issue got fixed. 3 issues left (down from 4).
  • January: 1 issue left (unchanged).
  • February: 2 issues left (unchanged).
  • March: 1 issue got fixed. 4 issues remaining (down from 5).
  • April: 2 issues got fixed. 12 issues remain unresolved (down from 14).
  • May: 10 new issues found in May remain unresolved.

By steward and software component, unresolved issues from April and May:

  • Wikidata / Lexeme (API query fatal): T223995
  • Wikidata / WikibaseRepo (API Fatal hasSlot): T225104
  • Wikidata / WikibaseRepo (Diff link fatal): T224270
  • Wikidata / WikibaseRepo (Edit undo fatal): T224030
  • Growth / Echo (Notification storage): T217079
  • Growth / Flow (Topic link fatal): T224098
  • Growth / Page deletion (File pages): T222691
  • Multimedia or CPT / API (Image info fatal): T221812
  • CPT / PHP7 refactoring (File descriptions): T223728
  • CPT / Title refactor (Block log fatal): T224811
  • CPT / Title refactor (Pageview fatals): T224814
  • (Unstewarded) Page renaming: T223175, T205675

💡 Ideas: To suggest an investigation to write about in a future edition, contact me by e-mail, or private message on IRC.

🎉 Thanks!

Thank you to everyone who has helped by reporting, investigating, or resolving problems in Wikimedia production.

Until next time,

– Timo Tijhof

🎙 “It’s not too shabby, is it?”

Footnotes:

[1] Incidents. –
wikitech.wikimedia.org/wiki/Special:PrefixIndex…

[2] Tasks created. –
phabricator.wikimedia.org/maniphest/query…

[3] Tasks closed. –
phabricator.wikimedia.org/maniphest/query…

[4] Open tasks. –
phabricator.wikimedia.org/maniphest/query…

Written by Krinkle on Jul 1 2019, 6:56 PM.
Principal Engineer (Wikimedia Performance)