Production Excellence #10: April 2019
Monthly update on our pursuit of operational excellence.

How’d we do in our pursuit of operational excellence last month? Read on to find out!

  • Month in numbers.
  • Highlighted stories.
  • Current problems.
📊 Month in numbers
  • 8 documented incidents. [1]
  • 30 new Wikimedia-prod-error tasks created. [2]
  • 31 Wikimedia-prod-error tasks closed. [3]

The number of incidents in April was relatively high at 8, both compared to earlier this year (4 in January, 7 in February, 8 in March) and compared to last year (4 in April 2018).

To read more about these incidents, their investigations, and conclusions, check wikitech.wikimedia.org/wiki/Incident_documentation#2019.

As of writing, there are 186 open Wikimedia-prod-error issues (up from 177 last month). [4]

📖 Rehabilitation of MediaWiki-DateFormatter

Following the report of a PHP error that happened when saving edits to certain pages, Tim Starling investigated. The investigation motivated a big commit that brings this class into the modern era. I think this change serves as a good overview of what’s changed in MediaWiki over the last 10 years, and demonstrates our current best practices.

Take a look at Gerrit change 502678 / T220563.

📉 Current problems

Take a look at the workboard and look for tasks that might need your help. The workboard lists known issues, grouped by the week in which they were first observed.

https://phabricator.wikimedia.org/tag/wikimedia-production-error

Or help someone who has already started on a patch:
Open prod-error tasks with a Patch-For-Review

Breakdown of recent months (past two weeks not included):

  • November: 2 issues left (unchanged).
  • December: 4 issues left (unchanged).
  • January: 1 issue got fixed. One last issue remaining (down from 2).
  • February: 2 issues were fixed. Another 3 issues remaining (down from 5).
  • March: 5 issues were fixed. Another 5 issues remaining (down from 10).
  • April: 14 new issues were found last month that remain unresolved.

By steward and software component, issues left from March and April:

  • Anti-Harassment / User blocking: T222170
  • CPT / Revision-backend (Save redirect pages): T220353
  • CPT / Revision-backend (Import a page): T219702
  • CPT / Revision-backend (Export pages for dumps): T220160
  • Growth / Watchlist: T220245
  • Growth / Page deletion (Restore an archived page): T219816
  • Growth / Page deletion (File pages): T222691
  • Growth / Echo (Job execution): T217079
  • Multimedia / File management (Upload mime error): T223728
  • Performance / Deferred-Updates: T221577
  • Search Platform / CirrusSearch (Job execution): T222921
  • (Unstewarded) / Page renaming: T223175, T221763, T221595

🎉 Thanks!

Thank you to everyone who has helped by reporting, investigating, or resolving problems in Wikimedia production, including: @aaron, @ArielGlenn, @Daimona, @dcausse, @EBernhardson, @Jdforrester-WMF, @Joe, @KartikMistry, @Ladsgroup, @Lucas_Werkmeister_WMDE, @MaxSem, @MusikAnimal, @Mvolz, @Niharika, @Nikerabbit, @Pchelolo, @pmiazga, @Reedy, @SBisson, @tstarling, and @Umherirrender.

Thanks!

Until next time,

– Timo Tijhof

🏴‍☠️ “One good deed is not enough to save a man.” “Though it seems enough to condemn him?” “Indeed…

Footnotes:

[1] Incident reports by month and year. –
codepen.io/Krinkle/…

[2] Tasks created. –
phabricator.wikimedia.org/maniphest/query…

[3] Tasks closed. –
phabricator.wikimedia.org/maniphest/query…

[4] Open tasks. –
phabricator.wikimedia.org/maniphest/query…

Written by Krinkle on May 31 2019, 7:21 PM.
Principal Engineer (Wikimedia Performance)