How’d we do in our quest for operational excellence last month? Read on to find out!
- Month in numbers.
- Highlighted stories.
- Current problems.
📊 Month in numbers
- 4 documented incidents in January 2019. 
- 16 Wikimedia-prod-error tasks closed. 
- 17 Wikimedia-prod-error tasks created. 
*️⃣ Unable to move certain file pages
Xiplus reported that renaming a File page on zh.wikipedia.org led to a fatal database exception. Andre Klapper identified the stack trace from the logs, and Brad (@Anomie) investigated.
The File renaming failed because the File page did not have a media file associated with it (MediaWiki does not currently allow such a move). However, while handling this error, the code triggered a second, different error. As a result, the user was not told why the move failed. Instead, they received a generic error page about a fatal database exception.
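To illustrate how a second error can hide the first, here is a minimal Python sketch. It is not MediaWiki's code: `MoveNotAllowed`, `move_file_page`, `failing_cleanup`, and `render_move` are hypothetical names, and the messages are invented for the example. It only shows the general pattern in which a failure inside an error handler replaces the helpful message with a generic one.

```python
class MoveNotAllowed(Exception):
    """Expected failure: the requested move is not permitted."""


def move_file_page(has_media_file: bool) -> str:
    if not has_media_file:
        raise MoveNotAllowed("This file page has no media file, so it cannot be moved.")
    return "moved"


def failing_cleanup() -> None:
    # Hypothetical cleanup step that itself breaks.
    raise RuntimeError("fatal database exception")


def render_move(has_media_file: bool, cleanup) -> str:
    """Return the text the user would end up seeing."""
    try:
        try:
            return move_file_page(has_media_file)
        except MoveNotAllowed as reason:
            cleanup()  # if this raises, `reason` never reaches the user
            return f"Move failed: {reason}"
    except Exception as exc:
        # Last-resort handler: it only knows about the most recent error.
        return f"Generic error page: {exc}"
```

With `failing_cleanup`, the user sees only the generic page. With a cleanup step that succeeds (for example `lambda: None`), the original reason is shown.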
*️⃣ DBPerformance regression detected and fixed
During a routine audit of Logstash dashboards, I found a DBPerformance warning. The warning indicated that the limit of 0 for “master connections” was violated. That's a cryptic way of saying it found code in MediaWiki that uses a database master connection on a regular page view.
MediaWiki can have many replica database servers, but there can be only one master database at any given moment. To reduce the chances of overload, delayed edits, or network congestion, we make sure to use replicas whenever possible. We usually involve the master only when source data is being changed or is about to be changed, for example when editing a page or saving changes.
As the vast majority of traffic is page views, we set stricter limits on latency and on dependencies for page views. In particular, page views may (in the future) be routed to secondary data centres that don’t even have a master DB.
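The idea behind a “limit of 0 for master connections” can be sketched in a few lines of Python. This is an illustration only, not the actual DBPerformance implementation: `ConnectionAudit`, `page_view`, and the warning text are hypothetical names made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ConnectionAudit:
    """Counts master connections opened during one request."""
    master_limit: int
    master_connections: int = 0
    warnings: list = field(default_factory=list)

    def connect(self, role: str) -> str:
        # Only master connections count against the limit.
        if role == "master":
            self.master_connections += 1
            if self.master_connections > self.master_limit:
                self.warnings.append(
                    f"limit of {self.master_limit} for master connections violated"
                )
        return role


def page_view(audit: ConnectionAudit, read_from: str = "replica") -> list:
    """Simulate a page view that reads from the given database role."""
    audit.connect(read_from)
    return audit.warnings
```

A page view that reads from a replica produces no warnings under `ConnectionAudit(master_limit=0)`, while `page_view(ConnectionAudit(master_limit=0), read_from="master")` records one. A request that is expected to write, such as saving an edit, would be audited with a higher limit.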
*️⃣ TemplateData missing in action
@Tacsipacsi and @Evad37 both independently reported the same TemplateData issue. TemplateData powers the template insertion dialog in VisualEditor. It wasn't working for some templates after we deployed the 1.33-wmf.13 branch.
The error was “Argument 1 passed to ApiResult::setIndexedTagName() must be an instance of array, null given”. This means some code called a function with the wrong argument. For example, the variable name may have been misspelled, or it may have been the wrong variable, or (in this case) the variable didn't exist. In such a case, PHP implicitly assumes “null”.
Bartosz (@matmarex) found the culprit. The week before, I made a change to TemplateData that changed the “template parameter order” feature to be optional. This allows users to decide whether VisualEditor should force an order for the parameters in the wikitext. It turned out I forgot to update one of the references to this variable, which still assumed it was always present.
Brad (Anomie) fixed it later that week, and it was deployed the next day. Thanks! — T213953
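The same class of bug can be reproduced in a small Python sketch, with a dictionary lookup standing in for PHP's undefined variable. Everything here is hypothetical: `set_indexed_tag_name` only mimics a function that insists on a list, and the `"paramOrder"` key, `format_buggy`, and `format_fixed` are illustrative names. The guard in `format_fixed` is one possible way to handle an optional value, not necessarily the fix that was deployed.

```python
def set_indexed_tag_name(values, tag: str) -> dict:
    """Stand-in for a function that requires a list as its first argument."""
    if not isinstance(values, list):
        raise TypeError(f"argument 1 must be a list, {type(values).__name__} given")
    return {"tag": tag, "items": values}


def format_buggy(template_data: dict) -> dict:
    output = {"params": template_data["params"]}
    # Assumes the optional key is always present; .get() returns None when it is not.
    output["paramOrder"] = set_indexed_tag_name(template_data.get("paramOrder"), "p")
    return output


def format_fixed(template_data: dict) -> dict:
    output = {"params": template_data["params"]}
    # Only touch the optional value when it was actually provided.
    if "paramOrder" in template_data:
        output["paramOrder"] = set_indexed_tag_name(template_data["paramOrder"], "p")
    return output
```

Calling `format_buggy({"params": {"a": {}}})` raises a `TypeError`, while `format_fixed` returns the data without the optional key.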
📈 Current problems
Take a look at the workboard and look for tasks that might need your help. The workboard lists known issues, grouped by the week in which they were first observed.
As of 12 February 2019, there are 188 open Wikimedia-prod-error tasks. (We’ve had a slight increase since November: 165 in December, 172 in January.)
For this month’s edition, I’d like to draw attention to a few older issues that are still reproducible:
- [2013; Collection extension] Special:Book fatal error for blocked users. T56179
- [2013; CentralNotice] Fatal error when placeholder key contains a space. T58105
- [2014; LQT] Fatal error when attempting to view certain threads. T61791
- [2015; MassMessage] Warning about Invalid message parameters. T93110
- [2015; Wikibase] Warning “UnresolvedRedirectException” for some pages on Wikidata (and Commons). T93273
Thank you to everyone who has helped by reporting, investigating, or resolving problems in Wikimedia production, including @A2093064, @Anomie, @Daimona, @Gilles, @He7d3r, @Jdforrester-WMF, @matmarex, @mmodell, @Nikerabbit, @Catrope, @Tchanders, @Tgr, and @thiemowmde.
Until next time,
— Timo Tijhof
👢There's a snake in my boot. Reach for the sky!
 Incidents. — wikitech.wikimedia.org/wiki/Special:AllPages…
 Tasks closed. — phabricator.wikimedia.org/maniphest/query…
 Tasks created. — phabricator.wikimedia.org/maniphest/query…