
Ensure flood of hard-deprecations are caught during (train) deployments
Closed, Resolved · Public

Description

This task is to re-evaluate after T252906 and T251278.

Problem

Deprecation warnings from MediaWiki are currently not seen by Scap's canary checker, Scap's local prepromote checks, or other regular production monitoring (e.g. Icinga MW-error alerts).

This makes it relatively easy for them to slip into production unnoticed until group2, where they could saturate Logstash capacity and only then trigger alerts, which is far too late. And if they are only triggered by a subset of requests, they would likely not bring down Logstash until several such regressions have accumulated.

Good stuff:

  • We already catch deprecation warnings in our CI pipeline if they are triggered by PHP code covered by tests of any kind (PHPUnit, Selenium etc.). Our CI fails the build for deprecation warnings the same way it would for any other PHP notice, warning, error, or fatal exception.
  • The Logstash channel:deprecation channel is nearly always empty. This means we start from a clean slate with a very high signal-to-noise ratio: hard deprecations in production are extremely likely to be regressions, and are generally relatively easy to mitigate.

Objective

Hard deprecations emitted in production should be noticed by:

  • Scap's prepromote check, which asserts the PHP stderr from local mwscript invocation to be empty.
  • Scap's canary checker, which currently monitors the Logstash query for type:mediawiki channel:(exception OR error).
  • The mediawiki-new-errors Kibana dashboard.
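The "stderr must be empty" assertion behind the prepromote check can be sketched roughly as follows. This is a minimal illustration in Python, not Scap's actual implementation: the helper name `stderr_is_empty` and the two stand-in child commands are hypothetical, and a real check would invoke a maintenance script rather than a Python one-liner.

```python
import subprocess
import sys
from typing import Sequence


def stderr_is_empty(command: Sequence[str], timeout: float = 30.0) -> bool:
    """Run `command` and report whether it wrote nothing to stderr.

    Hypothetical helper: it only models the idea that any warning printed
    to stderr (including a deprecation warning) should fail the check.
    """
    result = subprocess.run(
        command, capture_output=True, text=True, timeout=timeout
    )
    return result.stderr.strip() == ""


# Stand-in child processes (assumptions for illustration only).
quiet = [sys.executable, "-c", "print('ok')"]
noisy = [
    sys.executable,
    "-c",
    "import sys; print('Deprecated: Use of foo()', file=sys.stderr)",
]

if __name__ == "__main__":
    print("quiet passes:", stderr_is_empty(quiet))
    print("noisy passes:", stderr_is_empty(noisy))
```

With a check of this shape, a deprecation only fails the prepromote step if it actually reaches stderr, which is why the choice of emission mechanism matters.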

Proposal

We can either update each of these queries, or do what we already do in PHPUnit and what PHP natively does for its own deprecations, which is to emit them as PHP warnings. They would then naturally end up in channel:error and in the CLI stderr, like messages logged by wfLogWarning and messages from PHP itself (e.g. "PHP Notice: Undefined variable").

I suggest the latter.
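To illustrate why the second option needs no query changes, here is a small Python model of the canary query type:mediawiki channel:(exception OR error). It is only a sketch: `LogRecord`, `matches_canary_query`, and `route_deprecation` are hypothetical names, and the real routing happens in MediaWiki's PHP logging, not in code like this.

```python
from dataclasses import dataclass

# Channels covered by the existing query: channel:(exception OR error)
MONITORED_CHANNELS = frozenset({"exception", "error"})


@dataclass(frozen=True)
class LogRecord:
    type: str
    channel: str
    message: str


def matches_canary_query(record: LogRecord) -> bool:
    """Model of: type:mediawiki channel:(exception OR error)."""
    return record.type == "mediawiki" and record.channel in MONITORED_CHANNELS


def route_deprecation(message: str, as_native_warning: bool) -> LogRecord:
    """Hypothetical router for a hard-deprecation message.

    as_native_warning=False models the custom deprecation channel;
    as_native_warning=True models emitting it like any other PHP warning,
    which ends up in the error channel.
    """
    channel = "error" if as_native_warning else "deprecation"
    return LogRecord(type="mediawiki", channel=channel, message=message)


if __name__ == "__main__":
    before = route_deprecation("Use of foo() is deprecated", as_native_warning=False)
    after = route_deprecation("Use of foo() is deprecated", as_native_warning=True)
    print("custom channel seen by canary:", matches_canary_query(before))
    print("native warning seen by canary:", matches_canary_query(after))
```

In this model the existing query stays untouched; only the channel a deprecation lands in changes.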

Event Timeline

Krinkle created this task. May 15 2020, 10:47 PM
Restricted Application added a subscriber: Aklapper. May 15 2020, 10:47 PM
Krinkle updated the task description.
Krinkle added a subscriber: DannyS712.
Jdlrobson moved this task from Incoming to Tracking on the Readers-Web-Backlog board.
BPirkle triaged this task as Medium priority. May 19 2020, 8:30 PM

Change 599148 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] debug: Use native E_USER_DEPRECATED instead of custom channel

https://gerrit.wikimedia.org/r/599148

Krinkle claimed this task. Edited May 28 2020, 2:15 AM

Decided to kickstart this as I had some spare time.

Will sign back over for code review once the patch is in better shape.

Krinkle moved this task from Inbox to Blocked or Needs-CR on the Performance-Team board.
Krinkle updated the task description. Jul 14 2020, 7:34 PM

Change 599148 merged by jenkins-bot:
[mediawiki/core@master] debug: Use native E_USER_DEPRECATED instead of custom channel

https://gerrit.wikimedia.org/r/599148

Krinkle closed this task as Resolved. Jul 20 2020, 6:57 PM