Page MenuHomePhabricator

Ensure flood of hard-deprecations are caught during (train) deployments
Closed, ResolvedPublic

Description

This task is to re-evaluate after T252906, and T251278.

Problem

Deprecation warnings from MediaWiki are currently not seen by Scap canary-checker, Scap local prepromote checks, or other reguluar production monitoring (e.g. Icgina MW-error alerts).

This means it is relatively easy for these to slip into production unnoticed until group2 where they could overwhelm/saturate Logstash capacity and only trigger alerts there, which is way too late. Or if they are only triggered by a subset of requests, then it's quite likely it would not bring down Logstash until there are multiple such regressions accumulated.

Good stuff:

  • We already catch deprecation warnings in our CI pipeline if they are triggered by PHP code covered by tests of any kind (PHPUnit, Selenium etc.). Our CI fails the build for deprecation warnings the same way it would for any other PHP notice, warning, error, or fatal exception.
  • The Logstash channel:deprecation channel is nearly always empty. This means we have a clean slate and a signal-to-noise ratio at or approaching infinity considering hard-deprecations in production are extremely likely to be regressions and generally relatively easy to mitigate.
Objective

Hard deprecations emitted in production should be noticed by:

  • Scap's prepromote check, which asserts the PHP stderr from local mwscript invocation to be empty.
  • Scaps' canary checker, which currenly monitors the Logstash query for type:mediawiki channel:(exception OR error).
  • The mediawiki-new-errors Kibana dashboard.
Proposal

Basically we can either update a bunch of queries, or we can do what we already do in PHPUnit and what PHP natively already does for its own deprecations, which is to emit them as PHP Warnings. This means they naturally end up in channel:error and in the CLI stderr, similar to messages logged by wfLogWarning and messages from PHP natively (e.g. "PHP Notice: Undefined variable").

I suggest the latter.

Event Timeline

Change 599148 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] debug: Use native E_USER_DEPRECATED instead of custom channel

https://gerrit.wikimedia.org/r/599148

Decided to kickstart this as I had some spare time.

Will sign back over for code review once the patch is in better shape.

Change 599148 merged by jenkins-bot:
[mediawiki/core@master] debug: Use native E_USER_DEPRECATED instead of custom channel

https://gerrit.wikimedia.org/r/599148

@Krinkle This makes it very hard for translatewiki.net to monitor deprecation warnings, because we have so many (on every request) that it would flood the error logs and the relays. Ideally I would like to either log these to a different channel, or allow suppression of known, reported deprecations. Right now I can only do it manually by rising/lowering $wgDeprecationReleaseLimit (which we have forgotten at 1.30 for a long time) and monitoring the log volume.

I think $wgDeprecationReleaseLimit should still work the same as it has before. If stuff is now coming through in ways it didn't before, that's a bug. Would need more details to be of help but I'm up for fixing that if the case. In the interim, you might be able to use error_reporting to exclude E_USER_DEPRECATED, which should do the same thing, without needing to configure MW. That's one of the reasons I made the changes here that I did, so that we integrate and alllow use of these capabilities.

To log these to a separate system we do still the onLogException hook. From there one could do someting like if ( $e instanceof ErrorException && $e->getSeverity() === E_USER_DEPRECATED ) and then either directly write to the alternate file or service, or use LoggerFactory to send a message to an already configured channel, perhaps even named "deprecated".

Naturally from here one could also be more picky about what is logged, e.g. an match against a multi-line regex, or loop through an array strings to strpos check and discard.

Change 693027 had a related patch set uploaded (by Reedy; author: Krinkle):

[mediawiki/core@REL1_35] debug: Use native E_USER_DEPRECATED instead of custom channel

https://gerrit.wikimedia.org/r/693027

Change 693027 abandoned by Reedy:

[mediawiki/core@REL1_35] debug: Use native E_USER_DEPRECATED instead of custom channel

Reason:

https://gerrit.wikimedia.org/r/693027