Make logging work for mediawiki in k8s
Open, Medium, Public

Description

Placeholder task since there doesn't appear to be one yet.

Stuff that comes to mind:

  • MediaWiki PSR/Monolog logs go to Logstash under type:mediawiki. This is for exceptions, fatals, and various diagnostic channels.
  • MediaWiki logs go to mwlog1001 files. (Test plan: XWD api.php queries go only to api.log; fatal-error.php hits go to both Logstash and mwlog.)
  • php-wmerrors fatal errors go from /etc/php/php7-fatal-error.php to Logstash under type:mediawiki channel:exception caught_by:php-wmerrors.
  • php-fpm stderr goes to Logstash under type:syslog program:php7.2-fpm.
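
To make the expected Logstash field filters easier to check, here is a rough Python sketch of them as data. The field names and values are copied from the list above; the labels used as dictionary keys and the matching_streams helper are made up for illustration and do not exist anywhere in production.

```python
# Field filters copied from the list above. The labels (dict keys) are
# hypothetical names chosen for this sketch only.
EXPECTED_STREAMS = {
    "mediawiki-monolog": {"type": "mediawiki"},
    "php-wmerrors-fatal": {
        "type": "mediawiki",
        "channel": "exception",
        "caught_by": "php-wmerrors",
    },
    "php-fpm-stderr": {"type": "syslog", "program": "php7.2-fpm"},
}


def matching_streams(record):
    """Return the sorted labels whose every field filter matches the record."""
    return sorted(
        label
        for label, wanted in EXPECTED_STREAMS.items()
        if all(record.get(key) == value for key, value in wanted.items())
    )
```

A php-wmerrors fatal is expected to match both the generic type:mediawiki filter and the more specific wmerrors one, which is why the helper returns a list rather than a single label.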

Event Timeline

fgiunchedi triaged this task as Medium priority. Mon, Aug 30, 8:02 AM

Using fatal-error.php, I determined that logging to mwlog1001 currently works, but we don't seem to be able to log to Logstash. I strongly suspect this is due to missing egress rules.

Other than that:

  • We need to add /etc/php/php7-fatal-error.php to the mediawiki image
  • php stderr currently gets displayed in the response to the request, which is doubly wrong.

@Joe @Krinkle What's the reason php7-fatal-error.php is in /etc/php (via operations/puppet) and not in operations/mediawiki-config ?

@dancy TLDR: It could probably be moved. I'll ramble a bit about what I currently understand, some of which you know already, and these may or may not be good reasons for the status quo.

  • The file is logically executed outside MediaWiki context, referenced from the C code in the native php-wmerrors extension for PHP. It is not invoked or referenced anywhere by "us" at runtime, and may not refer to anything from MediaWiki, multiversion, or wmf-config.
  • The php-wmerrors file is only for "really bad" errors. The vast majority of errors are sent to syslog by MediaWiki/Monolog, which then go to rsyslog/kafka/logstash. The php-wmerrors file mimics Monolog's syslog messages for edge cases where PHP is unable to let the application report the error, and instead falls back to php-wmerrors.
  • I think the idea is also that the script should be standalone but yet discover certain settings and services to send information to, which are perhaps more reliable to inject at "build time" through ERB with puppet.
  • It also touches on the idea that it might be considered an anti-pattern for manifests to provision a server with software and settings that refer to files that aren't ensured by that same manifest. Having said that, we do kind of do this already for Apache, which refers to /w/index.php and /w/robots.php, and those seem quite natural/required/unavoidable. I suppose the PHP extension feels more standalone and abstractable. It can be removed without affecting MW in any way, and is perhaps emotionally closer to how we provision Envoy, Mcrouter, and other software external to MW.

Having said that, we technically could move it to wmf-config, and could find an alternative way to discover Statsd. Possibly through an environment variable. Another way might be to consume the *Services.php files, but that would imho muddy the waters and be potentially more risky. php-wmerrors' primary purpose is to be the last part standing in the face of severe errors that couldn't be handled at any other layer, so it "working" in the common case would not be important because in the common case it isn't actually invoked.
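
For what it's worth, the environment-variable idea could look roughly like this. It is a Python sketch rather than PHP, just to show the shape. The variable name STATSD_ADDRESS, the localhost fallback, and both function names are assumptions for illustration; only port 8125 (the conventional StatsD port) and the "name:1|c" counter format are standard.

```python
import os
import socket

# Hypothetical variable name and fallback; neither is taken from this task.
STATSD_ENV_VAR = "STATSD_ADDRESS"
DEFAULT_STATSD = ("localhost", 8125)


def statsd_address(environ=os.environ):
    """Return (host, port) parsed from "host:port" in the environment.

    Falls back to DEFAULT_STATSD when the variable is unset or malformed.
    """
    raw = environ.get(STATSD_ENV_VAR, "")
    host, sep, port = raw.rpartition(":")
    if not sep or not host or not port.isdigit():
        return DEFAULT_STATSD
    return host, int(port)


def send_counter(metric, address):
    """Fire-and-forget a StatsD counter increment over UDP."""
    payload = f"{metric}:1|c".encode("ascii")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.sendto(payload, address)
        except OSError:
            # A last-resort error handler should not fail because metrics did.
            pass
```

The fallback and the swallowed OSError reflect the point above: the script is the last part standing, so a missing or broken metrics setting should degrade quietly instead of raising.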

The only real reason why we've used puppet there was to inject the statsd address easily IIRC.

@Krinkle we do already include a "params" file, I think we can just keep including it in a special directory in mediawiki-config.

Change 721333 had a related patch set uploaded (by Giuseppe Lavagetto; author: Giuseppe Lavagetto):

[operations/docker-images/production-images@master] Add configuration for wmerrors to php-multiversion-base

https://gerrit.wikimedia.org/r/721333

Change 721341 had a related patch set uploaded (by Giuseppe Lavagetto; author: Giuseppe Lavagetto):

[operations/deployment-charts@master] mediawiki: allow injecting the wmerrors script

https://gerrit.wikimedia.org/r/721341

Change 721342 had a related patch set uploaded (by Giuseppe Lavagetto; author: Giuseppe Lavagetto):

[operations/puppet@production] mediawiki::web::yaml_defs: inject php7-0fatal-error.php in k8s

https://gerrit.wikimedia.org/r/721342

The set of patches above should allow us to get wmerrors working; we can work on moving php7-fatal-error.php to mediawiki-config separately.

Change 721333 merged by Giuseppe Lavagetto:

[operations/docker-images/production-images@master] Add configuration for wmerrors to php-multiversion-base

https://gerrit.wikimedia.org/r/721333

Coming to Logstash: right now on bare metal we relay the logs to rsyslogd, talking to it via TCP on localhost. This is not possible on kubernetes, unless we install a sidecar that acts as a syslog relay and forwards the logs to the physical node's rsyslogd.
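
To give an idea of what such a sidecar relay does, here is a minimal Python sketch. It assumes the transport is UDP datagrams (as a later comment on this task points out) and simply forwards each datagram unchanged. The function name and the max_messages parameter are invented for this sketch; a real sidecar would more likely be rsyslog itself than custom code.

```python
import socket


def relay_datagrams(listen_sock, upstream_addr, max_messages=None):
    """Forward UDP datagrams from listen_sock to upstream_addr unchanged.

    Returns the number of datagrams forwarded. With max_messages=None the
    loop never ends, which is what a long-running sidecar would want.
    """
    forwarded = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as out:
        while max_messages is None or forwarded < max_messages:
            # 65535 bytes is enough for any single UDP datagram.
            data, _peer = listen_sock.recvfrom(65535)
            out.sendto(data, upstream_addr)
            forwarded += 1
    return forwarded
```

In the pod, listen_sock would be bound to a localhost port that the application logs to, and upstream_addr would be wherever the node's rsyslogd is reachable; both addresses are deployment details this sketch does not know.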

An alternative worth exploring is making Monolog log to stderr via something like error_log, but that needs some investigation. From IRC: we have currently customised our Syslog handler quite a bit, so we would need to adapt the Monolog StreamHandler accordingly.

Joe updated the task description.

There is also Monolog's ErrorLogHandler, which might be more idiomatic, but we'll have to see whether one has notable benefits over the other in terms of overhead, reliability, or feature-compatibility.
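
As an illustration of the log-to-stderr approach, here is a small sketch using Python's standard logging module. This is an analogy only: logging.StreamHandler here is Python's class, not Monolog's StreamHandler, and the JSON field layout (beyond type:mediawiki and a channel field, which come from the task description) is an assumption, not what our Syslog handler emits today.

```python
import json
import logging
import sys


class JsonLineFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record):
        return json.dumps({
            "type": "mediawiki",
            "channel": record.name,
            "level": record.levelname,
            "message": record.getMessage(),
        })


def build_stderr_logger(channel, stream=None):
    """Return a logger that writes JSON lines to stream (stderr by default)."""
    handler = logging.StreamHandler(stream if stream is not None else sys.stderr)
    handler.setFormatter(JsonLineFormatter())
    logger = logging.getLogger(channel)
    logger.handlers = [handler]  # replace handlers so repeated calls don't duplicate output
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    return logger
```

One JSON object per line keeps each event self-delimiting, so whatever collects the container's stderr can parse events without knowing anything about syslog framing.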

"via TCP on localhost."

UDP, not TCP (I am just being pedantic, I know).

Change 721341 merged by Giuseppe Lavagetto:

[operations/deployment-charts@master] mediawiki: allow injecting the wmerrors script

https://gerrit.wikimedia.org/r/721341

Change 721342 merged by Giuseppe Lavagetto:

[operations/puppet@production] mediawiki::web::yaml_defs: inject php7-fatal-error.php in k8s

https://gerrit.wikimedia.org/r/721342