
ship logs (apache, php-fpm) from doc machines to logstash
Closed, Resolved, Public

Description

This came up on IRC, in the context of T322357 and upgrading doc1002 from PHP 7.3 to PHP 7.4+. Krinkle wanted to check for errors in the php-fpm logs on the doc machines but did not have shell access to do so.

That made me think we also want to ship those to Logstash, just like we have tickets to do that for other services that don't have it yet.

Semi-relatedly there is a question about the log levels, because currently there is:

php.ini
error_reporting = E_ALL & ~E_DEPRECATED & ~E_STRICT

But in this case we do NOT want to exclude DEPRECATED, even though most of the time we do want to exclude it.

That is a little unrelated though. This ticket should focus on shipping the logs we want by default to logstash so users without shell can look at them.
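For reference, here is a rough sketch of what the error_reporting mask quoted above works out to numerically. It assumes the constant values documented for PHP 7.x (E_ALL = 32767, E_DEPRECATED = 8192, E_STRICT = 2048). The second expression only illustrates a mask that keeps deprecations; it is not necessarily the value used in any patch.

```shell
#!/bin/sh
# Assumed PHP 7.x error-level constants (check the PHP manual for your version).
E_ALL=32767
E_DEPRECATED=8192
E_STRICT=2048

# Current php.ini setting: E_ALL & ~E_DEPRECATED & ~E_STRICT
current=$(( $E_ALL & ~$E_DEPRECATED & ~$E_STRICT ))

# Illustrative alternative that no longer masks out deprecations: E_ALL & ~E_STRICT
with_deprecated=$(( $E_ALL & ~$E_STRICT ))

echo "current=$current with_deprecated=$with_deprecated"
# prints: current=22527 with_deprecated=30719
```

The two values differ by exactly 8192, the E_DEPRECATED bit, which is the only thing that would change for the doc machines.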

Event Timeline

Krinkle updated the task description.

Here is an example of how this was done for a "misc apache" on the miscweb machines:

  1. Make sure the Apache logs are sent to the local syslog (via rsyslog). Example change: https://gerrit.wikimedia.org/r/c/operations/puppet/+/848547
  2. Configure them to be forwarded from there to Kafka. Example change: https://gerrit.wikimedia.org/r/c/operations/puppet/+/849169

LSobanski triaged this task as Medium priority. Jan 9 2023, 4:28 PM
LSobanski moved this task from Incoming to Backlog on the collaboration-services board.

Change 900369 had a related patch set uploaded (by EoghanGaffney; author: EoghanGaffney):

[operations/puppet@production] Allow E_DEPRECATED logs to be shown on php-fpm in doc machines

https://gerrit.wikimedia.org/r/900369

Change 900375 had a related patch set uploaded (by EoghanGaffney; author: EoghanGaffney):

[operations/puppet@production] Adds php and apache logs for doc machines

https://gerrit.wikimedia.org/r/900375

Change 900410 had a related patch set uploaded (by EoghanGaffney; author: EoghanGaffney):

[operations/puppet@production] Add doc host apache/php-fpm logs to kafka

https://gerrit.wikimedia.org/r/900410

We should probably have "ship to logstash" tickets for any service in our service matrix doc that isn't confirmed to have it yet.

Change 900369 merged by EoghanGaffney:

[operations/puppet@production] Allow E_DEPRECATED logs to be shown on php-fpm in doc machines

https://gerrit.wikimedia.org/r/900369

Change 903239 had a related patch set uploaded (by EoghanGaffney; author: EoghanGaffney):

[operations/puppet@production] Set log format to ecs on doc hosts

https://gerrit.wikimedia.org/r/903239

Change 903239 merged by EoghanGaffney:

[operations/puppet@production] Set log format to ecs on doc hosts

https://gerrit.wikimedia.org/r/903239

Change 900375 merged by EoghanGaffney:

[operations/puppet@production] Adds php and apache logs for doc machines

https://gerrit.wikimedia.org/r/900375

Change 900410 merged by EoghanGaffney:

[operations/puppet@production] Add doc host apache/php-fpm logs to kafka

https://gerrit.wikimedia.org/r/900410

These logs are now formatted as ECS (structured) logs and are being ingested into logstash.

The access log can be found in the Apache ECS access log dashboard by filtering on url.domain: doc.wikimedia.org, and I have confirmed with @eoghan that accesses do reach logstash.

Note: doc.wikimedia.org is behind ATS/Varnish, so only uncached requests reach Apache (i.e. most requests are served by the front-end cache and never reach Apache). To inspect those, one would have to check the webaccess data set on Turnilo (which is private).

For the fpm errors, I am assuming they end up in syslog and would show up on the syslog dashboard. I don't have access to the raw files but they are rather quiet:

$ ls -lrta /var/log/php*fpm*
-rw------- 1 root root  73 Jan  1 00:00 /var/log/php7.3-fpm.log.12.gz
-rw------- 1 root root 250 Jan 12 10:41 /var/log/php7.3-fpm.log.11.gz
-rw------- 1 root root  73 Jan 15 00:00 /var/log/php7.3-fpm.log.10.gz
-rw------- 1 root root  73 Jan 22 00:00 /var/log/php7.3-fpm.log.9.gz
-rw------- 1 root root  73 Jan 29 00:00 /var/log/php7.3-fpm.log.8.gz
-rw------- 1 root root  73 Feb  5 00:00 /var/log/php7.3-fpm.log.7.gz
-rw------- 1 root root  73 Feb 12 00:00 /var/log/php7.3-fpm.log.6.gz
-rw------- 1 root root 219 Feb 22 16:23 /var/log/php7.3-fpm.log.5.gz
-rw------- 1 root root 225 Feb 28 20:50 /var/log/php7.3-fpm.log.4.gz
-rw------- 1 root root 223 Mar  9 15:45 /var/log/php7.3-fpm.log.3.gz
-rw------- 1 root root 190 Mar 15 08:40 /var/log/php7.3-fpm.log.2.gz
-rw------- 1 root root 339 Mar 25 07:54 /var/log/php7.3-fpm.log.1
-rw------- 1 root root  56 Mar 26 00:00 /var/log/php7.3-fpm.log

The most common warning on doc1002 in Logstash appears to be this one:

PHP message: PHP Warning:  array_key_exists(): The first argument should be either a string or an integer in /srv/doc/mediawiki-core/master/php/search_opensearch.php on line 99

This is actually not a deprecation warning but a long-standing bug in Doxygen's OpenSearch implementation. It isn't visible in the user interface by default. But if you add https://doc.wikimedia.org/mediawiki-core/master/php/ as a search engine in your browser and search for something, the autocomplete suggestions always say "1 result", even if there are multiple. Note that Doxygen does actually return all relevant suggestions; it's just the label alongside them that incorrectly summarises the count as "1 result".

I've submitted a fix upstream to https://github.com/doxygen/doxygen/issues/10017.
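When triaging warnings like the one quoted above, it can help to pull the file and line number out of the message so that repeats can be grouped. A minimal sketch, assuming the message always ends in "in <path> on line <N>" as it does in that example. The variable names are made up for illustration.

```shell
#!/bin/sh
# The warning exactly as quoted from Logstash above.
msg='PHP message: PHP Warning:  array_key_exists(): The first argument should be either a string or an integer in /srv/doc/mediawiki-core/master/php/search_opensearch.php on line 99'

# Extract the absolute path that follows the last " in " and precedes " on line N".
file=$(printf '%s\n' "$msg" | sed -n 's,.* in \(/[^ ]*\) on line [0-9]*$,\1,p')

# Extract the trailing line number.
line=$(printf '%s\n' "$msg" | sed -n 's,.* on line \([0-9]*\)$,\1,p')

echo "$file:$line"
# prints: /srv/doc/mediawiki-core/master/php/search_opensearch.php:99
```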