Page MenuHomePhabricator

Log DonationInterface fatal errors to syslog
Closed, ResolvedPublic2 Estimated Story Points

Description

This is a followup to T107827. Now that we know we're missing logs, find a way to get these messages to log to syslog and appear in payments.error. This probably needs to be done in MediaWiki core, either configuration or a patch to add configuration.

mw-core is catching our fatal error and logging it to the wfDebug stream. Not cool.

Outcome

  • Add config block to the beginning of LocalSettings.php, before any require statements.
  • Require monolog from composer, being careful to match the version in DI.
$defaultProcessors = array(
    'wiki' => array(
        'class' => 'MediaWiki\Logger\Monolog\WikiProcessor',
    ),
    'psr' => array(
        'class' => 'Monolog\Processor\PsrLogMessageProcessor',
    ),
    'pid' => array(
        'class' => 'Monolog\Processor\ProcessIdProcessor',
    ),
    'uid' => array(
        'class' => 'Monolog\Processor\UidProcessor',
    ),
    'web' => array(
        'class' => 'Monolog\Processor\WebProcessor',
    ),
);
$syslogLogger = array(
    'handlers' => array( 'syslog' ),
    'processors' => array_keys( $defaultProcessors ),
);
$wgMWLoggerDefaultSpi = array(
    'class' => 'MediaWiki\Logger\MonologSpi',
    'args' => array( array(
        'formatters' => array(
            'line' => array(
                'class' => 'Monolog\Formatter\LineFormatter',
                'args' => array( '%message%' ),
            ),
        ),
        'handlers' => array(
            'syslog' => array(
                'class' => 'MediaWiki\Logger\Monolog\SyslogHandler',
                'args' => array( 'mediawiki', 'localhost', 514, LOG_USER,
                    // Although we only publish messages from the error streams,
                    // note that anything published by wfDebugLog is at the
                    // info level, thus the low bar.
                    Monolog\Logger::INFO,
                ),
                'formatter' => 'line',
            ),
            'blackhole' => array(
                'class' => 'Monolog\Handler\NullHandler',
            ),
        ),
        'loggers' => array(
            'exception' => $syslogLogger,
            'fatal' => $syslogLogger,

            // Throw out anything else.  Payments logging is already its
            // own thing, so this only includes MediaWiki logs, below error
            // level.
            '@default' => array( 'handlers' => array( 'blackhole' ) ),
        ),
        'processors' => $defaultProcessors,
    ), ),
);

Event Timeline

awight raised the priority of this task from to High.
awight updated the task description. (Show Details)
awight added subscribers: Jgreen, awight, Aklapper.

According to https://www.mediawiki.org/wiki/Manual:How_to_debug#Creating_custom_log_groups

we should add something like the following lines to the MediaWiki config file:

$wgDebugLogGroups = array(
	'exception' => '/var/log/mw/error.log',
	'error' => '/var/log/mw/error.log',
);

@Jgreen: can you confirm that this is useful? Will it be easy to sync this file into the shared directory?

Trying to rephrase my earlier comment to say something not insane. I'd like for us to have output just like fluorine:/a/mw-log/exception.log and fatal.log, and for that to be filemoved along with our other payments* logfiles.

Right now they're either syslogged-and-centralized (apache error logs, various php logs) or they log only locally, and are subject to local nightly rotation/expiration.

I think the best fix would be to figure out a way to get mediawiki to log via syslog directly or via apache/php logging, that would allow us to use existing realtime log centralization infrastructure.

Looks like this will be possible: https://www.mediawiki.org/wiki/Manual:MWLoggerMonologSpi

I'll try to point info and above at syslog.

WIP, you don't wanna ask... Crashes at the moment.

Removed WIP code, see task description for final version

Note: require monolog from composer.json

Side thing: on payments1004, we should find a way to echo orphan slayer logging to stdout.

awight renamed this task from Make DonationInterface fatal errors accessible to Log DonationInterface fatal errors to syslog.Aug 25 2015, 7:39 PM

@Jgreen
I just realized we'll need a new syslog bucketing rule for these fatal messages, but I don't see a simple pattern we can use to grep them. Here's an example from my local box,

Oct 16 09:25:04 hostname mediawiki: [d06a3d65] /index.php/Main_Page Exception from line 595 of /srv/mediawiki-core/includes/MediaWiki.php: FOO

Ordinarily we'd just use the "mediawiki:" tag and severity (which doesn't
show up in the log line, but is available for filtering).

Ya know... the config gives us control over the logline formatting. I can add whatever we need. Do you have a preferred log format?

How about something this, where 123 is the PID:

Oct 16 09:25:04 payments1001 mediawiki[123]: {arbitrary message text}

@Jgreen
Any clue why we're suddenly getting unrelated syslog spew in payments-misc as soon as I deployed this config change? I don't see how the PHP settings would have affected non-PHP apps.

e.g.

Oct 16 19:56:08 payments1004 /usr/sbin/gmond[6431]: Unable to find the metric information for 'mysql_innodb_ibuf_merged'. Possible that the module has not been loaded.#012
Oct 16 19:55:49 payments1004 puppet-agent[5236]: (/Stage[main]/Ganglia::Modules::Mysql/File[/etc/ganglia/conf.d/mysql.pyconf]/owner) owner changed 'ganglia' to 'root'

It comes from all 4 payments boxes. Freakish.

I rolled out a ganglia module to collect redis stats on Friday, and I
~think~ that's just normal logspew from puppet and the service restart.
Unless it keeps spewing . . . which would mean I broke ganglia. Either way
harmless, will check on it on Monday.

I rolled out a ganglia module to collect redis stats on Friday, and I
~think~ that's just normal logspew from puppet and the service restart.
Unless it keeps spewing . . . which would mean I broke ganglia. Either way
harmless, will check on it on Monday.

Confirmed ongoing logspew, wrangled horrible python code, tamed the logspew.

This task was in the done column in Venga Boys. Is it still in progress?

Yeah, this is actually blocked by T115746--until that subtask is deployed, we have nothing.

To verify: run the log tickler for old times' sake, then undeploy it.