Page MenuHomePhabricator

Format Monolog messages to be easier to grep, bucket them
Closed, ResolvedPublic1 Story Points

Description

Bug is resolved once MediaWiki errors are bucketed into a new logfile and replicated with the others.

Include some more information in the logline, like @Jgreen suggests in T107918:

Oct 16 09:25:04 payments1001 mediawiki[123]: {arbitrary message text}

Implementation

Set logging configuration in LocalSettings.php:

$wgMWLoggerDefaultSpi = array(
  'class' => 'MediaWiki\Logger\MonologSpi',
  'args' => array( array(
  'formatters' => array(
      'line' => array(
      'class' => 'Monolog\Formatter\LineFormatter',
-     'args' => array( '%message%' ),
+     'args' => array( 'mediawiki[%extra.process_id%]: %message%' ),
      ),
  ),

Details

Related Gerrit Patches:
mediawiki/extensions/DonationInterface : deployment[TEMPORARY] Instrument to tickle error streams
mediawiki/extensions/DonationInterface : master[TEMPORARY] Instrument to tickle error streams

Event Timeline

awight created this task.Oct 16 2015, 6:32 PM
awight renamed this task from Format Monolog messages to be easier to grep to Format Monolog messages to be easier to grep, bucket them.
awight claimed this task.
awight raised the priority of this task from to High.
awight updated the task description. (Show Details)
awight set Security to None.
awight edited a custom field.
awight added subscribers: atgo, Aklapper, awight, Jgreen.

I think this is the one, gotta test it though:

'formatters' => array(
    'line' => array(
        'class' => 'Monolog\Formatter\LineFormatter',
        'args' => array( 'mediawiki[%extra.process_id%] %message%' ),
    ),
),

That was the one. Well, plus a missing colon. The stable part of these messages is "mediawiki[PID]: ", is that enuf to bucket on?

On my local, syslog reads like:

Nov 19 12:06:45 local mediawiki: mediawiki[1054]: [15405221] /index.php/Main_Page Exception from line 180 of /X/payments/includes/MediaWiki.php: BAD FOO #0 /X/payments/includes/MediaWiki.php(567): MediaWiki->performRequest() #1 /X/payments/includes/MediaWiki.php(415): MediaWiki->main() #2 /X/payments/index.php(41): MediaWiki->run() #3 {main}

awight updated the task description. (Show Details)Nov 19 2015, 8:14 PM
awight moved this task from Backlog to Review on the Fundraising Sprint X-Ray Spex board.
awight moved this task from Review to Done on the Fundraising Sprint X-Ray Spex board.

@Jgreen
The new message format is deployed. These errors are very rare AFAIK, and I was not able to generate one without patching the code. Let me know if you think that's necessary.

awight updated the task description. (Show Details)Nov 19 2015, 8:39 PM
awight reassigned this task from awight to Jgreen.Nov 19 2015, 11:22 PM

Reassigning to JGreen to indicate that I'm too lazy to make a subtask.

@Jgreen
The new message format is deployed. These errors are very rare AFAIK, and I was not able to generate one without patching the code. Let me know if you think that's necessary.

I'm unable to find a single instance in logs since 11/10, either central or locally logged (I checked only payments1001). You sure this thing is on? Matching $programname on mediawiki should be no problem, but I'm not sure where to apply the filter since I'm not seeing any logs.

Okay, assigning back to myself--This is probably because no fatal errors have happened. I'll have to deploy a bug and tickle it.

awight claimed this task.Nov 23 2015, 7:55 PM
awight moved this task from Done to Doing on the Fundraising Sprint X-Ray Spex board.

Change 255303 had a related patch set uploaded (by Awight):
[TEMPORARY] Instrument to tickle error streams

https://gerrit.wikimedia.org/r/255303

awight added a comment.EditedNov 25 2015, 12:49 AM

@Jgreen
The "tickle" patch will allow us to cause loglines to go into the new fatal error stream. Only these two messages should be bucketed:

Nov 24 16:40:39 X mediawiki: mediawiki[19997]: [510dd5a1] [no req] Exception from line 26 of /X/payments/extensions/DonationInterface/maintenance/TestCrash.php: CRASHTEST: uncaught exception #0 /X/payments/maintenance/doMaintenance.php(103): TestCrash->execute() #1 /X/payments/extensions/DonationInterface/maintenance/TestCrash.php(38): include() #2 {main}

and

Nov 24 16:41:44 X mediawiki: mediawiki[20990]: [dfe2919d] [no req] ErrorException from line 264 of /X/payments/includes/exception/MWExceptionHandler.php: Fatal Error: Class undefined: FOO MediaWiki or an installed extension requires this class but it is not embedded directly in MediaWiki's git repository and must be installed separately by the end user. Please see <a href="https://www.mediawiki.org/wiki/Download_from_Git#Fetch_external_libraries">mediawiki.org</a> for help on installing the required components. #0 (): MWExceptionHandler::handleFatalError() #1 {main}

However, we should *not* see the following message in that bucket, it should appear in the normal payments.error stream:

CRASHTEST: error

I haven't deployed the code yet, just pasting those loglines while I had them available.

Change 255307 had a related patch set uploaded (by Awight):
[TEMPORARY] Instrument to tickle error streams

https://gerrit.wikimedia.org/r/255307

Change 255307 merged by jenkins-bot:
[TEMPORARY] Instrument to tickle error streams

https://gerrit.wikimedia.org/r/255307

Change 255303 merged by jenkins-bot:
[TEMPORARY] Instrument to tickle error streams

https://gerrit.wikimedia.org/r/255303

@Jgreen
I've tickled the logs with the tool I described above, timestamps are between Nov 26 01:30 to 01:40 UTC. I don't see anything anywhere yet... I'll leave the code deployed so we can generate more logs if needed...

@awight this looks like it should be closed?

@atgo We're still working on it...

The only difference I see between my configuration and production is pretty substantial: I'm using PHP 5.6 and HHVM, but production uses PHP 5.3.

Perhaps we can just bucket the line that PHP emits, by matching:

php: PHP Fatal error:

bd808 confirmed that PHP 5.3 is not able to catch these fatal errors, so matching PHP Fatal error is our only hope. Wheee!

Jgreen claimed this task.Dec 2 2015, 6:53 PM
Jgreen added a comment.Dec 3 2015, 1:46 PM

Perhaps we can just bucket the line that PHP emits, by matching:

php: PHP Fatal error:

Yup we can do that. I see 80 or so of those in the payment-listener-misc log. What would you like for the new bucket file name?

@Jgreen maybe payments.fatal?

Looks like we could use a bucket for "PHP Warning:" as well...

Jgreen added a comment.Dec 5 2015, 3:39 PM

directed into file payments.warning?

awight added a comment.Dec 5 2015, 3:50 PM

Hehe, yeah that sounds perfect. Sorry that I was too sad about that naming
convention to draw the final dotted line. It bothers me that we use both
dot and hyphen, but I'm sure there was a good reason...

Jgreen added a comment.Dec 5 2015, 4:00 PM

Hah well that's easily remedied if we care enough!

Jgreen closed this task as Resolved.Dec 7 2015, 7:20 PM

done!

mmodell removed a subscriber: awight.Jun 22 2017, 9:52 PM