Page MenuHomePhabricator

api.log does not indicate errors and exceptions
Closed, ResolvedPublic

Description

Looking at https://gerrit.wikimedia.org/r/#/c/240614 , I notice ApiMain.php does not pass isError or the exception to logRequest(). So you can't tell from api.log if we're getting billions of bad and failing API requests.

Here are some actual api.log lines:

2015-09-24 21:42:48 mw1205 mediawikiwiki api INFO: API GET SPage_(WMF) 104.220.68.135 T=216ms action=parse page=Main_Page
2015-09-24 21:43:07 mw1206 mediawikiwiki api INFO: API GET SPage_(WMF) 104.220.68.135 T=206ms action=parse page=Justin_Timberlake
2015-09-24 21:58:28 mw1229 mediawikiwiki api INFO: API GET SPage_(WMF) 104.220.68.135 T=177ms action=parse text=Hello contentmodel=wikiwacky

You can't tell the second one failed with error.code: "missingtitle" and the third with "unknown_contentmodel".

The error is reflected in the custom HTTP response header

MediaWiki-API-Error: missingtitle

but we don't log that.

The fix is to log the API response's error.code the same way that it's output in the MediaWiki-API-Error header, maybe by passing an $error parameter to logRequest().

Related Objects

Event Timeline

Spage raised the priority of this task from to Medium.
Spage updated the task description. (Show Details)

If

then maybe we don't need to add this to api.log. In gerrit 240614 anomie noted that api.log is more about capturing the request than the response.

@Spage if you think error codes should be logged please update T102079 so there is a single canonical description of what needs to be logged.

It looks like we will have to do some refactoring of ApiMain to expose the error code in a way that we can get it into the logging context. The best way to do this might be as a follow-on to T108618 where we will be creating structured logging for tracking each Action API request.

One other thing to keep in mind is that I'm hoping to make it possible for there to be multiple errors as part of T47843: Rework API error reporting.

It may be that this bug is mistitled and error information should go to the new hadoop channel instead of api.log.

@Spage if you think error codes should be logged please update T102079 so there is a single canonical description of what needs to be logged.

✓Done.

It looks like we will have to do some refactoring of ApiMain to expose the error code in a way that we can get it into the logging context.

For now just add an errorNexus parameter to logRequest() which will be "exception from ApiBeforeMain" or "exception from executeAction". We don't need perfection, just "there was an error in this API request".

Or we could add the MediaWiki-API-Error header to the webrequest table, which seems to reliably contain the error.code from the API request.

Change 271673 had a related patch set uploaded (by BryanDavis):
Add request error state to ApiBase::logRequest

https://gerrit.wikimedia.org/r/271673

Change 271673 merged by jenkins-bot:
Add request error state to ApiBase::logRequest

https://gerrit.wikimedia.org/r/271673

ApiMain::logRequest() now takes an optional exception argument and will extract error codes via the new protected errorMessageFromException() function. The error status and codes are passed to the debug logging channel that will be used in WMF production to feed Hadoop/Hive.