Page MenuHomePhabricator

ApiQueryLogEvents on enwiki can fatal with InvalidArgumentException due to unreadable Flow object
Closed, ResolvedPublic

Description

Error

Request ID: XCuxnApAEDUAAGEsPEgAAADP

message
{
    "error": {
        "code": "internal_api_error_InvalidArgumentException",
        "info": "[XCuxnApAEDUAAGEsPEgAAADP] Caught exception of type InvalidArgumentException",
        "errorclass": "InvalidArgumentException"
    },
    "servedby": "mw1288"
}
request
{
	"action": "query",
	"format": "json",
	"maxlag": "5",
	"list": "logevents",
	"leuser": "Reaper Eternal",
	"lelimit": "5000"
}
trace
2019-01-01 18:29:49 [XCuxnApAEDUAAGEsPEgAAADP] mw1288 enwiki 1.33.0-wmf.9 exception ERROR: [XCuxnApAEDUAAGEsPEgAAADP] /w/api.php   InvalidArgumentException from line 383 of /srv/mediawiki/php-1.33.0-wmf.9/includes/api/ApiResult.php: Cannot add object to ApiResult {"exception_id":"XCuxnApAEDUAAGEsPEgAAADP","exception_url":"/w/api.php","caught_by":"mwe_handler"} 
[Exception InvalidArgumentException] (/srv/mediawiki/php-1.33.0-wmf.9/includes/api/ApiResult.php:383) Cannot add object to ApiResult
  #0 /srv/mediawiki/php-1.33.0-wmf.9/includes/api/ApiResult.php(371): ApiResult::validateValue(object)
  #1 /srv/mediawiki/php-1.33.0-wmf.9/includes/api/ApiResult.php(371): ApiResult::validateValue(array)
  #2 /srv/mediawiki/php-1.33.0-wmf.9/includes/api/ApiResult.php(411): ApiResult::validateValue(array)
  #3 /srv/mediawiki/php-1.33.0-wmf.9/includes/api/ApiQueryLogEvents.php(249): ApiResult->addValue(array, NULL, array)
  #4 /srv/mediawiki/php-1.33.0-wmf.9/includes/api/ApiQuery.php(249): ApiQueryLogEvents->execute()
  #5 /srv/mediawiki/php-1.33.0-wmf.9/includes/api/ApiMain.php(1595): ApiQuery->execute()
  #6 /srv/mediawiki/php-1.33.0-wmf.9/includes/api/ApiMain.php(531): ApiMain->executeAction()
  #7 /srv/mediawiki/php-1.33.0-wmf.9/includes/api/ApiMain.php(502): ApiMain->executeActionWithErrorHandling()
  #8 /srv/mediawiki/php-1.33.0-wmf.9/api.php(87): ApiMain->execute()
  #9 /srv/mediawiki/w/api.php(3): include(string)
  #10 {main}

Impact

Notes

  • Query done from a bot account
  • Changing lelimit to 1000 does not give the same error.

Event Timeline

JJMC89 created this task.Jan 1 2019, 6:48 PM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptJan 1 2019, 6:48 PM
Anomie updated the task description. (Show Details)Jan 2 2019, 2:55 PM
Anomie moved this task from Unsorted to Non-core-API stuff on the MediaWiki-API board.
Anomie added a subscriber: Anomie.

The log entry causing the error seems to be https://en.wikipedia.org/wiki/Special:Log?logid=57466586. The specific cause seems to be that LogFormatter::newFromEntry( $logEntry )->formatParametersForApi() is returning an instance of Flow\Model\UUID as part of the log data. Normally that's ok as that class implements ApiSerializable, but Flow isn't actually installed on enwiki so the object is actually an instance of __PHP_Incomplete_Class.

What this probably means is that log entry cleanup for T188812: Uninstall Flow on all wikis where it has zero topics wasn't done or wasn't done completely.

Restricted Application added a project: Growth-Team. · View Herald TranscriptJan 2 2019, 3:20 PM
Krinkle renamed this task from Caught exception of type InvalidArgumentException to ApiQueryLogEvents on enwiki can fatal with InvalidArgumentException due to unreadable Flow object.Jan 2 2019, 9:21 PM
Catrope added a subscriber: Catrope.Jan 3 2019, 1:43 AM

I don't think we ever actually removed Flow log entries from enwiki, we just deleted the associated pages. Special:Log rendered the orphaned log entries OK, but I didn't realize at the time that the API wouldn't.

There aren't that many of them, 274 in total, so we could manually delete them:

mysql:research@s3-analytics-slave [enwiki]> select log_action, count(*) from logging where log_action like 'flow%' group by log_action;
+---------------------+----------+
| log_action          | count(*) |
+---------------------+----------+
| flow-close-topic    |       13 |
| flow-delete-post    |       66 |
| flow-delete-topic   |       87 |
| flow-lock-topic     |       31 |
| flow-restore-post   |       17 |
| flow-restore-topic  |       38 |
| flow-suppress-post  |       20 |
| flow-suppress-topic |        2 |
+---------------------+----------+
8 rows in set (3 min 51.20 sec)
Catrope moved this task from Inbox to Q2 2019-20 on the Growth-Team board.Jan 8 2019, 1:44 AM
Catrope moved this task from Q2 2019-20 to Needs Discussion/Analysis on the Growth-Team board.

Still seen on 1.33-wmf.21. Original test case simplified via the URL: https://en.wikipedia.org/w/api.php?action=query&format=json&maxlag=5&list=logevents&leuser=Reaper%20Eternal&lelimit=max&lestart=20150101121212

{"error":{"code":"internal_api_error_InvalidArgumentException","info":"[XJEzjQpAAC4AAGL-LS8AAAAK] Caught exception of type InvalidArgumentException","errorclass":"InvalidArgumentException"},"servedby":"mwdebug1002"}

Note that on PHP 7, it produces a different error message (but for the same underlying issue).

Fatal error:
is_callable(): The script tried to execute a method or access a property of an incomplete object. Please ensure that the class definition "Flow\Model\UUID" of the object you are trying to operate on was loaded _before_ unserialize() gets called or provide an autoloader to load the class definition

in /srv/mediawiki/php-1.33.0-wmf.21/includes/api/ApiResult.php on line 338
JTannerWMF edited projects, added Growth-Team (Current Sprint); removed Growth-Team.

This seems to work on Special:Log, because the params are not used to show the line, only type, user and comment

It is okay for the api to not output details when the log type is unknown? That would be the easy fix, because the api does not know how to format this data

Not initially obvious how to reproduce locally since Flow was patched 4 years ago to store alphanumerical (string) IDs instead of UUID objects in the logs. Hacking Flow/includes/Log/ModerationLogger.php:53 to store UUID does the trick.

We could patch includes/logging/LogFormatter.php:821 with something like

if ( $value instanceof __PHP_Incomplete_Class ) {
    continue;
}

but that's very specific to this situation and I'd like other people's opinion about whether this is desirable in Core. There's other places we could do something similar but the basic idea would be the same: [silently] ignore bad data.

We could detect that the log type is not known but there's already code that does that and judging from the class comments on LegacyLogFormatter, it assumes that these are for logs that have not been converted to the "new" format.

Change 500466 had a related patch set uploaded (by Sbisson; owner: Sbisson):
[mediawiki/core@master] LogFormatter: ignore unrecoverable data

https://gerrit.wikimedia.org/r/500466

I've proposed a patch to ignore the unrecoverable data. I like this idea in general since this table contains serialized data from various sources, some of them external, it seem reasonable for Core to be resilient and not allow bad data from a single source to break the entire API call. Someone with more experience with the logging code might find a better place for it though.

Independently of the proposed patch, we could cleanup the data (delete the rows, or just blank log_params) for the faulty rows. There seem to be only 107 of them.

mysql:research@dbstore1003.eqiad.wmnet [enwiki]> select count(*) from logging where log_action like 'flow-%' and log_params like '%UUID%';
+----------+
| count(*) |
+----------+
|      107 |
+----------+
Xaosflux removed a subscriber: Xaosflux.Apr 2 2019, 3:25 PM

Change 500466 merged by jenkins-bot:
[mediawiki/core@master] LogFormatter: ignore unrecoverable data

https://gerrit.wikimedia.org/r/500466

The code to make the Log API more resilient to bad data is merged.

I've cleaned up (blanked log_params) for the 107 old Flow log entries that had an instance of UUID serialized in their log_params.

Umherirrender closed this task as Resolved.Apr 10 2019, 7:12 PM

The logid is now working on the api (with special:badtitle, but the namespace is gone)

https://en.wikipedia.org/w/api.php?action=query&list=logevents&lestart=20140708171749&lelimit=1