Fatal exception from ApiQueryBase: Call to getNamespace() on non-object null
Closed, Resolved · Public

Description

Error

Request ID: W99G7QpAEMMAAK@IFH8AAAAY

message
BadMethodCallException:
Call to a member function getNamespace() on a non-object (null)
trace
#0 /srv/mediawiki/php-1.33.0-wmf.1/extensions/PageViewInfo/includes/ApiQueryMostViewed.php(58): ApiQueryBase::addTitleInfo(array, NULL)
#1 /srv/mediawiki/php-1.33.0-wmf.1/extensions/PageViewInfo/includes/ApiQueryMostViewed.php(20): MediaWiki\Extensions\PageViewInfo\ApiQueryMostViewed->run()
#2 /srv/mediawiki/php-1.33.0-wmf.1/includes/api/ApiQuery.php(249): MediaWiki\Extensions\PageViewInfo\ApiQueryMostViewed->execute()
#3 /srv/mediawiki/php-1.33.0-wmf.1/includes/api/ApiMain.php(1570): ApiQuery->execute()
#4 /srv/mediawiki/php-1.33.0-wmf.1/includes/api/ApiMain.php(531): ApiMain->executeAction()
#5 /srv/mediawiki/php-1.33.0-wmf.1/includes/api/ApiMain.php(502): ApiMain->executeActionWithErrorHandling()
#6 /srv/mediawiki/php-1.33.0-wmf.1/api.php(87): ApiMain->execute()
#7 /srv/mediawiki/w/api.php(3): include(string)

Impact

Certain user queries to the API are consistently unavailable due to an error.

The queries are publicly exposed and produce an HTTP 500 error (fatal exception, uncached).

Notes

One of the affected URLs is https://de.wikipedia.org/w/api.php?action=query&format=json&list=mostviewed&pvimlimit=500, where the error reproduces consistently:

{"error":{"code":"internal_api_error_BadMethodCallException","info":"[W99b9wpAIDcAAEbvWToAAACY] Caught exception of type BadMethodCallException"},"servedby":"mw1343"}

Event Timeline

Krinkle created this task. Nov 4 2018, 8:52 PM
Restricted Application added a subscriber: Aklapper. Nov 4 2018, 8:52 PM

It seems the PageViewService returns an invalid title for this specific wiki.
The error persists until the invalid page is no longer viewed often enough to appear in the results.

At the moment it is the 43rd entry that fails: limit=42 works.

I have no idea how to find that page in the result. Someone from the page view team would need to check whether the raw data can be queried and inspected.

The API module should be made more robust against such invalid input from the internal service.

The failing request is https://de.wikipedia.org/w/api.php?action=query&format=json&list=mostviewed&pvimlimit=500. Lowering the limit to 32 works: https://de.wikipedia.org/w/api.php?action=query&format=json&list=mostviewed&pvimlimit=32. As @Umherirrender mentions, this means one of the entries (e.g. number 33) has an invalid title that MediaWiki's Title::newFromText rejects.
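The smallest failing limit can be found by bisection instead of by trial and error. The following is a minimal sketch, not part of the extension: `first_failing_limit` and the `works` callback are hypothetical names, and a real `works` would issue the API request with the given `pvimlimit` and report whether it succeeded. It assumes that once a limit fails, every larger limit fails too, which holds when the result for a larger limit always contains the result for a smaller one.

```python
from typing import Callable, Optional


def first_failing_limit(works: Callable[[int], bool], max_limit: int) -> Optional[int]:
    """Return the smallest limit for which works(limit) is False.

    Returns None if even max_limit works. Assumes failures are monotonic:
    if a limit fails, every larger limit fails as well.
    """
    if works(max_limit):
        return None
    lo, hi = 0, max_limit  # lo is assumed to work (empty result), hi is known to fail
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if works(mid):
            lo = mid
        else:
            hi = mid
    return hi


# Simulated probe standing in for the HTTP request: entry 33 is the bad one,
# so every limit below 33 succeeds.
bad_position = first_failing_limit(lambda limit: limit < 33, 500)
print(bad_position)
```

With a limit range of 500 this needs about nine probes rather than up to 500.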

The internal request is to https://wikimedia.org/api/rest_v1/metrics/pageviews/top/de.wikipedia.org/all-access/2019/02/10, which seems to be working fine.

Unfortunately, it also contains 1000 entries, and its order doesn't match the one from the MW output, so finding the bad title might be difficult.

Logically, though, it makes perfect sense for there to be invalid titles in the Page View API output, because its source data is not MediaWiki. Rather, its source data is the raw web requests received by Varnish. Navigating to https://en.wikipedia.org/wiki/Something:I_just_made_up will respond with 404 but nonetheless be recorded as a page view. Similarly, viewing https://en.wikipedia.org/wiki/Foo[]bar might also record a view for it, despite it being an invalid title (it responds with 400 Bad Request - Special:Badtitle, instead of 404).

After trying a few random titles from the internal de.wikipedia.org pageview/top response, I found invalid titles:

{
  "article": "St�ckgut",
  "views": 14298,
  "rank": 48
},
{
  "article": "Sch�ttgut",
  "views": 14120,
  "rank": 49
},

This appears to be an invalidly encoded character meant to be ü (U-umlaut). Anyhow, as mentioned above, invalid titles are expected in the Page View API, and thus the extension should filter them out.
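Rather than trying titles at random, the raw response could be scanned for entries that look suspicious. The sketch below is only a rough heuristic, not MediaWiki's actual title validation: it flags titles containing the Unicode replacement character U+FFFD (what a wrongly encoded byte typically decodes to) or one of the characters `# < > [ ] | { }`, which are assumed here to be illegal in titles. The function name `suspicious_titles` is hypothetical, and the third sample entry is made up for illustration.

```python
REPLACEMENT_CHAR = "\ufffd"      # what an undecodable byte usually turns into
ILLEGAL_CHARS = set("#<>[]|{}")  # assumption: rough set of characters not allowed in titles


def suspicious_titles(entries):
    """Return the entries whose "article" value looks like an invalid title."""
    flagged = []
    for entry in entries:
        title = entry["article"]
        if REPLACEMENT_CHAR in title or ILLEGAL_CHARS & set(title):
            flagged.append(entry)
    return flagged


# The first two entries mirror the ones quoted above; the third is a
# hypothetical valid entry added for contrast.
sample = [
    {"article": "St\ufffdckgut", "views": 14298, "rank": 48},
    {"article": "Sch\ufffdttgut", "views": 14120, "rank": 49},
    {"article": "Beispielseite", "views": 14000, "rank": 50},
]

for entry in suspicious_titles(sample):
    print(entry["rank"], repr(entry["article"]))
```

A check like this only narrows down candidates; whether a title is really rejected is decided by MediaWiki itself.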

Krinkle claimed this task. Feb 11 2019, 10:36 PM

Change 489929 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/extensions/PageViewInfo@master] Skip invalid titles from pageview API in ApiQueryMostViewed

https://gerrit.wikimedia.org/r/489929

Krinkle removed Krinkle as the assignee of this task. Feb 12 2019, 1:16 AM
Krinkle triaged this task as Low priority. Feb 12 2019, 2:39 AM

Change 489929 merged by jenkins-bot:
[mediawiki/extensions/PageViewInfo@master] Skip invalid titles from pageview API in ApiQueryMostViewed

https://gerrit.wikimedia.org/r/489929

This can result in fewer pages than the limit in the result without continue=. Maybe that fact needs documenting.
I have no idea whether the invalid ones should also be included, with the count but without namespace= and with an extra invalid= parameter or such.

> This can result in fewer pages than the limit in the result without continue=. Maybe that fact needs documenting.

It's specifically allowed for an Action API module to return fewer results than the requested limit. Whether that's specifically called out in the module's documentation or not is up to the maintainer of the module; in core we generally only mention it when it's reasonably likely to return 0 pages (but still have continuation).

I don't see any way for this change to prevent continuation from being applied when appropriate. Maybe I misunderstood what you meant by "without continue="?

I do note one bug in the patch, though, that would lead to repeated results when continuing. Added a comment there.

> I have no idea whether the invalid ones should also be included, with the count but without namespace= and with an extra invalid= parameter or such.

Wouldn't hurt to do so, if someone wants. That would be a separate task.

Yes, it should have been "with continue=". In most cases the API returns either the requested limit together with a continue, or fewer than the limit and no continue. I know of some modules that carry a hint that they can return fewer pages, but I was unsure whether that was required. It is fine with me that this is up to the maintainer.

Change 490952 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/extensions/PageViewInfo@master] Increment offset counter in ApiQueryMostViewed even for invalid titles

https://gerrit.wikimedia.org/r/490952

Change 490952 merged by jenkins-bot:
[mediawiki/extensions/PageViewInfo@master] Increment offset counter in ApiQueryMostViewed even for invalid titles

https://gerrit.wikimedia.org/r/490952
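The continuation issue discussed above can be illustrated with a small model. This is a sketch of the general idea only, not the code of either patch: `most_viewed_page` and `is_valid` are hypothetical names, and the entry list is made up. Each request consumes a window of `limit` entries from the upstream list, drops the invalid ones, and advances the offset by the number of entries consumed, including the skipped ones. Advancing only by the number of valid titles returned would make the next request start too early and repeat results.

```python
def most_viewed_page(entries, offset, limit, is_valid):
    """Return (results, next_offset) for one request.

    results may hold fewer than `limit` titles when invalid ones are skipped.
    next_offset is None when there is nothing left to continue from.
    """
    window = entries[offset:offset + limit]
    results = [title for title in window if is_valid(title)]
    consumed = len(window)  # counts skipped invalid titles as well
    next_offset = offset + consumed
    if next_offset >= len(entries):
        next_offset = None
    return results, next_offset


# Hypothetical upstream list with one wrongly encoded title in the middle.
entries = ["A", "B", "St\ufffdckgut", "C", "D"]


def is_valid(title):
    return "\ufffd" not in title


collected = []
offset = 0
while offset is not None:
    page, offset = most_viewed_page(entries, offset, 2, is_valid)
    collected.extend(page)

print(collected)
```

In this model the second request returns a single title together with a continuation offset, which matches the behaviour described in the comments: fewer results than the limit, but continuation still applies.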

Krinkle closed this task as Resolved. Feb 17 2019, 12:54 AM
Krinkle claimed this task.