API endpoint should format fatal errors in the requested format, not in HTML
Closed, Duplicate · Public

Description

I get an HTTP 500 error when pywikibot fetches https://fr.wikipedia.org/w/api.php with these parameters:

inprop=protection&gsrwhat=text&generator=search&format=json&gsrnamespace=0%7C1%7C2%7C3%7C4%7C5%7C6%7C7%7C8%7C9%7C10%7C11%7C12%7C13%7C14%7C15%7C2600%7C828%7C829%7C100%7C101%7C102%7C103%7C104%7C105%7C2300%7C2301%7C2302%7C2303&gsrsearch=%22Ssion%22&prop=info%7Cimageinfo%7Ccategoryinfo&iilimit=max&continue=&meta=userinfo&indexpageids=&action=query&gsrlimit=500&iiprop=timestamp%7Cuser%7Ccomment%7Curl%7Csize%7Csha1%7Cmetadata&uiprop=blockinfo%7Chasmsg

That corresponds to, unencoded:

inprop = protection 
& gsrwhat = text 
& generator = search 
& format = json 
& gsrnamespace = 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 2600 | 828 | 829 | 100 | 101 | 102 | 103 | 104 | 105 | 2300 | 2301 | 2302 | 2303 
& gsrsearch = "Ssion" 
& prop = info | imageinfo | categoryinfo 
& iilimit = max 
& continue = 
& meta = userinfo 
& indexpageids = 
& action = query 
& gsrlimit = 500 
& iiprop = timestamp | user | comment | url | size | sha1 | metadata 
& uiprop = blockinfo | hasmsg
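
For readers who want to reproduce the decoding above, here is a minimal sketch using Python's standard library. The `raw` string is a shortened copy of the query string from this report (only a few parameters are kept), and the names `raw`, `params` and `decoded` are illustrative, not anything pywikibot itself uses.

```python
from urllib.parse import parse_qs

# Shortened copy of the query string from the report (assumption: trimmed for brevity).
raw = (
    "inprop=protection&generator=search&format=json"
    "&gsrsearch=%22Ssion%22&prop=info%7Cimageinfo%7Ccategoryinfo"
    "&continue=&gsrlimit=500"
)

# keep_blank_values=True preserves empty parameters such as "continue=".
params = {key: values[0] for key, values in parse_qs(raw, keep_blank_values=True).items()}

# The MediaWiki API separates multiple values with "|", so split those into lists.
decoded = {key: value.split("|") if "|" in value else value for key, value in params.items()}

for key, value in decoded.items():
    print(key, "=", value)
```

Percent-decoding turns `%7C` into `|` and `%22` into `"`, which is how the list above was obtained.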

I can reproduce the error in the sandbox.

The response body is the generic HTML error page, not JSON as requested in the parameters:

If you report this error to the Wikimedia System Administrators, please include the details below.

PHP fatal error:
request has exceeded memory limit

Even if the search query is "too large", the API should return a response formatted in JSON, with a readable error code, not HTML that programs cannot parse. Likewise, a 500 status code is certainly not desirable.

So the error was probably handled by a layer above MediaWiki itself.
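
Until fatal errors come back in the requested format, a client can at least guard against the HTML page instead of failing inside its JSON parser. The following is only a sketch under assumptions: `ApiTransportError` and `parse_api_response` are hypothetical names, not part of pywikibot or MediaWiki, and the check relies on nothing more than the HTTP status and the Content-Type header.

```python
import json


class ApiTransportError(Exception):
    """Raised when the server did not answer with the JSON that was requested."""


def parse_api_response(status, content_type, body):
    """Decode a JSON API response, or raise a readable error for anything else.

    status:       HTTP status code as an int
    content_type: value of the Content-Type response header
    body:         response body as text
    """
    if status >= 500 or "json" not in content_type.lower():
        # Keep a short excerpt so the log shows what the server actually sent.
        excerpt = " ".join(body.split())[:200]
        raise ApiTransportError(
            f"HTTP {status} with Content-Type {content_type!r}: {excerpt}"
        )
    return json.loads(body)


# Usage with the error text quoted in this report (the HTML wrapper is made up).
try:
    parse_api_response(
        500,
        "text/html; charset=utf-8",
        "<html><body>PHP fatal error: request has exceeded memory limit</body></html>",
    )
except ApiTransportError as err:
    print("API call failed:", err)
```

This does not fix the server-side behaviour requested in this task; it only makes the client-side failure easier to read.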

Event Timeline

Restricted Application added projects: Discovery, Discovery-Search. Jun 25 2018, 11:06 PM
Restricted Application added a subscriber: Aklapper.

@Framawiki: What exactly is expected in this task, given that the error message is "request has exceeded memory limit"? Some requests might just be too large.
Removing MediaWiki-Search, as Wikimedia sites use CirrusSearch instead. No fatal error was provided either, hence removing Wikimedia-production-error.

Framawiki updated the task description. Jun 26 2018, 5:46 PM

Hello @Aklapper. Thanks for your review; I've updated the description with more info.
Note that the API should NEVER send a format different from the one requested!

Framawiki updated the task description. Jun 26 2018, 5:50 PM

@Framawiki: What is the expected outcome from this task? That the 'memory limit reached' response should be in JSON format instead? That you should not run into a 'memory limit reached' error message at all? Something else?
Please see https://mediawiki.org/wiki/How_to_report_a_bug which asks for 1) steps to reproduce, 2) expected outcome, 3) actual outcome in bug reports. Thanks!

EBernhardson added a subscriber: EBernhardson. (Edited) Jun 26 2018, 6:45 PM

As far as I can tell from a quick run of this through mwdebug1001, the bulk of the memory and time used here goes into loading information about files in the result set.

Even if the rest of it didn't time out, this API call runs into other issues. With gsrlimit reduced to 250, the API response includes the warning "This result was truncated because it would otherwise be larger than the limit of 12,582,912 bytes.", which implies that even if this query didn't fatal it still wouldn't return the expected results. What to do with this will depend on what the desired outcome is.
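
A client can check for that truncation warning rather than silently accepting a partial result. The sketch below is an assumption-laden illustration: `find_warnings` is a hypothetical helper, and the `sample` response is a made-up shape for demonstration. To avoid depending on the exact layout of the warnings block, the helper simply searches every string under the top-level `warnings` key.

```python
def find_warnings(response, needle="truncated"):
    """Return every string under response["warnings"] that contains `needle`."""
    found = []

    def walk(node):
        # Descend through nested dicts and lists, collecting matching strings.
        if isinstance(node, dict):
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for value in node:
                walk(value)
        elif isinstance(node, str) and needle in node:
            found.append(node)

    walk(response.get("warnings", {}))
    return found


# Hypothetical decoded response, for illustration only.
sample = {
    "warnings": {
        "result": {
            "*": "This result was truncated because it would otherwise "
                 "be larger than the limit of 12,582,912 bytes."
        }
    },
    "query": {"pages": {}},
}

for message in find_warnings(sample):
    print("warning:", message)
```

The limit quoted in the warning, 12,582,912 bytes, is exactly 12 MiB (12 × 1024 × 1024).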

Framawiki added a comment. (Edited) Jun 26 2018, 7:53 PM

> @Framawiki: What is the expected outcome from this task? That the 'memory limit reached' response should be in JSON format instead? That you should not run into a 'memory limit reached' error message at all? Something else?
> Please see https://mediawiki.org/wiki/How_to_report_a_bug which asks for 1) steps to reproduce, 2) expected outcome, 3) actual outcome in bug reports. Thanks!

The main problem in this task is that the error is not handled correctly (I suppose that it's not handled at all by MediaWiki). Solving this implies formatting the answer in JSON, as expected by the client.

I don't have access to logstash (I need to take care of T176364), so I can't provide more information, other than that the query results in an abnormal error page, which is definitely not good :)

Perhaps the memory limit can be increased, or perhaps the code is not optimal and makes poor use of a recursive function. We can imagine a lot of different problems that could be subtasks of this one.

Framawiki renamed this task from "Fatal PHP error on api action=query, generator=search: memory limit reached" to "Error memory limit reached not handled with action=query, prop=imageinfo". Jun 26 2018, 7:57 PM
Anomie added a subscriber: Anomie. Jun 28 2018, 10:45 AM

I'm on vacation at the moment, but since I see people haring off on this task I'll throw in a quick comment.

If the problem is the actual error reported here, this is a duplicate of T55663: Getting the referring pages takes up too much memory (due to included image metadata). iiprop=metadata can be problematic.

If the problem here is @Framawiki's assertion that no API call should ever result in a 500 with a non-API-formatted response body, the task should be rewritten to reflect that in the title and description instead of it being implied. Then, since that's not possible when something runs into a PHP fatal error like this, the task should be declined in favor of fixing PHP fatals through the normal task prioritization process.
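
As a client-side workaround for the iiprop=metadata concern raised in the comment above, the request can be made lighter before it is sent. This is a sketch only: `lighten_params` is a hypothetical helper, not a pywikibot function, and the default batch size of 50 in the usage example is an arbitrary assumption rather than a recommended value.

```python
def lighten_params(params, drop_iiprop=("metadata",), gsrlimit=None):
    """Return a copy of `params` without the given iiprop values.

    params:      dict of API parameters, with multi-values joined by "|"
    drop_iiprop: iiprop values to remove (image metadata by default)
    gsrlimit:    optional smaller search batch size to set
    """
    lighter = dict(params)
    kept = [
        prop
        for prop in lighter.get("iiprop", "").split("|")
        if prop and prop not in drop_iiprop
    ]
    if kept:
        lighter["iiprop"] = "|".join(kept)
    else:
        # Nothing left to request, so drop the parameter entirely.
        lighter.pop("iiprop", None)
    if gsrlimit is not None:
        lighter["gsrlimit"] = str(gsrlimit)
    return lighter


# Parameter values taken from the task description.
original = {
    "action": "query",
    "generator": "search",
    "gsrlimit": "500",
    "iiprop": "timestamp|user|comment|url|size|sha1|metadata",
}
print(lighten_params(original, gsrlimit=50))
```

The original dict is left untouched, so the full request can still be retried if the metadata is really needed.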

Framawiki renamed this task from "Error memory limit reached not handled with action=query, prop=imageinfo" to "API endpoint should format fatal errors in the requested format, not in HTML". Jun 30 2018, 8:40 AM
Vvjjkkii renamed this task from "API endpoint should format fatal errors in the requested format, not in HTML" to "2aaaaaaaaa". Jul 1 2018, 1:01 AM
Vvjjkkii triaged this task as High priority.
Vvjjkkii updated the task description.
Vvjjkkii removed a subscriber: Aklapper.
JJMC89 renamed this task from "2aaaaaaaaa" to "API endpoint should format fatal errors in the requested format, not in HTML". Jul 1 2018, 2:51 AM
JJMC89 raised the priority of this task from High to Needs Triage.
JJMC89 updated the task description.
JJMC89 added a subscriber: Aklapper.