Page MenuHomePhabricator

API should use status codes for errors
Closed, DeclinedPublic

Description

I think it is terrible that the API doesn't make use of response codes more. As a javascript developer it seems completely wrong to me that I have to mix error handling with success handling.

I understand there are backward compatibility issues but it should be possible to access an alternative flavour of the API (E.g. accessed via http://en.wikipedia.org/w/apiv2.php)

for example it would...
return 404's when I request a page that doesn't exist
a 400 bad request when I provide the wrong parameters
and 403 Forbidden when I try to do something I am not authorised to do

Some of the time when writing code that error handles I want to be able to quickly handle errors without resorting to inspecting the contents of the response e.g. on a 403 I can deduce that the user is not logged in or has irrelevant permissions and trigger a relevant call to action (e.g. redirect to login page/inform the user they are trying to do something they are not supposed to).

The existing response text would be packaged in the response body for situations where it is really needed.

I believe such a change would make the API more attractive to developers and easier to explore. Please lets do something about this.

See http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html


Version: unspecified
Severity: enhancement

Details

Reference
bz38716

Event Timeline

bzimport raised the priority of this task from to Low.Nov 22 2014, 1:03 AM
bzimport added a project: MediaWiki-API.
bzimport set Reference to bz38716.
bzimport added a subscriber: Unknown Object (MLST).

return 404's when I request a page that doesn't exist
a 400 bad request when I provide the wrong parameters
and 403 Forbidden when I try to do something I am not authorised to do

And what do you suggest for the following query: http://en.wikipedia.org/w/api.php?action=query&list=allimages&aifrom=B&ailimit=502&titles=dog|A_bird_named_sam&prop=info|pencil

*200 because it retrieved valid info for the article Dog
*400 because there is no prop named "pencil"
*403 because I don't have a apihighlimits right needed to set ailimit > 500
*404 because there is no article with the name A bird named sam
*303/301/302 because "dog" should actually be "Dog"
*418 because newbies who make stupid api requests such as my contrived example do not get invited to a "coffeehouse"

(207 is cheating ;) Also if we're being anal, I suppose 401 might be more appropriate than 403


Sufficed to say, I am opposed to using status codes in this manner. It may be ok for things that are fatal api errors (like things that output "errors" not warnings. list=deletedrevs when you don't have permission), but i'm not exactly convinced of that either.

Well.. since this uri currently returns content in a new api it would still return a 200 OK as it returns content.

The warnings are not error messages and just there for further information. A developer doesn't need to worry about them. I don't see any issue with altering the content of the response, I just believe that it should be possible to handle errors better in javascript.

This is not about being anal, it's about being more helpful and giving richer information in our responses.

If I login and get a 400 I can quickly deduce from the error code alone that the username or password is wrong. I shouldn't have to resort to decoding the response body to work out what went wrong...

I am confused to where your opposition lies....

(In reply to comment #2)

Well.. since this uri currently returns content in a new api it would still
return a 200 OK as it returns content.
The warnings are not error messages and just there for further information. A
developer doesn't need to worry about them. I don't see any issue with altering
the content of the response, I just believe that it should be possible to
handle errors better in javascript.

Bawolff's point was that there are a lot of edge cases. Developers and programmers like reliability and having their expectations met. If the status codes perfectly aligned with every API request, it'd be great to use them. But given that they'll have to be imperfectly aligned, I'm not sure if this will create a better or worse situation. If there's ambiguity in what the response code will be when you make a request such as pulling links to a page that's been deleted (would that be a 200 because there are results [and what if there are no results?] or a 404 because the page has been deleted?), you end up making the situation worse for developers, I think. I'm not sure how you'd avoid ambiguity except by limiting different response codes to very limited situations.

Where are you seeing this problem specifically? It'd be helpful if you could elaborate on the problems you're encountering in JavaScript. Feel free to paste snippets of code if you'd like, comparing what you do vs. what you could do if this bug were resolved. This bug is lumping all API requests with all status codes, but it may make sense to only change the response header in specific situations (such as when logging in via the API). Or maybe that would be worse.

This is not about being anal, it's about being more helpful and giving richer
information in our responses.

This is kind of silly. The response "you supplied an invalid password' or whatever is vastly more helpful and richer in information than an error code. I think your argument is that better programmatic responses would be nice to have. This may be, but I think you should make a stronger case.

(In reply to comment #0)

I understand there are backward compatibility issues but it should be possible
to access an alternative flavour of the API (E.g. accessed via
http://en.wikipedia.org/w/apiv2.php)

Versioning the API is covered by another bug, I think. This bug needs an accompanying RFC and a lot more thought if you're really considering version 2 of the MediaWiki API. That isn't a small project. :-)

This is not about being anal, it's about being more helpful and giving richer
information in our responses.

Btw, just to clarify, when I used the word "anal" in my original comment, I wasn't trying to suggest your idea was anal or anything like that, just that there's a variety of status codes one could use depending on how close one wants to be to their "definitions"


This is kind of silly. The response "you supplied an invalid password' or
whatever is vastly more helpful and richer in information than an error code. I
think your argument is that better programmatic responses would be nice to
have. This may be, but I think you should make a stronger case.

While to be fair, I think he's suggesting there still be a response body with detailed information.

If I login and get a 400 I can quickly deduce from the error code alone that
the username or password is wrong. I shouldn't have to resort to decoding the
response body to work out what went wrong...

Or you forgot to supply a token (and possibly your client doesn't even know such a thing exists, so automatically interpreting 400 as login error would be incorrect). And logically speaking a "Access denied" error is not really the same as a "invalid parameter" error


Bawolff's point was that there are a lot of edge cases. Developers and
programmers like reliability and having their expectations met. If the status
codes perfectly aligned with every API request, it'd be great to use them. But
given that they'll have to be imperfectly aligned, I'm not sure if this will
create a better or worse situation. If there's ambiguity in what the response
code will be when you make a request such as pulling links to a page that's
been deleted (would that be a 200 because there are results [and what if there
are no results?] or a 404 because the page has been deleted?), you end up
making the situation worse for developers, I think. I'm not sure how you'd
avoid ambiguity except by limiting different response codes to very limited
situations.

Yep, that sums up what I was trying to say pretty well.

Also something I thought of right now, if the API has weird status codes, that could interfere with caching mechanisms (maxage and smaxage parameters of the api). On a similar note, returning 404 if the article is missing might be confusing to a client trying to establish if only the client is missing, or if the api itself is actually missing.


The warnings are not error messages and just there for further information. A
developer doesn't need to worry about them.

As far as I can tell the majority of the examples you have given are "warnings" not "errors" (and the login example is neither)

Just to be clear I appreciate you both taking the time to discuss :-) For a bit of background I have spent most of my time working with RESTful apis - so I have been a bit spoilt. I'd love this kind of API for mediawiki but this is likely to be a big task. Although using response codes would also be a big task it seems something that we can introduce slowly. To be clear as bawolf picked up - I am not adverse to having the richer data about the error sent in the response code. e.g. a 400 would be accompanied with the JSON string { error: { ... however some developers can ignore it in cases where it just doesn't matter.

My main point of frustration is that to a newcomer to your code, it seems very strange that you'd package an error handler into your success callback

To give a concrete example, you should remember that it is still possible for me to cause a 500 error using the api. This means I end up with a complicated error handler which first checks status code and then on a 200 inspects the result. My success handler must also do a conditional if statement and is not as clean as it could be. With status codes my success handler gains a much cleaner workflow - See http://pastebin.com/Nvg82B8w - it just feels cleaner... and this guy [1] seems to agree. It also opens up a path to having a generic error handler for all requests.

It's also worth noting that any api response is cacheable and could technically be picked up by search engines. Using response codes such as 404 for errors would fix this.

Another benefit of using response codes is analytics. It would be trivial to see how often the api is hit with invalid requests and to identify rooms for improvement as status codes are cached in log files.

[1] http://www.bennadel.com/blog/1860-Using-Appropriate-Status-Codes-With-Each-API-Response.htm

Just to be clear I appreciate you both taking the time to discuss :-)

However, keep in mind that you're discussing this with me and MzMcBride. Neither of us are exactly core API contributors by any stretch of the imagination ;). But nonetheless I love giving my 2 cents worth on bugs.

My main point of frustration is that to a newcomer to your code, it seems very
strange that you'd package an error handler into your success callback

I could certainly understand how it does provide a logical separation of code, particularly in jQuery's ajaxy functions. However that varries wildly depending on your programming environment. People could just as easily write a layer on top of jQuery to check for errors in MW api's output and send to either the error or the success handler.

My most major concern with such "restful" approaches is that it would be hard to apply consistantly. If some pages are missing and not others (Even if all pages are missing, that's not even considered an error in the current code). If some of the parameters are invalid but not others (although a case could be made that a request should either be entirely valid, or not valid at all). To be consistent we should respond similarly for a partially wrong request as we do for an entirely wrong, but an error code can only apply to the entire request.

We do use error codes in the normal interface for page not found, if the page does not exist.

It's also worth noting that any api response is cacheable and could technically
be picked up by search engines. Using response codes such as 404 for errors
would fix this.

Well the robots.txt fixes that issue nicely (At least in Wikimedia land). There are use cases for having the api response for non-private requests stick around in the squid cache for a bit. If search engines manage to spider a public api response, it aint exactly a big deal either.

To give a concrete example, you should remember that it is still possible for
me to cause a 500 error using the api.

Only in cases of PHP fatal errors, and squid timeouts I thought? (Which would entirely be appropriate even if we didn't use status codes in general, because that is signaling that the layer below the api failed).

...and this guy [1] seems to agree. It also opens up a path to having a generic
error handler for all requests.

No doubt. Even though REST-fullness isn't quite the buzzword it used to be, it still has many many fans. Personally I've always thought it was a bit over-rated, but that is just me (I don't mean that restful api's are bad per se, just that restful paragadigm is not a magic bullet, and its just as easy to design a bad api that is restful as it is to design a bad api that is not-restful. Case in point, trying to get anything useful out of the restful version of the bugzilla api).

Non-trivial generic error handlers are quite difficult to write ( API returned an error (any error, with exception to maybe 504 from squids), what the client should do varries greatly depending on the situation

Another benefit of using response codes is analytics. It would be trivial to
see how often the api is hit with invalid requests and to identify rooms for
improvement as status codes are cached in log files.

Just as trivial to use MW's internal logging infastructure. (Although for third party re-users who want that info, the status code may be easier for them). We already keep track of which methods take the most time and stuff ( http://gdash.wikimedia.org/dashboards/apimethods/ ) I doubt adding response codes would make it any more or less dificult to track response types.

Another concrete example:
My Wiki Loves Monuments app has been using this request since the dawn of time - and I only just noticed that this is actually not returning what I want. All looked well previously but in fact I'd prefer it to give me a 400 response code as I'm not actually getting what I want in the response

The warning message given is ""Too many values supplied for parameter 'titles': the limit is 50""
yet it still makes me think everything is fine by giving me a 200 OK.

https://commons.wikimedia.org/w/api.php?action=query&titles=File%3ASchw%C3%A4nberg+altes+Rathaus+aussen+von+S%C3%BCdwesten+nahe.JPG%7CFile%3AZug+Archiv+des+Amtes+f%C3%BCr+Denkmalpflege+und+Arch%C3%A4ologie%2C+Archiv+f%C3%BCr+Bauernhausforschung%2C+Museum+f%C3%BCr+Urgeschichte+1.JPG%7CFile%3ASilenen-Bauernhaus-49.jpg%7CFile%3ABauernhaus+Ins+(2).jpg%7CFile%3APohlern+Rohrmoos.jpg%7CFile%3ALangnau+Untere+D%C3%BCrsr%C3%BCti.jpg%7CFile%3ALenzburg+burghalde.jpg%7CFile%3ALangnau+Ch%C3%BCechlihus+mit+Dorfbrunnen.jpg%7CFile%3ALuetzelflueh+Kulturmuehle+4.jpg%7CFile%3ASilenen-Sust.jpg%7CFile%3AGuemmenen+Viadukt+mit+TGV+01+09.jpg%7CFile%3ASBB+Aarebruecke+Koblenz+(AG)+05+09.jpg%7CFile%3AZweibruggen+Eisenbr%C3%BCcke+West+2.JPG%7CFile%3ABaumallee+in+der+Elfenau.JPG%7CFile%3AOberdiessbach+Familienkapelle-1.jpg%7CFile%3ALa+Neuveville+Fontaines+des+Bannerets2.jpg%7CFile%3ALigerz+Gaberelhaus+frontal.jpg%7CFile%3AGarten+Schmidlin+2.jpg%7CFile%3AGarten+der+Villa+Bomonti+DSC05862.jpg%7CFile%3AReichenbach+BE%2C+Gasthaus+B%C3%A4ren+(1542).jpg%7CFile%3ABern-Zunfthaus-zu-Pfistern.jpg%7CFile%3AGoldhubel+Aegerten+Sud.jpg%7CFile%3AZofingen-Heiterenplatz.jpg%7CFile%3ALauperswil+Chalchmatt2.jpg%7CFile%3ALa+Neuveville+Hof+Ligerz.jpg%7CFile%3AGuemligen+Hofgut+Front.jpg%7CFile%3AInstitutsgeb%C3%A4ude+Hofwil+M%C3%BCnchenbuchsee1.jpg%7CFile%3AMh+JB+kleine+scheidegg.jpeg%7CFile%3AEndingen+J%C3%BCdischer+Friedhof.jpg%7CFile%3AMoutier+Kapelle+von+Chali%C3%A8res+mit+Friedhof.jpg%7CFile%3AFrick-Kath-Kirche.jpg%7CFile%3ADietwil-Preghejo+250.jpg%7CFile%3AFischbach-Goeslikon+peghejo+190.jpg%7CFile%3ALangenthal-Kaufhaus.jpg%7CFile%3ALauperswil+Kirche2.jpg%7CFile%3ALigerz+Kirche3.jpg%7CFile%3APZMTotale05557.jpg%7CFile%3AOberdiessbach+Diessenhof-2.jpg%7CFile%3AMontenach+Country+Estate+Balliswil+Feb+2011.jpg%7CFile%3AReichenbach+BE%2C+Letzi+M%C3%BClenen.jpg%7CFile%3AKGS-Wimmis-Letzi-(1).JPG%7CFile%3ALa+Neuveville+Bernerhaus.jpg%7CFile%3AMaison-Des-Dragons-La-Neuveville.jpg%7CFile%3AMeienberg+Stadtgraben.jpg%7CFile%3ASilenen-Meierturm-2.jpg%7CFile%3ALangenthal-Muehle.jpg%7CFile%3ABois+murat+(9).jpg%7CFile%3ALotzwil+Pfarrhaus2.jpg%7CFile%3ALuetzelflueh+Pfarrhaus.jpg%7CFile%3APfarrhaus00714.jpg%7CFile%3ALa+Neuveville+Rathaus.jpg%7CFile%3AKirche+Langnau+im+Emmental.jpg%7CFile%3AKirche+und+Friedhof+Limpach.JPG%7CFile%3AGr%C3%A4nichen+Kirche+innen.jpg%7CFile%3A6030+-+Meiringen+-+Evangelisch-Reformierten+Kirche.JPG%7CFile%3ABlanche-Eglise-La-Neuveville.jpg%7CFile%3AKGS+Densbueren-Rne-u-Hochwacht-Urgiz-(20).JPG%7CFile%3AGuemligen+Schloss+mit+Park.jpg%7CFile%3ASchloss+Hofwil+M%C3%BCnchenbuchsee1.jpg%7CFile%3ASchloss+Laupen+S%C3%BCdansicht.jpg%7CFile%3AM%C3%BCnchenwiler_Castle_Apr_2011.jpg%7CFile%3ASchloss+Nidau+Schlosspark.jpg%7CFile%3ASchlossOberhofen+6376.jpg%7CFile%3AAltenburg+Schloesschen.jpg%7CFile%3ABasel-Stadtmauer.jpg%7CFile%3APicswiss+ZG-04-01.jpg%7CFile%3AZug+Stallscheune+Nr.+424d+7.jpg%7CFile%3ACham+Stallscheune+Nr.+110b+2.jpg%7CFile%3AZug+Stallscheune+R%C3%BCschenhof+Nr.+424e+3.jpg%7CFile%3AFuniculaire+Fribourg+119.JPG%7CFile%3AEndingen+Synagoge.jpg%7CFile%3ABaden+Synagoge+Winter.jpg%7CFile%3AUntervogthaus+Gr%C3%A4nichen.jpg%7CFile%3AHolzbruecke+Urnaesch+St.+Gallen+01+09.jpg%7CFile%3AWichterheergut.jpeg%7CFile%3AHofstatt_im_D%C3%B6rfli_Wolfenschiessen_Jun_2012.jpg%7CFile%3ARinglihaus.jpg%7CFile%3ASilenen-Burghofstatt.jpg%7CFile%3ABaar+Wohnhaus+Nr.+196a+Sennweid+6.jpg%7CFile%3AZug+Wohnhaus+Nr.+354a+Oterswil+2.jpg%7CFile%3ACham+Wohnhaus+Nr.+49a+1.jpg%7CFile%3ABasel+032.jpg&prop=imageinfo&iiprop=url&format=json&iiurlwidth=64&iiurlheight=64

The warning message given is ""Too many values supplied for parameter 'titles':
the limit is 50""

The request still succeeded for the first 50 titles. (400 would really only be appropriate if we made the request totally fail). If the request was made to totally fail, you would notice pretty instantly regardless of the status code returned.

This is where I disagree and why I raised the bug originally.

I asked for /more than 50 titles/. I didn't get all of them. Thus my request was not honoured so it should not 200 OK me. It wasn't a success in my eyes and I would prefer to have been told to alter my request so that I can get them all. The warning is not obvious enough and can easily be overlooked.

If I'd requested just 50 titles and got them all back that would be a success.

As a developer an error code would have made this more clear to me - this type of thing allows me to think about what happens when I don't get the full results and improve my code (e.g. using 2 requests to get 100 titles).

Fundamentally I prefer strict apis to fuzzy ones - there is less room for error when writing code.

Fundamentally I prefer strict apis to fuzzy ones - there is less room for error
when writing code.

I believe that's a separate issue though (Requesting that the api should fail when asking for too many titles, vs this being about how the api should fail [with an http status code])

IMO, the comparison to a RESTful API is fallacious since the MediaWiki API is ''not'' RESTful. In a RESTful API, for example, you'd not be able to specify multiple titles in the query or get back information on multiple titles beyond a "directory listing" of some sort. This is why RESTful API doesn't have this "some of the several resources you requested don't exist, some do, and some of the requests are malformatted" problems: it deals with one resource at a time.

(In reply to comment #6)

(Which would
entirely be appropriate even if we didn't use status codes in general, because
that is signaling that the layer below the api failed).

This, IMO, is a critical point. The API does not speak HTTP. It is using HTTP as a transport to return API results. Asking for the API to return an HTTP 404 message for the queried page being missing is like asking for a webserver to return an ICMP "Destination Unreachable" packet when a webpage is not found.

(In reply to comment #10)

Fundamentally I prefer strict apis to fuzzy ones - there is less room for error
when writing code.

I believe that's a separate issue though (Requesting that the api should fail
when asking for too many titles, vs this being about how the api should fail
[with an http status code])

IMO, this should be closed as WONTFIX and that filed as a separate feature request if Jon desires to follow up on it. OTOH, Jon could also easily enough have his framework check for warnings and treat them as errors.

Note, BTW, that it's very possible for the API to return less than 50 results for certain requests (e.g. prop=revisions&rvprop=content when the individual revisions are very large) even though the "limit" parameter nominally supports 50 as a maximum. In this case, returning 400 and expecting the client to requery with the arbitrary lower limit seems unhelpful.

This is not really about REST and discussion of REST is confusing the real issue I'm getting at here.

I am asking that we make warnings and errors more obvious. I keep running into various bugs or hearing about bugs with the api where I have been given warnings/errors and these haven't been surfaced well. Sometimes this is due to inconsistencies in the response (for example recently I discovered an error returned in the form { "1": { "error": {} } } rather than { "error": {} } and sometimes I just don't ever expect to get an error except in extreme edge cases which I'd rather avoid through good programming techniques then surface them to the user.

On the Wiki Loves Monuments app we keep running into issues with things such as logging in, failed uploads - and since the api returns no error codes it is near impossible to consult apache logs to work out what is happening.

I ran into a bug the other day where I tried to upload a file and got a 200 OK with the error message 'filename too long' - if we'd been able to monitor logs for 400s I am sure this bug would have got surfaced much earlier and dealt with. I hate to think what other bugs exist in the app due to this lack of visibility.

I'd much rather the api was stricter or could be signalled to be stricter (e.g. I add a query string parameter called strict). To me 400 and expecting the client to requery with a lower limit IS helpful as it is highlighting the fact that my code is making unreasonable queries. I would expect to rethink my code so that it avoids the 400 altogether and takes into account this problem. a 400 is much more noticeable to me then a 200 which by definition should mean "The request has succeeded." [1]

Currently the situation is like going to an ATM machine and asking for 400USD and only getting 300USD back without any explanation (in fact the ATM says "thank you!" and pretends everything went as expected).

[1] http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1

(In reply to comment #12)

I am asking that we make warnings and errors more obvious.

They're plenty obvious, IMO. An error is returned as a node 'error' within the root node, and warnings are returned inside a node 'warnings' within the root node. If you see something reporting API errors or warnings differently, report it as a bug.

(for example recently I discovered an error returned in the form
{ "1": { "error": {} } } rather than { "error": {} }

That would appear to be a bug. Details?

and sometimes I just don't ever expect to get an error except in extreme edge
cases which I'd rather avoid through good programming techniques then surface
them to the user.

That's not a problem with the API then, is it?

and since the api returns no error codes

Repeating erroneous statements doesn't make them true. Please stop.

I ran into a bug the other day where I tried to upload a file and got a 200 OK
with the error message 'filename too long' - if we'd been able to monitor logs
for 400s I am sure this bug would have got surfaced much earlier and dealt
with. I hate to think what other bugs exist in the app due to this lack of
visibility.

You're probably using some sort of framework (either home-grown or third-party), and that framework probably has a low-level function that sends the query to the server and parses the response. Right there, as soon as the response is parsed, log any errors or warnings or take whatever other action you need. Simple.

then a 200 which by definition should mean "The request has succeeded."

The HTTP request did succeed.

Currently the situation is like going to an ATM machine and asking for 400USD
and only getting 300USD back without any explanation (in fact the ATM says
"thank you!" and pretends everything went as expected).

Bad analogy: removing money from an ATM has consequences that making an API query doesn't.

I'm open to making the API return error codes, as long as:

  • they're hidden behind a query parameter so we don't break b/c. I guess eventually this could move to more versioning
  • they account for the fact that multiple things might be wrong in the same request (see comment 1) and handle this sanely

The second criterion is sort of vague, so for that one I'll judge patches on a case-by-case basis.

preilly wrote:

As, stated, "The API does not speak HTTP... it's using HTTP as a transport mechanism to return the API result sets." We don't and shouldn't hi-jack HTTP status codes for error handling purposes.

I'm closing this as "resolved" and "wontfix".

— Patrick

Talking to Patrick he suggested that we could log any errors in the api via php and surface those better to highlight error states better.

See bug 39592 for versioning the api