API JSON formatter returns [] as an empty return value - inconsistent with {} for non-empty values
Closed, Resolved, Public

Description

Author: Bryan.TongMinh

Description:
Compare a query with results and without:
http://en.wikipedia.org/w/api.php?action=query&generator=embeddedin&geititle=Template:Stub&geilimit=5&prop=info&inprop=protection&format=jsonfm
http://en.wikipedia.org/w/api.php?action=query&generator=embeddedin&geititle=Template:Stubaaaa&geilimit=5&prop=info&inprop=protection&format=jsonfm

The query with results returns a JSON object ({...}), while the query without results returns an empty JSON list ([]). For consistency, it should return an empty object ({}).


Version: 1.21.x
Severity: trivial
URL: http://en.wikipedia.org/w/api.php?action=query&generator=embeddedin&geititle=Template:Stubaaaa&geilimit=5&prop=info&inprop=protection&format=jsonfm

Event Timeline

bzimport raised the priority of this task to Lowest. (Nov 21 2014, 9:50 PM)
bzimport set Reference to bz10887.
bzimport added a subscriber: Unknown Object (MLST).

Note that lists and dictionaries/associative arrays are the same type in PHP, making it somewhat difficult to do this consistently without some manual poking about -- you can't tell an empty "list" from an empty "dictionary". You'd have to know what type it 'should' be, or wrap it in some kind of special wrapper object.
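A minimal sketch of the ambiguity described above, using only the built-in json_encode(). The variable names are illustrative and not taken from the API code:

```php
<?php
// PHP has one array type for both lists and maps, so json_encode()
// has to infer the JSON type from the keys it finds.
$list  = [ 'a', 'b' ];          // sequential integer keys
$map   = [ 'pageid' => 42 ];    // string keys
$empty = [];                    // no keys at all: nothing to infer from

echo json_encode( $list ), "\n";   // ["a","b"]
echo json_encode( $map ), "\n";    // {"pageid":42}
echo json_encode( $empty ), "\n";  // [] whether it was meant as a list or a map

// An object carries the "this is a map" intent even when it is empty.
echo json_encode( new stdClass() ), "\n";  // {}
```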

Bryan.TongMinh wrote:

Or maybe this is preferable: {'query':{'pages':{}}}. That still leaves the associative array problem, of course.

We could return null instead of a list/array, which removes the problem altogether...

Is [] illegal in JSON then? If [] is legal, there shouldn't be much of a problem, should there?

Bryan.TongMinh wrote:

It's not that much of a problem, it's only inconsistent. It causes some problems in Python for scripts that assume that the API returns a dict {}. It is easily worked around though, because both [] and {} evaluate to false in a boolean context.

Resolving as LATER. [] is legal syntax, but inconsistent. If Python can't handle [], that's a problem in their JSON interpreter. We could fix this in theory, but it's so much hassle that it's only practical to do if the formatting code as a whole is revised.

Still valid in 1.21wmf6.
[Removing RESOLVED LATER as discussed in http://lists.wikimedia.org/pipermail/wikitech-l/2012-November/064240.html . Reopening and setting priority to "Lowest". For future reference, please use either RESOLVED WONTFIX (for issues that will not be fixed), or simply set lowest priority. Thanks a lot!]

  • Bug 49201 has been marked as a duplicate of this bug.

(In reply to comment #6)

> Resolving as LATER. [] is legal syntax, but inconsistent. If Python can't
> handle [], that's a problem in their JSON interpreter. We could fix this in
> theory, but it's so much hassle that it's only practical to do if the
> formatting code as a whole is revised.

Just to clarify on this old bug -- the Python JSON interpreter can handle [] just fine, but it creates a *list* (equivalent to JS *array*) instead of a *dict* (equivalent to JS *object*).

This can cause problems if you do some kind of operation on what you expected would be a dict/object/map/hash/associative array.

This is not Python-specific; it could also cause problems in Java, Objective-C, C++, and basically anything but PHP. In strongly typed languages especially, you might even throw an error when simply assigning an extracted object to a variable of what turns out to be the wrong type.

A few possible ways to resolve this, should we care to:

a) use objects instead of associative arrays in the PHP code in the first place

b) use a special array key value that the JSON formatter knows to drop from the output, that forces an otherwise empty array to be treated as an associative array

c) create a special object type that wraps associative arrays in the API output, that the JSON formatter knows how to deal with

Under PHP 5.4, option c) can be combined with the JsonSerializable interface to ensure that a wrapped associative array outputs the way we want it even if people use json_encode() directly, so this could be used outside the API as well.

Combine that with the array-like object interface (ArrayAccess), and the array wrapper also becomes transparent to array-like uses in PHP. I think this is my favorite method...

$data['pages'] = $someArray;
// ...
echo json_encode( $data ); // {"pages":[]} OOPS!

fix to:

// wrap an array that might be empty
$data['pages'] = new JsonAssociativeArray( $someArray );
// ...
echo json_encode( $data ); // {"pages":{}}

or

// start with an empty array
$foo = new JsonAssociativeArray();
if ( $condition ) {
  $foo["bar"] = "baz";
}
$data['pages'] = $foo;
// ...
echo json_encode( $data ); // {"pages":{}} if $condition was false, {"pages":{"bar":"baz"}} otherwise
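For reference, the wrapper used in the snippets above might look roughly like this. The class name comes from those snippets, but the body is only an illustrative guess at option c) and is not code from MediaWiki. It assumes PHP 8.0 or later because of the mixed type declarations:

```php
<?php
// Hypothetical sketch of the JsonAssociativeArray wrapper (option c).
class JsonAssociativeArray implements ArrayAccess, JsonSerializable {
    private array $data;

    public function __construct( array $data = [] ) {
        $this->data = $data;
    }

    public function offsetExists( mixed $offset ): bool {
        return isset( $this->data[$offset] );
    }

    public function offsetGet( mixed $offset ): mixed {
        return $this->data[$offset] ?? null;
    }

    public function offsetSet( mixed $offset, mixed $value ): void {
        if ( $offset === null ) {
            $this->data[] = $value;          // supports $foo[] = ...
        } else {
            $this->data[$offset] = $value;
        }
    }

    public function offsetUnset( mixed $offset ): void {
        unset( $this->data[$offset] );
    }

    // json_encode() calls this. Casting to object makes an empty
    // array serialize as {} instead of [].
    public function jsonSerialize(): mixed {
        return (object)$this->data;
    }
}

$pages = new JsonAssociativeArray();
echo json_encode( [ 'pages' => $pages ] ), "\n";  // {"pages":{}}

$pages['bar'] = 'baz';
echo json_encode( [ 'pages' => $pages ] ), "\n";  // {"pages":{"bar":"baz"}}
```

Because the conversion lives in jsonSerialize(), it applies to any direct json_encode() call, not only to output that passes through an API formatter.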

Couldn't we check for the '_element' key (normally used by the XML formatter) and treat otherwise empty PHP arrays as arrays if it's present, and as objects if it's not? (Note that I have no idea what I'm talking about.)

Probably. Although I'd really like to get rid of the need for that "_element" key entirely.
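A rough sketch of the '_element' idea, assuming the marker is present on every array that is meant to be a list. The function name prepareForJson() and the sample data are hypothetical and not part of the API code:

```php
<?php
// Hypothetical helper: prepares API-style result data for json_encode().
// Arrays carrying the '_element' marker are treated as lists; empty arrays
// without it are treated as maps and emitted as {}.
function prepareForJson( array $data ) {
    $isList = array_key_exists( '_element', $data );
    unset( $data['_element'] );   // the marker itself never reaches the output

    foreach ( $data as $key => $value ) {
        if ( is_array( $value ) ) {
            $data[$key] = prepareForJson( $value );
        }
    }

    if ( $data === [] && !$isList ) {
        return new stdClass();    // empty and unmarked: emit {}
    }
    return $data;                 // otherwise let json_encode() decide
}

$result = [
    'query' => [
        'pages' => [],                        // meant as a map
        'links' => [ '_element' => 'link' ],  // meant as a list
    ],
];
echo json_encode( prepareForJson( $result ) ), "\n";
// {"query":{"pages":{},"links":[]}}
```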

> a) use objects instead of associative arrays in the PHP code in the first place

Or perhaps just cast empty arrays to objects.
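As a sketch of the casting approach, assuming the calling code knows which values are meant to be maps (the variable names are illustrative):

```php
<?php
// Cast only where the value is known to be a map.
$pages = [];                                    // e.g. a query that matched nothing
$data  = [ 'pages' => $pages ?: new stdClass() ];
echo json_encode( $data ), "\n";                // {"pages":{}}

// (object)[] is an equivalent spelling for the empty case.
echo json_encode( [ 'pages' => (object)[] ] ), "\n";  // {"pages":{}}

// For contrast: the blanket JSON_FORCE_OBJECT flag also converts real lists.
$mixed = [ 'pages' => [], 'titles' => [ 'A', 'B' ] ];
echo json_encode( $mixed, JSON_FORCE_OBJECT ), "\n";
// {"pages":{},"titles":{"0":"A","1":"B"}}
```

The last line shows why the cast has to be targeted: forcing every array to an object would turn genuine lists into objects as well.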

  • Bug 66720 has been marked as a duplicate of this bug.

Change 191103 had a related patch set uploaded (by Anomie):
Change API result data structure to be cleaner in new formats

https://gerrit.wikimedia.org/r/191103

Patch-For-Review

Anomie set Security to None.

Change 191103 merged by jenkins-bot:
Change API result data structure to be cleaner in new formats

https://gerrit.wikimedia.org/r/191103

Change 205714 had a related patch set uploaded (by Legoktm):
Change API result data structure to be cleaner in new formats

https://gerrit.wikimedia.org/r/205714

Change 205714 merged by jenkins-bot:
Change API result data structure to be cleaner in new formats

https://gerrit.wikimedia.org/r/205714

Ricordisamoa removed subscribers: gerritbot, Unknown Object (MLST).