
API uses weird * (star/asterisk) key for the content property in JSON and PHP serialized formats
Closed, Resolved (Public)

Description

http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=Hello+World&rvlimit=1&rvprop=content%7Ctimestamp&format=json

returns:

{"timestamp":"2011-10-11T23:54:33Z","*":"A '''\"Hello world\" program''' is a [[computer program]] that prints out \"Hello world\" on...."}

for the query.pages[0].page.revisions[0]

Why is it assigning the content to a key called "*"? This is unfortunate since * is an operator in JavaScript, so you can't use the normal query.pages[0].page.revisions[0].* and instead have to use query.pages[0].page.revisions[0]['*']. Why don't we assign it to a key called "rev" like in the XML version, or better yet "content" (the name of the property)? All of the other properties have regular key names, so why not the content? Otherwise we are just giving JS developers a headache.
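To illustrate the access pattern described above, here is a minimal JavaScript sketch. The revision object copies the sample response quoted in this report; `normalizeRevision` is a hypothetical helper name, not part of any MediaWiki library.

```javascript
// Sample revision object, copied from the response quoted in this report.
const revision = {
  timestamp: "2011-10-11T23:54:33Z",
  "*": "A '''\"Hello world\" program''' is a [[computer program]] that prints out \"Hello world\" on....",
};

// `revision.*` is a syntax error, so bracket notation is required.
const wikitext = revision["*"];

// Hypothetical helper: copy the revision and expose "*" under a friendlier key.
function normalizeRevision(rev) {
  const { "*": content, ...rest } = rev;
  return { ...rest, content };
}

const normalized = normalizeRevision(revision);
console.log(normalized.content === wikitext); // true
```

A client can apply such a helper once, right after parsing the response, so the rest of its code never has to mention the "*" key.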


Version: unspecified
Severity: trivial

Details

Reference
bz31629

Event Timeline

bzimport raised the priority of this task to Lowest. Nov 21 2014, 11:53 PM
bzimport set Reference to bz31629.

I'm afraid that too many people already use *, we can't change it without a major b/c breakage.

That's the reply I get for every API bug I file. Can't we have an API 2.0 keyword or something? Right now our API is virtually unusable without a framework, unless you have a lot of time on your hands to figure out all the weird result cases (for example, boolean true returned as an empty string). At some point it would be nice if we created a clean API from scratch and launched it as an alternative and then gradually phased out the old API.
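As an illustration of the "boolean true returned as an empty string" case: in the legacy JSON output a true flag appears as a key with an empty-string value and a false flag is simply absent, so a plain truthiness check gives the wrong answer. A minimal sketch, in which `hasFlag` and the sample objects are hypothetical:

```javascript
// Hypothetical revision fragments in the legacy format:
// a true flag is present with value "", a false flag is omitted.
const minorEdit = { user: "Example", minor: "" };
const normalEdit = { user: "Example" };

// Naive check: "" is falsy in JavaScript, so this is wrong for true flags.
const naive = Boolean(minorEdit.minor);

// Presence check: the flag is true exactly when the key exists.
function hasFlag(obj, key) {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

console.log(naive);                        // false
console.log(hasFlag(minorEdit, "minor"));  // true
console.log(hasFlag(normalEdit, "minor")); // false
```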

(In reply to comment #2)

> That's the reply I get for every API bug I file. Can't we have an API 2.0 keyword or something?

Versioning the API might be possible, yes. That'd be a separate bug, though.

> Right now our API is virtually unusable without a framework, unless you have a lot of time on your hands to figure out all the weird result cases (for example, boolean true returned as an empty string). At some point it would be nice if we created a clean API from scratch and launched it as an alternative and then gradually phased out the old API.

Maybe. The API should be stable. The fact that developers aim for stability is a feature, not a bug. Being required to put quotes around special operators in a programming language doesn't seem like a huge deal to me. Most languages require some escaping, don't they? & in HTML, for example. Is "*" a fairly poor name? Yes, I think so. It'd be interesting to figure out why it was picked (perhaps there's a logical, rational reason, you can check SVN), but hindsight is always going to be 20/20. Sometimes you have to make do with a bit of imperfection. If the use of "*" is causing actual bugs somewhere, that'd be a different story.

A version URL parameter might be feasible (I don't think a bug has been filed about this already, feel free to search/submit), but you also shouldn't expect no objections to it. Versioning adds a significant amount of code complexity, of course. Some people might be hesitant. Simply because people disagree with you on Bugzilla doesn't mean they're right. Sometimes it's simply a matter of making a better case. :-)

In any case, even with a version parameter, the default will likely always use "*" to avoid backward-compatibility issues, so this seems like a valid wontfix to me.
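The backward-compatible default described here could look roughly like the following sketch. The function name, the `version` argument, and the choice of `content` as the new key are all assumptions for illustration; nothing in this comment fixes them.

```javascript
// Hypothetical serializer: callers that pass no version get the legacy "*" key,
// so existing clients keep working; only explicit opt-in changes the output.
function serializeRevision(timestamp, text, version = 1) {
  if (version >= 2) {
    return { timestamp, content: text };
  }
  return { timestamp, "*": text };
}

console.log(JSON.stringify(serializeRevision("2011-10-11T23:54:33Z", "text")));
// {"timestamp":"2011-10-11T23:54:33Z","*":"text"}
console.log(JSON.stringify(serializeRevision("2011-10-11T23:54:33Z", "text", 2)));
// {"timestamp":"2011-10-11T23:54:33Z","content":"text"}
```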

(In reply to comment #3)

> It'd be interesting to figure out why it was picked (perhaps there's a logical, rational reason, you can check SVN), but hindsight is always going to be 20/20. Sometimes you have to make do with a bit of imperfection. If the use of "*" is causing actual bugs somewhere, that'd be a different story.

When the API was designed in 2006, XML was cool, so the API was primarily designed for XML output. Other output formats seem to have been added as an afterthought. In the years that followed, the world (or most of it, anyway) realized XML is an inflexible piece of garbage and moved to JSON. However, we're still stuck with the legacy of having to support XML, which means that 1) we can't do certain things in JSON because they can't be translated to XML and 2) we have things like the * thing that look good in XML and look like crap in every other format on the planet.
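One way to picture why "*" falls out of an XML-first design: if every result node is modelled as attributes plus optional text content, the attributes map naturally to JSON keys, but the text content has no name of its own, so a placeholder key is needed. The sketch below illustrates that idea with hypothetical names and a made-up text value; it is not MediaWiki's actual code.

```javascript
// Hypothetical node model: named element, attributes, and text content.
const revNode = {
  name: "rev",
  attributes: { timestamp: "2011-10-11T23:54:33Z" },
  text: "Hello",
};

// In XML the text simply sits inside the element (escaping omitted for brevity).
function toXml(node) {
  const attrs = Object.entries(node.attributes)
    .map(([key, value]) => ` ${key}="${value}"`)
    .join("");
  return `<${node.name}${attrs}>${node.text}</${node.name}>`;
}

// In JSON the same text needs some key; here the placeholder is "*".
function toJson(node) {
  return JSON.stringify({ ...node.attributes, "*": node.text });
}

console.log(toXml(revNode));  // <rev timestamp="2011-10-11T23:54:33Z">Hello</rev>
console.log(toJson(revNode)); // {"timestamp":"2011-10-11T23:54:33Z","*":"Hello"}
```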

I used to think the multi-format thing was cool, but I am now of the opinion that an API 2.0 should output JSON only. It's the only format worth bothering with, the others all suck to varying degrees.

Oh, there are plans for API 2.0?

(In reply to comment #5)

> Oh, there are plans for API 2.0?

No, not really. I should probably have said something like "any API 2.0 project, if and when it happens".

I would really love to see most of our current formats go away:

  • I have no doubts that nobody uses wddx, txt and dump formats (some of them could be useful for debugging, but jsonfm is no less readable for this purpose)
  • While there can be existing YAML users, they should really use something less stillborn.
  • And dbg format with its advice to "Use PHP's eval() function to recover data" is especially evil.

(In reply to comment #7)

> I would really love to see most of our current formats go away:
>
>   • I have no doubts that nobody uses wddx, txt and dump formats (some of them could be useful for debugging, but jsonfm is no less readable for this purpose)
>   • While there can be existing YAML users, they should really use something less stillborn.
>   • And dbg format with its advice to "Use PHP's eval() function to recover data" is especially evil.

Aye, even .plist (iirc has an easier API to work with data in iOS than JSON) would be more useful than most of those formats.

*** Bug 42888 has been marked as a duplicate of this bug. ***

The discussion above seems unanimous in admitting that * was a poor naming choice and is only supported for legacy reasons. How hard would it be to make the API output version strings? I mean without actually implementing any new features yet (such as dumping rarely used formats, etc).

The future looks bright! One question though: Why use '_' (which is a bit cryptic), rather than some generic word like 'content'? FWIW, Flickr's API uses '_content'.

I'm all for it - we have been discussing the name - see discussion page at http://www.mediawiki.org/wiki/Talk:Requests_for_comment/API_Future - just need a name that doesn't conflict with all the other fields that API and extensions generate.

Yurik removed Yurik as the assignee of this task. Dec 31 2014, 9:45 AM
Yurik set Security to None.

Change 182858 had a related patch set uploaded (by Anomie):
API: Overhaul ApiResult, make format=xml not throw, and add json formatversion

https://gerrit.wikimedia.org/r/182858

Patch-For-Review

Change 182858 merged by jenkins-bot:
API: Overhaul ApiResult, make format=xml not throw, and add json formatversion

https://gerrit.wikimedia.org/r/182858

Change 205707 had a related patch set uploaded (by Legoktm):
API: Overhaul ApiResult, make format=xml not throw, and add json formatversion

https://gerrit.wikimedia.org/r/205707

Change 205707 merged by jenkins-bot:
API: Overhaul ApiResult, make format=xml not throw, and add json formatversion

https://gerrit.wikimedia.org/r/205707
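
The merged change adds a `formatversion` parameter for JSON output. Below is a minimal sketch of opting in from a client, assuming that `formatversion=2` is the value that selects the new output and that it exposes revision text under `content` (check the current API documentation before relying on either). `buildQueryUrl` and `getRevisionText` are hypothetical helpers, and the sample objects are made up.

```javascript
// Rebuild the query from the description, optionally adding formatversion.
function buildQueryUrl(title, formatVersion) {
  const params = new URLSearchParams({
    action: "query",
    prop: "revisions",
    titles: title,
    rvlimit: "1",
    rvprop: "content|timestamp",
    format: "json",
  });
  if (formatVersion !== undefined) {
    params.set("formatversion", String(formatVersion));
  }
  return `http://en.wikipedia.org/w/api.php?${params}`;
}

// Read the revision text from either output shape (assumed key: "content").
function getRevisionText(rev) {
  return "content" in rev ? rev.content : rev["*"];
}

console.log(buildQueryUrl("Hello World", 2));
console.log(getRevisionText({ "*": "legacy text" })); // legacy text
console.log(getRevisionText({ content: "new text" })); // new text
```

Reading both keys lets a client switch versions without touching the rest of its code.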