Page MenuHomePhabricator

API action=query&prop=revisions doesn't return diff in xml
Closed, ResolvedPublic

Description

When I execute this api:

:action=query&prop=revisions&format=json&rvprop=content&rvlimit=1&rvtoken=rollback&rvdiffto=prev&titles=User%3APetrb

it return content and diff

but if I execute xml:

:action=query&prop=revisions&format=xml&rvprop=content&rvlimit=1&rvtoken=rollback&rvdiffto=prev&titles=User%3APetrb

I only get a content of page.

This makes it necessary to execute 2 different queries in order to get same thing in xml which you can have from any other format


Version: unspecified
Severity: major

Event Timeline

bzimport raised the priority of this task from to High.Nov 22 2014, 2:37 AM
bzimport set Reference to bz55371.
bzimport added a subscriber: Unknown Object (MLST).

this was tested on english wiki

Yet another bug with the XML format. Basically, it can't handle having both a "text content" (i.e. an item with key '*', from ApiResult::setContent()) and non-scalar sub-items at the same time. There's supposed to be an error thrown in this situation, but the test at ApiFormatXml.php line 166 is broken so it won't actually get thrown except in certain limited cases.

I can see three possible solutions here:

  1. Use a key such as 'content' instead of '*'. But that would break BC for all clients, even those not trying to use diff and content at the same time, so I don't much like that idea.
  2. Have the XML formatter shove '*' into a particular named subelement when other subelements are needed. Whether this subelement's name should be hardcoded or if we should allow/require something like setIndexedTagName() I'm not sure.
  3. WONTFIX it.

I don't like #1 because it would break BC for all clients, even those not trying to use diff and content at the same time. #2 feels like a hack, but it's the better option if we don't want to WONTFIX this.

FYI: I don't see any other situation in core that would be affected by this.

As for WMF-deployed extensions, CodeReview's list=coderevisions has the issue if 'message' is requested along with 'tags', 'followups', or 'followedup'. I don't see any other extensions that look like they'd be affected.

I don't really see what is so hard on fixing this, you can easily combine

<rev ...>content</rev>

and

<diff>content of diff</diff>

what's the problem?

what's the point of whole star thingie? Why not just name every property with some solid name instead of some weird characters? It wouldn't look so hackish but it would surely work just as fine...

(In reply to comment #4)

I don't really see what is so hard on fixing this, you can easily combine

<rev ...>content</rev>

and

<diff>content of diff</diff>

what's the problem?

How do you combine <rev>text...</rev> and <rev><diff>...</diff></rev>? <rev>text...<diff>...</diff></rev> or <rev><diff>...</diff>text...</rev> makes no sense in XML.

My option 1 is changing it to <rev content="text..."/>, so you could combine them easily as <rev content="text..."><diff>...</diff></rev>. My option 2 is doing basically the same (or <rev><something>text...</something><diff>...</diff></rev>) only when the XML formatter runs into this situation.

(In reply to comment #5)

what's the point of whole star thingie? Why not just name every property with
some solid name instead of some weird characters? It wouldn't look so hackish
but it would surely work just as fine...

The "star thingie" is how you get <rev>text...</rev> instead of <rev content="text..."/> in the XML format. And while it would probably have been a good idea to avoid it back in 2006, changing it now would break huge numbers of existing clients which is not something we'd like to do.

But you could still make some other way how to ask for a result which would look like this, so that it wouldn't break any tools that expect the current format.

In fact, even if it did

<rev>text</rev>

in case you don't ask for a diff

and

<rev content="text">

<diff>...</diff>

</rev>

in case you do ask for a diff

it would still hardly break anything, since as you said, the later should have thrown exception (which it doesn't because of some bug as you said: "There's supposed to be an error thrown
in this situation, but the test at ApiFormatXml.php line 166 is broken so it
won't actually get thrown except in certain limited cases.")

so if it produced a different kind of output instead of exception, how could it break anything?

  • Bug 66054 has been marked as a duplicate of this bug. ***

As indicated on https://www.mediawiki.org/wiki/API:Main_page#The_format there are plans to remove all output formats except for JSON, should this Bug be moved to a WON'T FIX status?

Is the statement about moving to JSON only output formats incorrect, or just far in the future?

Far in the future, if we ever get to it. But killing formats other than json, xml, and php is in progress now.

Change 182858 had a related patch set uploaded (by Anomie):
API: Overhaul ApiResult, make format=xml not throw, and add json formatversion

https://gerrit.wikimedia.org/r/182858

Patch-For-Review

Anomie set Security to None.

Change 182858 merged by jenkins-bot:
API: Overhaul ApiResult, make format=xml not throw, and add json formatversion

https://gerrit.wikimedia.org/r/182858

Change 205707 had a related patch set uploaded (by Legoktm):
API: Overhaul ApiResult, make format=xml not throw, and add json formatversion

https://gerrit.wikimedia.org/r/205707

Change 205707 merged by jenkins-bot:
API: Overhaul ApiResult, make format=xml not throw, and add json formatversion

https://gerrit.wikimedia.org/r/205707