Allpages generator query combined with prop=langlinks leads to incorrect results
Closed, Invalid (Public)

Description

To get a list of all pages with their langlinks I use the following query:
https://nl.wikipedia.org/w/api.php?action=query&prop=langlinks&format=xml&lllimit=500&generator=allpages&gapfilterredir=nonredirects&gaplimit=500&gapfrom=Opland
Note that "Opland" is just an example.

I expect the result to show all langlinks for every page up to the article referred to in the "continue llcontinue" tag.
However, the article "Oplegslot", for example, shows no langlinks in the generated list, even though the article itself (https://nl.wikipedia.org/wiki/Oplegslot) has 3 langlinks. These are not shown in the generator output.
I know that you cannot get the results for all pages in one call, so the software has to cut off the result somewhere, but I would expect the results to be correct up to the cutoff point. Now it seems that the langlinks are shown or omitted at random.

Event Timeline

Anomie subscribed.

I expect that up until the article that is referred to in the tag "continue llcontinue" it will show all langlinks for all pages.
However for example the article "Oplegslot" shows no langlinks in the generated list.

Note that the values for 'continue' and 'llcontinue' are intended to be opaque data, the format of which may change at any time, so you shouldn't try to work out which page they refer to. That said, at the current time and for that particular query, llcontinue does refer to a page, and the result does give you langlinks for all pages up to that page. What you're missing is that it does so in order of page ID, not title: llcontinue specifies page ID 335463, while Oplegslot has page ID 4330222.

More generally, each prop module produces its results in some order, but that ordering is not part of the module's guarantees. It might differ for different values of the module's parameters or for different versions of MediaWiki, so that the module can perform the necessary database queries efficiently.

When using generators and prop modules, you should take advantage of the 'batchcomplete' flag in the result as described at https://www.mediawiki.org/wiki/API:Query#batchcomplete to collect all the properties for each batch of pages from the generator, and only process the batch after having collected all the properties.
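
As a rough illustration of that advice, here is a minimal Python sketch that merges the partial results of successive responses and only hands a batch of pages to the caller once 'batchcomplete' appears. It rests on several assumptions that are not part of the discussion above: the query is made with format=json and formatversion=2 (so 'pages' is a list), the names iter_complete_batches and fetch_json are hypothetical helpers invented for this example, and the User-Agent string is a placeholder. The 'continue' values are copied into the next request unchanged, without being interpreted.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Callable, Dict, Iterator, List

API_URL = "https://nl.wikipedia.org/w/api.php"

Params = Dict[str, Any]
Response = Dict[str, Any]


def fetch_json(params: Params) -> Response:
    """Hypothetical helper: perform one GET request and decode the JSON body."""
    url = API_URL + "?" + urllib.parse.urlencode(params)
    # Placeholder User-Agent; replace it with one identifying your own tool.
    request = urllib.request.Request(
        url, headers={"User-Agent": "langlinks-example/0.1"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_complete_batches(
    fetch: Callable[[Params], Response], base_params: Params
) -> Iterator[List[Dict[str, Any]]]:
    """Yield one list of pages per generator batch, each with all its langlinks.

    Assumes formatversion=2 responses, where query.pages is a list.
    """
    params = dict(base_params)
    pending: Dict[int, Dict[str, Any]] = {}  # page ID -> merged page data

    while True:
        data = fetch(params)

        # Merge this response's partial langlinks into what we already have.
        for page in data.get("query", {}).get("pages", []):
            merged = pending.setdefault(
                page["pageid"],
                {"pageid": page["pageid"], "title": page.get("title"), "langlinks": []},
            )
            merged["langlinks"].extend(page.get("langlinks", []))

        # Only now are the properties for this batch of pages complete.
        if "batchcomplete" in data:
            yield list(pending.values())
            pending = {}

        if "continue" not in data:
            break

        # Treat the continuation values as opaque: copy them over unchanged.
        params = {**base_params, **data["continue"]}

    # Safety net: hand over anything left if the final flag was missing.
    if pending:
        yield list(pending.values())
```

To run this against the query from the description, one would call iter_complete_batches(fetch_json, params) with params containing action=query, format=json, formatversion=2, prop=langlinks, lllimit=500, generator=allpages, gapfilterredir=nonredirects, gaplimit=500 and gapfrom=Opland, and then process each yielded batch. Passing the fetch function as an argument is a design choice made here so that the merging logic can be exercised with canned responses instead of live requests.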