Preserve the order of pages returned by the generator in the final API response
Open, Needs Triage, Public

Description

Generators return a list of pages in various forms (page IDs, page titles, revision IDs); these pages might then get reloaded from the database to obtain additional data, and in the process the order of the pages gets lost. The results can be reordered on the client side: the set of results is still selected according to the ordering used in the generator module, and only the order inside the result set gets garbled. This is nevertheless an inconvenience for the consumer, especially when the same ordering is hard to reproduce in a different environment (e.g. sorting by title, which can contain Unicode characters). The API should preserve whatever sorting the generator uses.
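To illustrate the problem and the client-side workaround, here is a minimal Python sketch. It is not MediaWiki code: `reload_pages`, `restore_order`, and the sample page IDs and titles are hypothetical stand-ins, and the "database reload" is simulated by returning rows sorted by page ID.

```python
# Hypothetical data: the generator yields page IDs in its own sort order.
generator_order = [42, 7, 19]


def reload_pages(page_ids):
    """Hypothetical stand-in for the database reload.

    Rows come back keyed by page ID and sorted by ID, so the
    generator's order is lost.
    """
    titles = {7: "Ärger", 19: "Zebra", 42: "apple"}
    return {pid: {"pageid": pid, "title": titles[pid]} for pid in sorted(page_ids)}


def restore_order(order, pages):
    """Emit pages in generator order, skipping IDs the reload did not return."""
    return [pages[pid] for pid in order if pid in pages]


pages = reload_pages(generator_order)
ordered = restore_order(generator_order, pages)

print(list(pages))                      # order after the reload
print([p["pageid"] for p in ordered])   # order after restoring
```

This only works when the consumer still has the generator's order at hand, which is exactly what the API response currently does not convey.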

Tgr created this task. May 5 2015, 8:10 PM
Tgr added a project: MediaWiki-API.
Anomie added a subscriber: Anomie. May 5 2015, 8:29 PM

You'll mostly be working in ApiPageSet, although the final sort might have to be called externally.

Input to ApiPageSet might be page ids, revision ids, title strings, or Title objects. ApiPageSet might normalize the input and might generate additional pages internally (e.g. when the 'redirects' parameter is used).

And the items you're being asked to sort might not correspond well to the original input items. Further, deciding what's "right" when multiple input items at different positions in the input list correspond to a single output item isn't straightforward.
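A rough Python sketch of why output items do not line up with input positions. The `normalize` function is a simplified stand-in for title normalization (underscores to spaces, first letter capitalized), and the redirect table is invented for the example; neither reflects what ApiPageSet actually does internally.

```python
def normalize(title):
    """Simplified stand-in for title normalization."""
    title = title.replace("_", " ").strip()
    return title[:1].upper() + title[1:]


def input_positions(inputs, redirects):
    """Map each final title to the positions of the inputs that led to it."""
    positions = {}
    for index, raw in enumerate(inputs):
        title = normalize(raw)
        title = redirects.get(title, title)  # resolve a single redirect hop
        positions.setdefault(title, []).append(index)
    return positions


inputs = ["foo_bar", "Baz", "Foo bar", "Qux"]
redirects = {"Qux": "Baz"}  # hypothetical redirect table

positions = input_positions(inputs, redirects)
print(positions)
```

Here both final titles are reached from two different input positions, so there is no single obvious position to sort each of them by.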

Tgr added a comment. May 5 2015, 8:48 PM

And the items you're being asked to sort might not correspond well to the original input items. Further, deciding what's "right" when multiple input items at different positions in the input list correspond to a single output item isn't straightforward.

I can see that the generator's result set and the actual result set are in a many-to-one relation when the redirects parameter is used and multiple redirects point to the same title. (The same goes for converttitles, I guess?) Is that what you mean, or are there other cases in which the relationship between the generator's page set and the final page set is tricky?

Anomie added a comment. May 5 2015, 9:01 PM
In T98205#1262420, @Tgr wrote:

or are there other cases in which the relationship between the generator's and the final page set is tricky?

When the generator produces revids, multiple revids might map to one page. It's unlikely that a generator would produce unnormalized or duplicate titles, but not impossible.
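As a sketch of the revid case, the following Python snippet orders pages by the position of either the first or the last revid that maps to them. The function name `order_pages`, the `policy` parameter, and the sample revid-to-page mapping are all hypothetical; the point is only that the two policies give different, equally defensible orders.

```python
def order_pages(revids, rev_to_page, policy="first"):
    """Order page IDs by the position of their first or last revid."""
    rank = {}
    for index, revid in enumerate(revids):
        page = rev_to_page[revid]
        if policy == "first":
            rank.setdefault(page, index)  # keep the earliest position
        else:
            rank[page] = index            # overwrite with the latest position
    return sorted(rank, key=rank.get)


# Hypothetical generator output: revids 900 and 901 belong to page 1.
revids = [900, 450, 901, 300]
rev_to_page = {900: 1, 901: 1, 450: 2, 300: 3}

first = order_pages(revids, rev_to_page, "first")
last = order_pages(revids, rev_to_page, "last")
print(first, last)
```

With this input, the "first" policy places page 1 before page 2 and the "last" policy places it after, so any implementation would have to pick and document one rule.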