
QueryGenerator and similar API classes do not handle page lists that are over the limit
Open, Needs TriagePublic

Description

If the query API module receives more pages (e.g. via the titles parameter) than the limit of the submodule used, it shows a warning and discards the extra ones. QueryGenerator should work around this but currently doesn't (the same probably goes for the other Generator and APISite classes).
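To make the described server-side behavior concrete, here is a toy sketch (not MediaWiki's actual implementation; the function name and warning text are illustrative only) of what happens when more values are supplied than the submodule's limit allows:

```python
def apply_title_limit(titles, limit=50):
    """Toy model of the API behavior described above: values beyond the
    submodule limit are silently dropped and a warning is recorded."""
    warnings = []
    if len(titles) > limit:
        warnings.append(
            'Too many values supplied for parameter "titles": the limit is %d' % limit)
        titles = titles[:limit]
    return titles, warnings

kept, warns = apply_title_limit(['Title %d' % i for i in range(60)])
# Only the first 50 titles survive; the remaining 10 are discarded with a warning.
```

QueryGenerator currently passes the full list through, so the discarding happens server-side and the extra titles never come back.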

One can use something like this as a workaround:

from itertools import chain

def chunked_iterator(lst, fn, size):
    # Apply fn to successive size-item slices of lst and chain the results.
    return chain.from_iterable(
        map(fn, (lst[pos:pos + size] for pos in range(0, len(lst), size))))

for data in chunked_iterator(titles, lambda pages: api.PropertyGenerator('pageimages', titles=pages, site=site), 50):
    ...

but it adds extra complexity, and it gets even worse if one wants to preserve other features of the Generator classes, such as remembering page name normalizations.
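To illustrate the normalization problem mentioned above, here is a self-contained sketch (the response shape and `run_query` callable are hypothetical stand-ins for an `api.PropertyGenerator` call, not Pywikibot's actual API) of what a caller has to do by hand: run each chunk, collect the pages, and merge the per-chunk `normalized` title mappings that the Generator classes would otherwise track:

```python
def chunked(seq, size):
    """Yield successive slices of seq with at most `size` items each."""
    for pos in range(0, len(seq), size):
        yield seq[pos:pos + size]

def query_in_chunks(titles, run_query, limit=50):
    """Run `run_query` on batches of titles and merge the results.

    `run_query` is a hypothetical stand-in for one API round trip; here it
    is assumed to return a dict with 'pages' (list) and 'normalized'
    (original title -> normalized title) keys.
    """
    pages, normalized = [], {}
    for chunk in chunked(titles, limit):
        result = run_query(chunk)
        pages.extend(result['pages'])
        normalized.update(result.get('normalized', {}))
    return pages, normalized

# Toy query function simulating a server that normalizes '_' to ' '.
def fake_query(chunk):
    norm = {t: t.replace('_', ' ') for t in chunk if '_' in t}
    return {'pages': [norm.get(t, t) for t in chunk], 'normalized': norm}

pages, normalized = query_in_chunks(['Foo_bar', 'Baz'] * 40, fake_query, limit=50)
```

If QueryGenerator split oversized title lists internally, this bookkeeping would live in one place instead of being reimplemented by every caller.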

Event Timeline

@Xqt, do you know whether this is a regression against Pywikibot-compat? Obviously not all of compat would have supported > 50 titles, but I suspect that some parts of it may have handled > 50 titles correctly.

@jayvdb: compat didn't use the API for this but the old Special:Export mechanism. The API call was deactivated due to this bug:

# Sometimes query does not contains revisions
# or some pages are missing. Deactivate api call and use the
# old API special:export

see:

https://mediawiki.org/wiki/Special:Code/pywikipedia/8011
https://mediawiki.org/wiki/Special:Code/pywikipedia/8017
https://mediawiki.org/wiki/Special:Code/pywikipedia/8036
https://mediawiki.org/wiki/Special:Code/pywikipedia/11479
c1ee8f17da9f79212a2d736b1c79a8b36291679c