Steps to reproduce
Make an API request for a large category and iterate using the continue value.
Eventually the gcmcontinue value loops: the API returns a continue value that was already returned by an earlier query.
Sometimes the loop occurs after several requests; in this example it occurs within the same request.
Request
{
"action": "query",
"format": "json",
"generator": "categorymembers",
"gcmtitle": "Category:Nazi_symbols_status",
"gcmprop": "ids|title|sortkey",
"gcmtype": "file",
"gcmcontinue": "file|4445555453434845532052454943485347455345545a424c41545420333954322030313320303237392e4a5047|77119651",
"gcmlimit": "500"
}
in ApiSandbox
Response
{
"batchcomplete": "",
"continue": {
"gcmcontinue": "file|4445555453434845532052454943485347455345545a424c41545420333954322030323220303738302e4a5047|77119651",
"continue": "gcmcontinue||"
},
"query": {
"pages": {
"77117729": {
"pageid": 77117729,
"ns": 6,
"title": "File:Deutsches Reichsgesetzblatt 39T2 013 0280.jpg"
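To compare continue values more easily, the middle part of the token can be decoded. The following is a minimal sketch; it assumes the token has the layout `type|hex-encoded sort key|page id`, which is inferred from the values in this report and not taken from the API documentation. The function name `decode_gcmcontinue` is hypothetical.

```python
def decode_gcmcontinue(token):
    """Split a continue token and decode its hex part.

    Assumption: the token looks like 'type|hexsortkey|pageid',
    as the values in this report suggest.
    """
    member_type, hex_sortkey, page_id = token.split("|")
    sortkey = bytes.fromhex(hex_sortkey).decode("utf-8", errors="replace")
    return member_type, sortkey, int(page_id)


# Tokens copied from the request and the response above.
request_token = (
    "file|4445555453434845532052454943485347455345545a424c415454"
    "20333954322030313320303237392e4a5047|77119651"
)
response_token = (
    "file|4445555453434845532052454943485347455345545a424c415454"
    "20333954322030323220303738302e4a5047|77119651"
)

for label, token in (("request", request_token), ("response", response_token)):
    print(label, decode_gcmcontinue(token))
```

Printing the decoded tuples side by side makes it easier to see which part of the token changes between requests.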
The issue exists both with and without extmetadata:
{
"action": "query",
"format": "json",
"prop": "imageinfo",
"generator": "categorymembers",
"iiprop": "extmetadata|url",
"gcmtitle": "Category:Nazi_symbols_status",
"gcmprop": "ids|title|sortkey",
"gcmtype": "file",
"gcmcontinue": "file|4445555453434845532052454943485347455345545a424c41545420333954322030313320303237392e4a5047|77119651",
"gcmlimit": "500"
}
Python example
import sys
import json
import requests

# Enter the number for continue (the last digits after the last "|")
# here as the contval value ...
contval = ""
# ... or as a command-line argument, e.g. "python circular_api.py 77119651"

url = 'https://commons.wikimedia.org/w/api.php'
title = "Category:Nazi_symbols_status"
# Fixed part of the continue value; the number is appended after the last "|"
cont_prefix = "file|4445555453434845532052454943485347455345545a424c41545420333954322030313320303237392e4a5047|"

params = dict(
    action='query',
    format="json",
    # prop="imageinfo",
    generator="categorymembers",
    # iiprop="extmetadata|url",
    gcmtitle=title,
    gcmprop="ids|title|sortkey",
    gcmtype="file",
    gcmcontinue=cont_prefix + "77119651",
    gcmlimit="500",
)

# A command-line argument overrides the default number; contval overrides both
if len(sys.argv) > 1:
    params["gcmcontinue"] = cont_prefix + sys.argv[1]
if contval:
    params["gcmcontinue"] = cont_prefix + str(contval)

print("Used Parameters")
print(params)

resp = requests.get(url=url, params=params)
parsed = resp.json()

print("############################################")
print("Response:")
print("############################################")
# Print only the first 1000 characters, enough to see the "continue" block
print(json.dumps(parsed, indent=4, sort_keys=True)[:1000])
Run in Google Colab
I now find that this returned a different value in Google Colab.
But on my server, my home computer, and the API sandbox, the issue persists.
What happens?:
Instead of a new value for gcmcontinue, the API returns the same value that was specified in the API call.
What should have happened instead?:
The API should return a different value for gcmcontinue, so that iteration through the generator can proceed.
Software version (if not a Wikimedia wiki), browser information, screenshots, other information, etc.:
'https://commons.wikimedia.org/w/api.php'
server output
local execution
My original code iterated through the responses without problems for a while before the loop occurred.
The loop also occurred with a smaller batch size (I tried 50).
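For anyone trying to reproduce the multi-request case, here is a rough sketch of how the loop could be detected while iterating. It is not my original code. The names `fetch_json` and `find_continue_loop` and the User-Agent string are hypothetical, and it uses only the standard library instead of requests so it can be run without extra packages.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://commons.wikimedia.org/w/api.php"


def fetch_json(params):
    """Send one GET request to the API and return the parsed JSON."""
    query = urllib.parse.urlencode(params)
    req = urllib.request.Request(
        API_URL + "?" + query,
        # Placeholder User-Agent (assumption); replace with your own.
        headers={"User-Agent": "continue-loop-check/0.1 (example)"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def find_continue_loop(params, fetch=fetch_json, max_requests=1000):
    """Follow 'continue' until it ends or a gcmcontinue value repeats.

    Returns the repeated gcmcontinue value, or None if the iteration
    finished (or max_requests was reached) without a repeat.
    """
    request = dict(params)
    seen = set()
    # The value sent in the first call also counts as "already seen".
    if "gcmcontinue" in request:
        seen.add(request["gcmcontinue"])
    for _ in range(max_requests):
        data = fetch(request)
        cont = data.get("continue")
        if cont is None:
            return None  # no more results, no loop found
        token = cont.get("gcmcontinue")
        if token in seen:
            return token  # same continue value as in an earlier query
        seen.add(token)
        request.update(cont)  # merge all continue parameters into the next call
    return None
```

Calling `find_continue_loop(params)` with the request parameters shown above should return the repeated gcmcontinue value if the loop occurs, and None otherwise. The `fetch` argument exists only so the function can be tried with canned responses instead of live requests.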