Use response.iter_content
Closed, Resolved · Public


Pywikibot is a Python-based framework for writing bots for MediaWiki.

Thanks to work in Google Code-in, Pywikibot now has a script that downloads a Wikimedia database dump and places it in a predictable directory for semi-automated use by other scripts and tests.

As @zhuyifei1999 wrote, the script should use response.iter_content instead of response.raw. It should also pass stream=True when fetching the content, so that the body is not loaded into memory all at once.
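The requested change can be sketched roughly as follows. This is a minimal illustration written directly against the requests library, not the script's actual code: the function names, the 1 MiB default chunk size, and the split into two helpers are all assumptions made for the example. Unlike reading response.raw, iter_content applies any gzip or deflate content decoding before yielding each chunk.

```python
def write_chunks(response, target_path, chunk_size=1024 * 1024):
    """Write a streamed response body to target_path in chunks.

    `response` is anything with a requests-style iter_content() method.
    The 1 MiB default is an arbitrary choice for this sketch.
    Returns the number of bytes written.
    """
    written = 0
    with open(target_path, "wb") as target:
        # Each chunk is at most chunk_size bytes, so memory use stays bounded.
        for chunk in response.iter_content(chunk_size=chunk_size):
            target.write(chunk)
            written += len(chunk)
    return written


def download(url, target_path, chunk_size=1024 * 1024):
    """Fetch url with stream=True and save it to target_path (hypothetical helper)."""
    # Third-party dependency, imported here so write_chunks() can be used
    # and tested without it.
    import requests

    # stream=True defers downloading the body until it is iterated.
    with requests.get(url, stream=True) as response:
        response.raise_for_status()
        return write_chunks(response, target_path, chunk_size)
```

Keeping the write loop separate from the HTTP call makes it possible to exercise the loop with a stand-in response object, without network access.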


You are expected to provide a patch in Wikimedia Gerrit, which requires setting up Git and Gerrit first.

Event Timeline

Framawiki triaged this task as Medium priority. · Dec 24 2017, 4:48 PM
Framawiki created this task.
Restricted Application added a subscriber: pywikibot-bugs-list. · Dec 24 2017, 4:58 PM
Aklapper updated the task description. · Dec 24 2017, 6:41 PM
eflyjason updated the task description. · Dec 25 2017, 1:04 AM
rafidaslam added a subscriber: rafidaslam.

Claimed on GCI

Change 400205 had a related patch set uploaded (by Rafidaslam; owner: rafid):
[pywikibot/core@master] download_dump: Use response.iter_content

I've submitted the patch; suggestions are welcome. I'm a bit unsure about the chunk size, though. We can make it a constant for convenience, I think.

"We can make it a constant for convenience I think"

Yeah, it doesn't matter in most cases. For file copying or moving, it's usually set to the block size of the filesystem, but for downloading I don't know of a convention, as long as it's not too small (say, smaller than a KiB) or too large (say, hundreds of MiB).
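To make the rule of thumb concrete, here is a small sketch of what a chunk-size constant with sanity bounds might look like. All names and values below are hypothetical: the 1 KiB and 100 MiB limits are just one reading of "too small" and "too large" from the comment above, and the 1 MiB default is not the value used in the patch.

```python
import os

# Hypothetical bounds and default, chosen only for illustration.
MIN_CHUNK_SIZE = 1024                # 1 KiB
MAX_CHUNK_SIZE = 100 * 1024 * 1024   # 100 MiB
DOWNLOAD_CHUNK_SIZE = 1024 * 1024    # 1 MiB


def is_reasonable_chunk_size(size):
    """Return True if size falls inside the illustrative bounds."""
    return MIN_CHUNK_SIZE <= size <= MAX_CHUNK_SIZE


def filesystem_block_size(path="."):
    """Return the preferred I/O block size for path, as used for file copies.

    os.stat() exposes st_blksize only on some platforms, so fall back to
    4096 bytes (a common but assumed default) when it is missing or zero.
    """
    return getattr(os.stat(path), "st_blksize", 0) or 4096
```

A module-level constant like this keeps the choice in one place, so it can be tuned later without touching the download loop.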

eflyjason closed this task as Resolved. · Dec 27 2017, 12:47 AM

Change 400205 merged by jenkins-bot:
[pywikibot/core@master] download_dump: Use response.iter_content