
use one library for all http requests
Open, High, Public

Description

requests has been chosen as the http library for pywikibot v3.0 master.

There are a few cases of urllib.urlopen (and others) being used in the pywikibot library code, and a number of scripts use other http request routines.
Multiple routines result in multiple sets of configuration (e.g. proxy) and multiple sets of possible bugs/errors.

All http activity should be provided by utility methods in pywikibot.comms.http, so it is easy to test and support them, and possibly use a different http library in the future if necessary.
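
A minimal sketch of that idea, not pywikibot's actual code: the names `fetch`, `Response`, `DEFAULT_HEADERS` and the `transport` parameter are hypothetical, chosen only for illustration. Callers depend on one entry point, and the HTTP library (requests, assumed to be installed) is named in exactly one place.

```python
from collections import namedtuple

# Hypothetical, simplified response type handed back to every caller.
Response = namedtuple('Response', ['status', 'headers', 'content'])

# Settings shared by every request; the value is illustrative only.
DEFAULT_HEADERS = {'User-Agent': 'example-bot/0.1'}


def _requests_transport(method, url, headers, **kwargs):
    """Perform the request with the requests library (imported lazily)."""
    import requests  # the only place the HTTP library is named
    r = requests.request(method, url, headers=headers, **kwargs)
    return Response(r.status_code, dict(r.headers), r.content)


def fetch(url, method='GET', headers=None, transport=_requests_transport,
          **kwargs):
    """Single entry point for HTTP requests.

    Swapping the HTTP library means replacing the transport function;
    callers of fetch() do not change.
    """
    merged = dict(DEFAULT_HEADERS)
    merged.update(headers or {})
    return transport(method, url, merged, **kwargs)
```

Because the transport is a parameter, tests can pass a fake transport and check the merged headers without touching the network.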

See Also: T71204

Details

Reference
bz66102

Event Timeline

bzimport raised the priority of this task from to Needs Triage.
bzimport set Reference to bz66102.
jayvdb created this task. Jun 4 2014, 12:00 AM
jayvdb added a comment. Jul 1 2014, 2:33 AM

There may be issues with using httplib2 for large downloads, such as those that are possible in upload.py.

https://github.com/jcgregorio/httplib2/issues/224

A fork has been created to address that, and to add distributed caching.

https://github.com/madlag/streaming_httplib2
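
For illustration, the large-download concern comes down to reading the body in chunks instead of holding it all in memory. The sketch below is an assumption about how that could look with requests (using its `stream=True` option and `iter_content`), not code from any of the projects linked above; `save_stream` and `download` are hypothetical names.

```python
def save_stream(chunks, fileobj):
    """Write an iterable of byte chunks to fileobj; return bytes written."""
    total = 0
    for chunk in chunks:
        if chunk:  # skip empty chunks
            fileobj.write(chunk)
            total += len(chunk)
    return total


def download(url, path, chunk_size=64 * 1024):
    """Stream url to path without loading the whole body into memory."""
    import requests  # assumed to be installed
    with requests.get(url, stream=True) as response:
        response.raise_for_status()
        with open(path, 'wb') as f:
            return save_stream(
                response.iter_content(chunk_size=chunk_size), f)
```

Memory use is bounded by `chunk_size` rather than by the size of the file.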

jayvdb added a comment. Aug 6 2014, 9:07 PM

site.py and weblib.py use 'import urllib', but only for urlencode.

urllib:
pywikibot/page.py:1841: f = urllib.urlopen(self.fileUrl())
pywikibot/version.py:199: buf = urllib.urlopen(url).readlines()

scripts/upload.py
scripts/flickrripper.py
scripts/checkimages.py
scripts/weblinkchecker.py
scripts/imagerecat.py
scripts/maintenance/wikimedia_sites.py
scripts/data_ingestion.py

urllib2:
scripts/reflinks.py

httplib (not httplib2):
pywikibot/version.py:123: conn = httplib.HTTPSConnection('github.com')

scripts/weblinkchecker.py
scripts/reflinks.py
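
An audit like the one above can be repeated with a small scanner. This is a rough sketch, not the method used to produce the list; `find_legacy_imports` and `LEGACY_MODULES` are hypothetical names.

```python
import re

# Modules whose direct use the task wants to eliminate.
LEGACY_MODULES = ('urllib', 'urllib2', 'httplib')

# Matches "import X" / "from X ..." where X is exactly one of the legacy
# modules; the lookahead stops "httplib" from matching "httplib2".
_IMPORT_RE = re.compile(
    r'^\s*(?:import|from)\s+(' + '|'.join(LEGACY_MODULES) + r')(?=[\s.,]|$)')


def find_legacy_imports(source):
    """Return (line_number, module) pairs for legacy HTTP imports."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = _IMPORT_RE.match(line)
        if match:
            hits.append((number, match.group(1)))
    return hits
```

The scanner only sees a legacy module named first on an import line, so `import os, urllib` is missed, and it also flags files that import urllib only for urlencode; each hit still needs a manual look.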

Change 152200 had a related patch set uploaded by John Vandenberg:
HTTP requests with user-agent without version

https://gerrit.wikimedia.org/r/152200

Change 153300 had a related patch set uploaded by John Vandenberg:
Replace httplib and urllib with httplib2

https://gerrit.wikimedia.org/r/153300

Change 152200 merged by jenkins-bot:
User-agent graceful degradation

https://gerrit.wikimedia.org/r/152200

Change 153300 merged by jenkins-bot:
Replace httplib and urllib with httplib2

https://gerrit.wikimedia.org/r/153300

version.py now uses httplib2.

In addition to the list above, generate_family_file.py also uses urllib2.

jayvdb added a comment. Nov 9 2014, 2:47 AM

https://github.com/ross/python-asynchttp might be a good solution, but it doesn't appear to be very active.

Another httplib2 fork, which says it provides streaming: https://github.com/fffonion/httplib2-plus

Also, we have a patch to switch to python-requests: https://gerrit.wikimedia.org/r/#/c/189821/

Change 208479 had a related patch set uploaded (by XZise):
[IMPROV] Replace openurl with http.fetch

https://gerrit.wikimedia.org/r/208479

Change 208479 merged by jenkins-bot:
[IMPROV] Replace openurl with http.fetch

https://gerrit.wikimedia.org/r/208479

jayvdb updated the task description. Jan 10 2016, 10:48 AM
jayvdb set Security to None.
Restricted Application added subscribers: StudiesWorld, Aklapper. Jan 10 2016, 10:48 AM
jayvdb triaged this task as High priority. Jan 10 2016, 10:48 AM
jayvdb removed a project: Patch-For-Review.

Change 281131 had a related patch set uploaded (by Xqt):
[bugfix] bugfixes and improvements for checkimages

https://gerrit.wikimedia.org/r/281131

Change 281673 had a related patch set uploaded (by Xqt):
[bugfix] bugfixes and improvements for checkimages

https://gerrit.wikimedia.org/r/281673

jayvdb added a comment. Jan 7 2018, 4:10 AM

Two new possible GCI tasks: T184360 and T184361. More analysis is needed, and the descriptions need expanding, to ensure that the work is of high quality.

Xqt updated the task description. May 6 2018, 8:23 AM
D3r1ck01 moved this task from Backlog to Needs Review on the Pywikibot board. Nov 5 2018, 11:38 AM
Xqt moved this task from Needs Review to Backlog on the Pywikibot board. Feb 3 2019, 12:27 PM
Xqt raised the priority of this task from High to Needs Triage.
Xqt triaged this task as High priority. Feb 4 2019, 5:27 AM