Pywikibot reflinks.py doesn't work on Python 3 because of Unicode decoding error
Closed, Resolved, Public

Description

reflinks.py does not work on Python 3

$ python3 pwb.py reflinks.py -family:wikipedia -lang:en -page:User:John_Vandenberg/test_T118674

Retrieving 1 pages from wikipedia:ru.
Traceback (most recent call last):
  File "pwb.py", line 248, in <module>
    if not main():
  File "pwb.py", line 242, in main
    run_python_file(filename, [filename] + args, argvu, file_package)
  File "pwb.py", line 120, in run_python_file
    main_mod.__dict__)
  File ".\scripts\reflinks.py", line 824, in <module>
    main()
  File ".\scripts\reflinks.py", line 820, in main
    bot.run()
  File ".\scripts\reflinks.py", line 525, in run
    f = urlopen(ref.url.decode("utf8"))
AttributeError: 'str' object has no attribute 'decode'
<class 'AttributeError'>
CRITICAL: Closing network session.

Original test case: https://ru.wikipedia.org/w/index.php?title=user:MaxBioHazard/test&oldid=74529895
Confirmed on Python 3.5 and 3.4
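
The failure in the traceback can be reproduced in isolation. The snippet below is only a minimal sketch, assuming Python 3; the URL and the helper name `can_decode` are made up for illustration and do not come from reflinks.py. It shows that a Python 3 `str` has no `decode` method, while the equivalent `bytes` object does.

```python
# Minimal sketch (Python 3). The URL and can_decode() are hypothetical,
# used only to illustrate the AttributeError from the traceback.

def can_decode(value):
    """Return True if the object offers a .decode() method."""
    return hasattr(value, "decode")

url_text = "http://example.org/wiki/Тест"   # already text (str) on Python 3
url_bytes = url_text.encode("utf8")          # raw bytes, as a Python 2 str would be

print(can_decode(url_text))    # str: no .decode() on Python 3
print(can_decode(url_bytes))   # bytes: .decode() is available

# Decoding the bytes gives back the original text.
assert url_bytes.decode("utf8") == url_text
```

Calling `url_text.decode("utf8")` directly raises the same `AttributeError` as the call on line 525 of the script.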

Event Timeline

MBH raised the priority of this task from to Needs Triage.
MBH updated the task description.
MBH added a project: Pywikibot.
MBH subscribed.

I'm unsure why RefLink.url needed decoding. At least in core, the page text returned from Page.get is unicode, and textlib.removeDisabledParts should also return unicode. We probably need to look at the script in core around that date to find out why the page text and URLs were raw bytes.

IMO RefLink.url should always be unicode in Python 2 and str in Python 3, and this task should add tests for the RefLink class.
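
One way to read that suggestion is sketched below. This is not the patch that was merged, and `ensure_text` is a hypothetical helper, not part of Pywikibot or the RefLink class. It assumes the only two input types are raw bytes and text, decodes only the former, and includes the kind of assertions a test for the URL type could make.

```python
# Hedged sketch: ensure_text() is a hypothetical helper, not Pywikibot API.
# It returns text (unicode on Python 2, str on Python 3) for either input type.

def ensure_text(value, encoding="utf8"):
    """Decode raw bytes to text; return text unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Example checks a test for the URL type could make (URLs are made up).
text_url = u"http://example.org/wiki/Тест"
bytes_url = text_url.encode("utf8")

assert ensure_text(bytes_url) == text_url    # bytes are decoded
assert ensure_text(text_url) == text_url     # text passes through untouched
assert not isinstance(ensure_text(bytes_url), bytes)
```

With a normalisation step like this at the point where the URL is stored, the caller would no longer need to call `.decode("utf8")` before opening the URL.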

jayvdb renamed this task from Pywikibot reflinks.py doesn't work cause of Unicode decoding error to Pywikibot reflinks.py doesn't work on Python 3 because of Unicode decoding error. Jan 19 2016, 2:49 AM
jayvdb updated the task description.

Change 264251 had a related patch set uploaded (by MtDu):
Set user-agent and convert reflinks.py to use requests module

https://gerrit.wikimedia.org/r/264251

Change 264251 merged by jenkins-bot:
Set user-agent and convert reflinks.py to use requests

https://gerrit.wikimedia.org/r/264251

jayvdb assigned this task to MtDu.