
Pywikibot reflinks.py doesn't work on Python 3 because of Unicode decoding error
Closed, Resolved · Public

Description

reflinks.py does not work on Python 3

$ python3 pwb.py reflinks.py -family:wikipedia -lang:en -page:User:John_Vandenberg/test_T118674

Retrieving 1 pages from wikipedia:ru.
Traceback (most recent call last):
  File "pwb.py", line 248, in <module>
    if not main():
  File "pwb.py", line 242, in main
    run_python_file(filename, [filename] + args, argvu, file_package)
  File "pwb.py", line 120, in run_python_file
    main_mod.__dict__)
  File ".\scripts\reflinks.py", line 824, in <module>
    main()
  File ".\scripts\reflinks.py", line 820, in main
    bot.run()
  File ".\scripts\reflinks.py", line 525, in run
    f = urlopen(ref.url.decode("utf8"))
AttributeError: 'str' object has no attribute 'decode'
<class 'AttributeError'>
CRITICAL: Closing network session.

Original test case: https://ru.wikipedia.org/w/index.php?title=user:MaxBioHazard/test&oldid=74529895
Confirmed on Python 3.5 and 3.4

Event Timeline

MBH created this task. Nov 15 2015, 4:46 AM
MBH set the priority of this task to Needs Triage.
MBH updated the task description.
MBH added a project: Pywikibot.
MBH added a subscriber: MBH.
Restricted Application added a subscriber: Base. Jan 6 2016, 12:58 PM
jayvdb added a project: good first bug.
jayvdb added a subscriber: jayvdb.
Nemo_bis set Security to None.

This is another bug that would be fixed by T111300: Convert reflinks to requests.

jayvdb added a comment. Edited Jan 11 2016, 12:15 AM

The decode was added in compat e3537f4a to resolve https://sourceforge.net/p/pywikipediabot/bugs/1268/

I'm unsure why RefLink.url needed decoding. At least in core, the page text returned from Page.get is unicode, and textlib.removeDisabledParts should also return unicode. We probably need to look at the script in core around that date to find out why the page text / URLs were raw bytes.

IMO RefLink.url should always be a unicode in Python 2 and a str in Python 3, and this task should add tests for the RefLink class.
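To illustrate the point above, here is a minimal sketch written for Python 3. The helper name `ensure_text`, the test class, and the example.org URL are hypothetical and are not part of Pywikibot; the fix that was eventually merged converted the script to requests instead. The sketch decodes only when the value is actually bytes, and its last test reproduces the AttributeError from the traceback in the description.

```python
import unittest


def ensure_text(url, encoding="utf-8"):
    """Return url as text, decoding only when it is raw bytes.

    Hypothetical helper for illustration; not a Pywikibot API.
    """
    if isinstance(url, bytes):
        return url.decode(encoding)
    return url


class TestEnsureText(unittest.TestCase):
    """Sketch of the kind of tests the comment above asks for."""

    def test_str_passes_through(self):
        url = "https://example.org/страница"
        # A str is returned unchanged (the very same object).
        self.assertIs(ensure_text(url), url)

    def test_bytes_are_decoded(self):
        raw = "https://example.org/страница".encode("utf-8")
        self.assertEqual(ensure_text(raw), "https://example.org/страница")

    def test_unconditional_decode_fails_on_python3(self):
        # Python 3 str has no .decode(), which is the failure reported here.
        with self.assertRaises(AttributeError):
            "https://example.org/".decode("utf8")
```

Saved to a file, the tests can be run with `python3 -m unittest <module name>`. On Python 2, `bytes` is an alias of `str`, so the same check would need adjusting there.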

jayvdb renamed this task from Pywikibot reflinks.py doesn't work cause of Unicode decoding error to Pywikibot reflinks.py doesn't work on Python 3 because of Unicode decoding error. Jan 19 2016, 2:49 AM
jayvdb updated the task description.
jayvdb moved this task from Backlog to references on the Pywikibot-Scripts board. Jan 19 2016, 2:53 AM

Change 264251 had a related patch set uploaded (by MtDu):
Set user-agent and convert reflinks.py to use requests module

https://gerrit.wikimedia.org/r/264251

Change 264251 merged by jenkins-bot:
Set user-agent and convert reflinks.py to use requests

https://gerrit.wikimedia.org/r/264251

jayvdb closed this task as Resolved. Jan 19 2016, 5:08 AM
jayvdb assigned this task to MtDu.