Page's repr returns invalid data causing Python to error
Closed, Duplicate · Public

Description

In Python 2 it's not possible to interpolate the representation of a list containing a page whose title has non-ASCII characters into a Unicode string:

>>> import pywikibot
>>> p = pywikibot.Page(pywikibot.Site(), u'öäöä')
>>> p.title()
u'\xf6\xe4\xf6\xe4'
>>> '%r' % ([p],)
'[Page(\xc3\xb6\xc3\xa4\xc3\xb6\xc3\xa4)]'
>>> u'%r' % ([p],)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 6: ordinal not in range(128)
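The traceback above is consistent with how Python 2 mixes the two string types: when a byte str is interpolated into a unicode format string, it is implicitly decoded with the default encoding, which is normally ASCII. The following is a minimal sketch that performs that decode step explicitly, so it runs on Python 2 and 3 alike. The helper name ascii_decode_failure is made up for this illustration and is not part of pywikibot.

```python
def ascii_decode_failure(raw):
    """Return (position, reason) if raw is not pure ASCII, else None."""
    try:
        raw.decode('ascii')
    except UnicodeDecodeError as error:
        return error.start, error.reason
    return None


# The byte string shown by '%r' % ([p],) in the session above.
raw = b'[Page(\xc3\xb6\xc3\xa4\xc3\xb6\xc3\xa4)]'

# Reports where the ASCII decode gives up and why.
print(ascii_decode_failure(raw))

# The bytes are valid UTF-8, so decoding them as UTF-8 recovers the title.
print(raw.decode('utf-8') == u'[Page(\xf6\xe4\xf6\xe4)]')
```

The first six bytes, `[Page(`, are plain ASCII, so the decode fails at the first byte of the UTF-8 encoded title.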

This is with 41d4254c, the commit before unicode_literals was merged. With unicode_literals, repr does not seem to work at all:

>>> u'%r' % ([p],)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 5-12: ordinal not in range(128)
>>> '%r' % ([p],)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 5-12: ordinal not in range(128)
>>> repr(p)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 5-12: ordinal not in range(128)
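The UnicodeEncodeError tracebacks above are what Python 2 produces when `__repr__` returns a unicode object containing non-ASCII characters, because repr() then tries to convert the result to a byte str with the default ASCII codec. One possible way around this, sketched below, is to escape non-ASCII characters so that the repr is always pure ASCII. This is only an illustration under that assumption: DemoPage is a hypothetical stand-in class, not pywikibot.Page, and this is not necessarily the fix adopted in the patches linked from this task.

```python
import sys


class DemoPage(object):
    """Hypothetical stand-in for a page object; not pywikibot.Page."""

    def __init__(self, title):
        self._title = title  # expected to be a unicode string

    def title(self):
        return self._title

    def __repr__(self):
        text = u'%s(%s)' % (self.__class__.__name__, self._title)
        # Replace every non-ASCII character with a backslash escape,
        # so the result contains ASCII bytes only.
        escaped = text.encode('ascii', 'backslashreplace')
        if sys.version_info[0] >= 3:
            # Python 3 requires __repr__ to return str (text).
            return escaped.decode('ascii')
        # Python 2 requires __repr__ to return str (bytes).
        return escaped


page = DemoPage(u'\xf6\xe4\xf6\xe4')
print(repr(page))
print(u'%r' % ([page],))
print('%r' % ([page],))
```

Because the returned value is ASCII-only, it can be interpolated into both byte and Unicode format strings without an implicit encode or decode failing. The trade-off is that the title appears escaped rather than in readable form.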

Event Timeline

XZise raised the priority of this task from to Needs Triage.
XZise updated the task description.
XZise added a project: Pywikibot.
XZise added a subscriber: XZise.
Restricted Application added subscribers: Aklapper, Unknown Object (MLST). · Apr 11 2015, 11:27 AM

This bug is essentially the same as T66958.

Xqt triaged this task as High priority. · Jun 23 2015, 4:32 AM
Xqt added a subscriber: Xqt.

Change 219618 had a related patch set uploaded (by XZise):
[bugfix] Workaround UnicodeDecodeError on api error

https://gerrit.wikimedia.org/r/219618

Change 220613 had a related patch set uploaded (by XZise):
[FEAT] page_tests: Page repr encoding test

https://gerrit.wikimedia.org/r/220613