
UnicodeDecodeError in encode_url
Closed, Resolved (Public)


A UnicodeDecodeError is raised when saving a page whose text contains a non-ASCII character, e.g. an em dash (\u2014):

Exception in thread Put-Thread:
Traceback (most recent call last):
  File "/usr/lib/python2.7/", line 810, in __bootstrap_inner
  File "/usr/lib/python2.7/", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/home/user/python/core/pywikibot/", line 745, in async_manager
    request(*args, **kwargs)
  File "/home/user/python/core/pywikibot/", line 1139, in _save
    watch=watch, bot=botflag, **kwargs)
  File "/home/user/python/core/pywikibot/", line 1297, in callee
    return fn(self, *args, **kwargs)
  File "/home/user/python/core/pywikibot/", line 4745, in editpage
    result = req.submit()
  File "/home/user/python/core/pywikibot/data/", line 1913, in submit
    paramstring = self._http_param_string()
  File "/home/user/python/core/pywikibot/data/", line 1767, in _http_param_string
    return encode_url(self._encoded_items())
  File "/home/user/python/core/pywikibot/data/", line 3050, in encode_url
    query = [(pair[0], pair[1].encode('utf-8')) for pair in query]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 3091: ordinal not in range(128)

Event Timeline

Mpaa raised the priority of this task to Needs Triage.
Mpaa updated the task description.
Mpaa added a project: Pywikibot.
Mpaa added a subscriber: Mpaa.

Change 258685 had a related patch set uploaded (by Mpaa): fix UnicodeError in url_encode()

XZise set Security to None.

Another bug caused by T85321. Sorry ;-(

The problem is that Request has already encoded the parameters to bytes using the site encoding, whereas BaseSite.urlEncode hasn't, so encode_url receives bytes from one caller and unicode from the other.
We could add encoding to BaseSite.urlEncode (and remove it from encode_url, as @Mpaa has done).
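For context on the traceback: in Python 2, calling .encode('utf-8') on a value that is already a byte string first decodes it implicitly with the ascii codec, which is why the failure surfaces as a UnicodeDecodeError on byte 0xe2 (the first byte of the UTF-8 form of \u2014). Below is a minimal sketch of a type guard that tolerates both kinds of input. The helper name encode_pairs is hypothetical and this is not the code from the merged patch; it only illustrates the idea of encoding text values and passing bytes through unchanged.

```python
def encode_pairs(query, encoding='utf-8'):
    """Return (key, value) pairs whose values are all bytes.

    Hypothetical helper for illustration, not the pywikibot code.
    Values that are already bytes are passed through untouched, so
    no implicit ASCII decode can happen on them.
    """
    encoded = []
    for key, value in query:
        if isinstance(value, bytes):
            # Already encoded by the caller (as Request does).
            encoded.append((key, value))
        else:
            # Still text (as with BaseSite.urlEncode): encode it here.
            encoded.append((key, value.encode(encoding)))
    return encoded


# Mixed input: one value pre-encoded, the others still text.
pairs = [
    ('title', u'Sandbox'),
    ('text', u'a \u2014 b'.encode('utf-8')),
    ('summary', u'a \u2014 b'),
]
result = encode_pairs(pairs)
```

A guard like this hides the inconsistency rather than removing it. Encoding in exactly one place, as suggested above, is arguably the cleaner fix.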

I have two other possible approaches to 'fix' it.

Change 258685 merged by jenkins-bot: fix UnicodeError in url_encode()