UnicodeDecodeError in encode_url
Closed, Resolved, Public

Description

A UnicodeDecodeError is raised when saving a page whose text contains a non-ASCII character, e.g. an em dash (\u2014):

Exception in thread Put-Thread:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/home/user/python/core/pywikibot/__init__.py", line 745, in async_manager
    request(*args, **kwargs)
  File "/home/user/python/core/pywikibot/page.py", line 1139, in _save
    watch=watch, bot=botflag, **kwargs)
  File "/home/user/python/core/pywikibot/site.py", line 1297, in callee
    return fn(self, *args, **kwargs)
  File "/home/user/python/core/pywikibot/site.py", line 4745, in editpage
    result = req.submit()
  File "/home/user/python/core/pywikibot/data/api.py", line 1913, in submit
    paramstring = self._http_param_string()
  File "/home/user/python/core/pywikibot/data/api.py", line 1767, in _http_param_string
    return encode_url(self._encoded_items())
  File "/home/user/python/core/pywikibot/data/api.py", line 3050, in encode_url
    query = [(pair[0], pair[1].encode('utf-8')) for pair in query]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 3091: ordinal not in range(128)
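
The last frame calls .encode('utf-8') on a value that, judging by the error, is already a byte string. In Python 2, encoding a byte string first decodes it implicitly with the ASCII codec, which is why a *decode* error appears. A minimal sketch of that mechanism follows; py2_style_encode is a hypothetical helper written only to make the implicit step visible (it also runs on Python 3), and it is not code from pywikibot:

```python
def py2_style_encode(value, encoding='utf-8'):
    """Mimic Python 2's str.encode(): bytes are decoded as ASCII first."""
    if isinstance(value, bytes):
        # This is the implicit step Python 2 performs; it fails on non-ASCII bytes.
        value = value.decode('ascii')
    return value.encode(encoding)


# Text containing an em dash, already encoded to UTF-8 bytes (0xe2 0x80 0x94).
already_encoded = u'a \u2014 b'.encode('utf-8')

try:
    py2_style_encode(already_encoded)
except UnicodeDecodeError as error:
    # Same kind of failure as in the traceback above.
    message = str(error)
    print(message)
```

Passing unencoded text through the same helper works, which suggests the failure depends on the value having been encoded once already.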

Event Timeline

Mpaa raised the priority of this task to Needs Triage.
Mpaa updated the task description.
Mpaa added a project: Pywikibot.
Mpaa subscribed.

Change 258685 had a related patch set uploaded (by Mpaa):
api.py: fix UnicodeError in url_encode()

https://gerrit.wikimedia.org/r/258685

XZise set Security to None.

Another bug caused by T85321. Sorry ;-(

The problem is that Request has already encoded the parameters to bytes using the site encoding, whereas BaseSite.urlEncode hasn't.
We could add encoding to BaseSite.urlEncode (and remove it from encode_url, as @Mpaa has done).
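
For illustration only, one generic way to make a URL-encoding step tolerate both callers is to encode text values and pass bytes through unchanged. This is a sketch under assumptions, not the merged change and not either of the linked commits: encode_pairs is a hypothetical helper, UTF-8 stands in for the site encoding, and the example targets Python 3's urllib.parse.

```python
from urllib.parse import urlencode


def encode_pairs(query, encoding='utf-8'):
    """Hypothetical helper: encode text values, leave bytes values untouched."""
    encoded = []
    for key, value in query:
        if not isinstance(value, bytes):
            value = value.encode(encoding)
        encoded.append((key, value))
    return encoded


# One value is still text, the other was already encoded by the caller.
pairs = [
    ('title', u'Sandbox'),
    ('text', u'a \u2014 b'.encode('utf-8')),
]
param_string = urlencode(encode_pairs(pairs))
print(param_string)
```

The type check costs one isinstance call per value; the alternative described above, encoding in exactly one place, avoids the check but requires every caller to agree on who encodes.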

I have two other possible approaches to 'fix' it:

https://github.com/jayvdb/pywikibot-core/commit/f5a8d8d3
https://github.com/jayvdb/pywikibot-core/commit/fb1cc5ca0

Change 258685 merged by jenkins-bot:
api.py: fix UnicodeError in url_encode()

https://gerrit.wikimedia.org/r/258685