win32_unicode does not correctly handle different console configurations under Python 3
Closed, Resolved (Public)

Description

When running

py pwb.py listpages.py -cat:category > out.txt

I get

<Unicode redirected stdout>.write: TypeError('write() argument must be str, not bytes',)
Traceback (most recent call last):
  File "pwb.py", line 255, in <module>
    if not main():
  File "pwb.py", line 249, in main
    run_python_file(filename, [filename] + args, argvu, file_package)
  File "pwb.py", line 121, in run_python_file
    main_mod.__dict__)
  File ".\scripts\listpages.py", line 282, in <module>
    main()
  File ".\scripts\listpages.py", line 260, in main
    pywikibot.stdout(output_list[-1])
  File "C:\Users\User\Core\pywikibot\logging.py", line 146, in stdout
    logoutput(text, decoder, newline, STDOUT, **kwargs)
  File "C:\Users\User\Core\pywikibot\logging.py", line 107, in logoutput
    logger.log(_level, text, extra=context, **kwargs)
  File "C:\Python3\lib\logging\__init__.py", line 1345, in log
    self._log(level, msg, args, **kwargs)
  File "C:\Python3\lib\logging\__init__.py", line 1415, in _log
    self.handle(record)
  File "C:\Python3\lib\logging\__init__.py", line 1425, in handle
    self.callHandlers(record)
  File "C:\Python3\lib\logging\__init__.py", line 1487, in callHandlers
    hdlr.handle(record)
  File "C:\Python3\lib\logging\__init__.py", line 855, in handle
    self.emit(record)
  File "C:\Users\User\Core\pywikibot\userinterfaces\terminal_interface_base.py", line 526, in emit
    return self.UI.output(text, targetStream=self.stream)
  File "C:\Users\User\Core\pywikibot\userinterfaces\terminal_interface_base.py", line 245, in output
    self._print(text, targetStream)
  File "C:\Users\User\Core\pywikibot\userinterfaces\terminal_interface_base.py", line 184, in _print
    self._write(text, target_stream)
  File "C:\Users\User\Core\pywikibot\userinterfaces\terminal_interface_base.py", line 146, in _write
    target_stream.write(text)
  File "C:\Users\User\Core\pywikibot\userinterfaces\win32_unicode.py", line 126, in write
    self._stream.write(text)
TypeError: write() argument must be str, not bytes
CRITICAL: Closing network session.

Running on:

Pywikibot: [https] r-pywikibot-core.git (c3dc772, g7491, 2016/09/15, 08:55:43, n/a)
Release version: 3.0-dev
requests version: 2.11.1
  cacerts: C:\Python3\lib\site-packages\requests\cacert.pem
    certificate test: ok
Python: 3.5.2 (v3.5.2:4def2a2901a5, Jun 25 2016, 22:01:18) [MSC v.1900 32 bit (Intel)]

Event Timeline

At which site are you working (lang/family)? I can't reproduce it.

That particular run was on a third-party wiki. But I just now tested it on ['wikipedia']['cs'] and it gave the same error.

C:\core>py pwb.py listpages.py -cat:"Obce_v_departementu_Essonne" > out.txt
<Unicode redirected stdout>.write: TypeError('write() argument must be str, not bytes',)
Traceback (most recent call last):
  File "pwb.py", line 255, in <module>
    if not main():
  File "pwb.py", line 249, in main
    run_python_file(filename, [filename] + args, argvu, file_package)
  File "pwb.py", line 121, in run_python_file
    main_mod.__dict__)
  File ".\scripts\listpages.py", line 282, in <module>
    main()
  File ".\scripts\listpages.py", line 260, in main
    pywikibot.stdout(output_list[-1])
  File "C:\core\pywikibot\logging.py", line 146, in stdout
    logoutput(text, decoder, newline, STDOUT, **kwargs)
  File "C:\core\pywikibot\logging.py", line 107, in logoutput
    logger.log(_level, text, extra=context, **kwargs)
  File "C:\Python3\lib\logging\__init__.py", line 1345, in log
    self._log(level, msg, args, **kwargs)
  File "C:\Python3\lib\logging\__init__.py", line 1415, in _log
    self.handle(record)
  File "C:\Python3\lib\logging\__init__.py", line 1425, in handle
    self.callHandlers(record)
  File "C:\Python3\lib\logging\__init__.py", line 1487, in callHandlers
    hdlr.handle(record)
  File "C:\Python3\lib\logging\__init__.py", line 855, in handle
    self.emit(record)
  File "C:\core\pywikibot\userinterfaces\terminal_interface_base.py", line 526, in emit
    return self.UI.output(text, targetStream=self.stream)
  File "C:\core\pywikibot\userinterfaces\terminal_interface_base.py", line 245, in output
    self._print(text, targetStream)
  File "C:\core\pywikibot\userinterfaces\terminal_interface_base.py", line 184, in _print
    self._write(text, target_stream)
  File "C:\core\pywikibot\userinterfaces\terminal_interface_base.py", line 146, in _write
    target_stream.write(text)
  File "C:\core\pywikibot\userinterfaces\win32_unicode.py", line 126, in write
    self._stream.write(text)
TypeError: write() argument must be str, not bytes
CRITICAL: Closing network session.

Hm, works for me:

C:\pwb\GIT\core>pwb.py listpages -lang:cs -cat:Obce_v_departementu_Essonne > out.txt
196 page(s) found

C:\pwb\GIT\core>

Did you run it with Python 3? (I did, as stated in the description.) Besides that, I can't think of what the problem could be...

Change 314858 had a related patch set uploaded (by Merlijn van Deen):
win32_unicode: always use byte std streams

https://gerrit.wikimedia.org/r/314858

Xqt triaged this task as High priority.

Change 314858 merged by jenkins-bot:
win32_unicode: always use byte std streams

https://gerrit.wikimedia.org/r/314858

Bah. The issue is that there are too many situations that differ only subtly, and a fix for one of them breaks another.

  1. The user uses the standard Windows console. Here, we want to wrap stdout etc. and use the Unicode console output functions.
  2. The user redirects the output to a file. Here, we want to write to the bytes-based interface so that we can output UTF-8 (rather than the ANSI charset).
  3. The user uses a non-standard console (PyCharm, Spyder, IDLE?). Here we probably just want to push text to the output.
  4. stdout/stderr are some sort of non-IO buffer (e.g. in tox). Here we also just want to push text to the output.

What happened here was that 2) was incorrectly handled (we tried to write bytes to the text buffer). My original patch fixed 2), but broke 4) (where there is no bytes buffer). Fixing 4) now seems to have broken 3).
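The failure in 2) can be reproduced without Pywikibot. The following is only a minimal sketch: an `io.TextIOWrapper` over an `io.BytesIO` stands in for a redirected stdout (an assumption for illustration, not the actual win32_unicode code), and it shows that bytes have to go to the byte layer while the text layer only accepts str.

```python
import io

# Stand-in for a redirected stdout: a text layer on top of a byte buffer.
# newline="\n" disables newline translation so the result is the same on every OS.
raw = io.BytesIO()
text_stream = io.TextIOWrapper(raw, encoding="utf-8", newline="\n")

# Writing bytes to the text layer is what fails in Python 3.
try:
    text_stream.write("caf\u00e9\n".encode("utf-8"))
except TypeError as exc:
    error = exc
else:
    error = None

# Two variants that do work: str to the text layer, or bytes to the byte layer.
text_stream.write("text layer: caf\u00e9\n")
text_stream.flush()  # flush before touching the byte layer to keep the order
text_stream.buffer.write("byte layer: caf\u00e9\n".encode("utf-8"))
text_stream.buffer.flush()

written = raw.getvalue().decode("utf-8")
print(type(error).__name__)
```

A stream without a `buffer` attribute, as in 4), only offers the text variant.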

So we need to take a step back and decide what interface win32_unicode offers. This was undefined (and unclear) before; I explicitly defined it as 'always bytes' in my patch. However, the calling functions seem to assume that it should be str, both in Python 2 (i.e., bytes) and in Python 3 (i.e., text). So possibly the interface of win32_unicode should be 'whatever the underlying Python sys.stdout/sys.stderr is' -- bytes on py2, text on py3.
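As a rough illustration of a 'text on py3' interface, here is a hedged sketch of a wrapper whose callers always pass str. The class name `NativeStrStream` and its behaviour are hypothetical and are not taken from either patch; it uses the byte layer when one exists and otherwise passes text straight through.

```python
import io


class NativeStrStream:
    """Hypothetical wrapper: callers pass str, bytes stay an internal detail."""

    def __init__(self, stream, encoding="utf-8"):
        self._stream = stream
        self._encoding = encoding
        # Not every stream has a byte layer (io.StringIO, for example, has none).
        self._buffer = getattr(stream, "buffer", None)

    def write(self, text):
        if not isinstance(text, str):
            raise TypeError("write() expects str, got %s" % type(text).__name__)
        if self._buffer is not None:
            # Flush pending text first so the output order is preserved,
            # then bypass the stream's own (possibly ANSI) encoding.
            self._stream.flush()
            self._buffer.write(text.encode(self._encoding))
            self._buffer.flush()
        else:
            # No byte layer available: just push the text through.
            self._stream.write(text)
        return len(text)


# Redirect-like stream whose text layer could not encode U+010D itself.
raw = io.BytesIO()
redirected = NativeStrStream(io.TextIOWrapper(raw, encoding="cp1252"))
redirected.write("\u010d")

# Buffer-less stream, similar to what a test runner may install.
captured = io.StringIO()
NativeStrStream(captured).write("x\u010d")
```

The design choice here is that the caller never needs to know which of the four situations applies; whether that is the right trade-off for win32_unicode is exactly the open question above.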

valhallasw renamed this task from "TypeError when redirecting output of listpages.py to file" to "win32_unicode does not correctly handle different console configurations under Python 3". Oct 16 2016, 10:43 AM
valhallasw added a subscriber: Dalba.

IDLE on Python 3.5 reports:

Cannot find buffer interface for stdout, stderr or stdin.
Using (incorrect) text interfaces instead. This is likely to
break on non-ascii text.

(i.e. 'hey, I'm not patching your stdout and just passing it through without looking'), and this works without issues.

Xqt lowered the priority of this task from High to Low.

I don't think this is fully resolved yet.

Change 319080 had a related patch set uploaded (by Dalba):
win32_unicode.py: Do not encode text in python 3

https://gerrit.wikimedia.org/r/319080

For me, the current patch also fails when redirecting output if win-unicode-console (https://pypi.org/project/win_unicode_console/) is installed:


G:\PATH>pywikibot-core\pwb.py listpages -lang:cs -cat:Obce_v_departementu_Essonne > out.txt
C:\Users\USERNAME\AppData\Local\Programs\Python\Python35\lib\site-packages\win_unicode_console\__init__.py:31: RuntimeWarning: sys.stdin.encoding == 'utf-8', whereas sys.stdout.encoding == 'cp1252', readline hook consumer may assume they are the same
  readline_hook.enable(use_pyreadline=use_pyreadline)
Cannot find buffer interface for stdout, stderr or stdin.
Using (incorrect) text interfaces instead. This is likely to
break on non-ascii text.
<Unicode redirected stdout>.write: TypeError('write() argument must be str, not bytes',)
Traceback (most recent call last):
  File "G:\PATH\pywikibot-core\pwb.py", line 255, in <module>
    if not main():
  File "G:\PATH\pywikibot-core\pwb.py", line 249, in main
    run_python_file(filename, [filename] + args, argvu, file_package)
  File "G:\PATH\pywikibot-core\pwb.py", line 121, in run_python_file
    main_mod.__dict__)
  File "G:\PATH\pywikibot-core\scripts\listpages.py", line 282, in <module>
    main()
  File "G:\PATH\pywikibot-core\scripts\listpages.py", line 260, in main
    pywikibot.stdout(output_list[-1])
  File "G:\PATH\pywikibot-core\pywikibot\logging.py", line 146, in stdout
    logoutput(text, decoder, newline, STDOUT, **kwargs)
  File "G:\PATH\pywikibot-core\pywikibot\logging.py", line 107, in logoutput
    logger.log(_level, text, extra=context, **kwargs)
  File "C:\Users\USERNAME\AppData\Local\Programs\Python\Python35\lib\logging\__init__.py", line 1345, in log
    self._log(level, msg, args, **kwargs)
  File "C:\Users\USERNAME\AppData\Local\Programs\Python\Python35\lib\logging\__init__.py", line 1415, in _log
    self.handle(record)
  File "C:\Users\USERNAME\AppData\Local\Programs\Python\Python35\lib\logging\__init__.py", line 1425, in handle
    self.callHandlers(record)
  File "C:\Users\USERNAME\AppData\Local\Programs\Python\Python35\lib\logging\__init__.py", line 1487, in callHandlers
    hdlr.handle(record)
  File "C:\Users\USERNAME\AppData\Local\Programs\Python\Python35\lib\logging\__init__.py", line 855, in handle
    self.emit(record)
  File "G:\PATH\pywikibot-core\pywikibot\userinterfaces\terminal_interface_base.py", line 526, in emit
    return self.UI.output(text, targetStream=self.stream)
  File "G:\PATH\pywikibot-core\pywikibot\userinterfaces\terminal_interface_base.py", line 245, in output
    self._print(text, targetStream)
  File "G:\PATH\pywikibot-core\pywikibot\userinterfaces\terminal_interface_base.py", line 184, in _print
    self._write(text, target_stream)
  File "G:\PATH\pywikibot-core\pywikibot\userinterfaces\terminal_interface_base.py", line 146, in _write
    target_stream.write(text)
  File "G:\PATH\pywikibot-core\pywikibot\userinterfaces\win32_unicode.py", line 150, in write
    self._stream.write(text)
TypeError: write() argument must be str, not bytes
CRITICAL: Closing network session.

Therefore I guess that it also won't work on Python 3.6, where PEP 528 has been accepted and implements essentially the same modifications as win_unicode_console.
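To compare setups (plain console, redirected output, win_unicode_console enabled, Python 3.6), it may help to print what each standard stream actually looks like. This is only a diagnostic sketch; the helper name `describe_stream` is made up for illustration and is not part of Pywikibot.

```python
import io
import sys


def describe_stream(stream):
    """Return a few properties that distinguish the console configurations."""
    isatty = getattr(stream, "isatty", None)
    try:
        tty = bool(isatty()) if callable(isatty) else False
    except ValueError:
        # isatty() raises ValueError on a closed stream.
        tty = False
    return {
        "type": type(stream).__name__,
        "encoding": getattr(stream, "encoding", None),
        "has_buffer": hasattr(stream, "buffer"),
        "isatty": tty,
    }


for name in ("stdin", "stdout", "stderr"):
    print(name, describe_stream(getattr(sys, name)))
```

Running this once normally and once with `> out.txt` shows whether the encodings of stdin and stdout differ, which is what the RuntimeWarning above complains about.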

I've uploaded a new patch. To summarize:

win_unicode_console   patch      result
disabled              old*       TypeError: write() argument must be str, not bytes
disabled              current*   OK
enabled               old        TypeError: write() argument must be str, not bytes
enabled               current    TypeError: write() argument must be str, not bytes
disabled              new*       OK
enabled               new        OK

Change 319080 merged by jenkins-bot:
win32_unicode.py: Do not encode text in python 3

https://gerrit.wikimedia.org/r/319080