checkimages.py incorrectly casts many command line arguments to str()
Closed, Declined; Public

Description

C:\pwb\GIT\core>pwb.py checkimages -simulate -cat:Wikipedia:Dateiüberprüfung_(2016-04-01)
Traceback (most recent call last):
  File "C:\pwb\GIT\core\pwb.py", line 256, in <module>
    if not main():
  File "C:\pwb\GIT\core\pwb.py", line 250, in main
    run_python_file(filename, [filename] + args, argvu, file_package)
  File "C:\pwb\GIT\core\pwb.py", line 121, in run_python_file
    main_mod.__dict__)
  File ".\scripts\checkimages.py", line 1788, in <module>
    ret = main()
  File ".\scripts\checkimages.py", line 1703, in main
    catName = str(arg[5:])
UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 15:
 ordinal not in range(128)
<type 'exceptions.UnicodeEncodeError'>
CRITICAL: Closing network session.

Event Timeline

valhallasw renamed this task from "-cat option fails with UnicodeDecodeError" to "checkimages.py incorrectly casts many command line arguments to str()". Apr 2 2016, 2:11 PM
valhallasw subscribed.

For some reason, checkimages.py runs str(arg[5:]), which fails under Python 2 whenever arg contains non-ASCII characters. I'm not sure why it does that; we can probably just remove those str() calls.
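
A minimal sketch of that suggestion, assuming the option value is obtained by slicing off the `-cat:` prefix as in the traceback. `parse_cat_option` is a hypothetical helper for illustration, not a function in checkimages.py; the point is only that the value stays text and is never passed through str().

```python
def parse_cat_option(arg):
    """Return the category name from a '-cat:<name>' argument, or None.

    Hypothetical helper: the value is returned exactly as received,
    as text, with no str() cast applied to it.
    """
    prefix = '-cat:'
    if arg.startswith(prefix):
        return arg[len(prefix):]
    return None


# The argument from the report; '\xfc' is the character 'ü'.
cat_name = parse_cat_option('-cat:Wikipedia:Datei\xfcberpr\xfcfung_(2016-04-01)')
```

With this shape, non-ASCII category names pass through unchanged, and unrelated options such as `-simulate` yield None.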

Change 281131 had a related patch set uploaded (by Xqt):
[bugfix] bugfixes and improvements for checkimages

https://gerrit.wikimedia.org/r/281131

Change 281673 had a related patch set uploaded (by Xqt):
[bugfix] bugfixes and improvements for checkimages

https://gerrit.wikimedia.org/r/281673

Xqt triaged this task as Medium priority. Apr 6 2016, 1:19 PM

This problem is specific to Python 2, where str() converts its argument to a byte string rather than unicode, implicitly encoding it with the ASCII codec.
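
The failure can be approximated on Python 3, where str() no longer encodes anything. The sketch below is an illustration only: it assumes the explicit `encode('ascii')` call is a fair stand-in for the implicit conversion that Python 2's str() performs on a unicode object, and it reuses the argument from the report.

```python
# The argument from the report; '\xfc' is the character 'ü'.
arg = '-cat:Wikipedia:Datei\xfcberpr\xfcfung_(2016-04-01)'
value = arg[5:]  # strip the '-cat:' prefix, as checkimages.py does

# Stand-in for Python 2's str(value): an encode with the ASCII codec.
error = None
try:
    value.encode('ascii')
except UnicodeEncodeError as exc:
    error = exc

# On Python 3, str() of a text string returns the same text.
unchanged = str(value) == value
```

Here `error.start` is 15 and `error.object[error.start]` is `'\xfc'`, which matches the position and character named in the traceback.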

Xqt removed Xqt as the assignee of this task. Oct 4 2020, 8:47 AM