
git review dies if the commit subject contains non-UTF-8 text
Closed, Resolved (Public)

Description

twn:/resources/nike/translatewiki/puppet/modules/nginx/files/sites (2013/puppet)$ git review
Traceback (most recent call last):
  File "/usr/local/bin/git-review", line 1168, in <module>
    main()
  File "/usr/local/bin/git-review", line 1123, in main
    assert_one_change(remote, branch, yes, have_hook)
  File "/usr/local/bin/git-review", line 545, in assert_one_change
    (status, output) = run_command_status(cmd)
  File "/usr/local/bin/git-review", line 122, in run_command_status
    out = out.decode('utf-8')
  File "/usr/lib/python2.6/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 209-212: invalid data

In git log (truncated from both ends):

Author: Siebrand Mazeland <s.mazeland@xs4all.nl>
Date:   Sun Apr 21 18:36:26 2013 +0200

    Fix encoding
    
    Change-Id: I7e59a42be59f5df6965b8e83ec2534f3736d7cb3

commit 270d0644c61fd4f56e0acbd2442400227972647c
Author: Siebrand Mazeland <s.mazeland@xs4all.nl>
Date:   Sun Apr 21 18:27:03 2013 +0200

    Fix paths for Vicuna Uploader
    
    Change-Id: Ib523d6aa368801ca7049d0b01fb8c2fa769fd664

commit a08134502d182caec86aeaf6585c5b67d10ce357
Author: Siebrand Mazeland <s.mazeland@xs4all.nl>
Date:   Sun Apr 21 18:04:57 2013 +0200

    Updates for supporting Vicu<F1>aUploader
    
    Change-Id: Ibf28241e31d6803f14c2dc1f581ebd46f064677e

Then I remembered that we still (or again) have the bug where I have to run git fetch origin first, or git review will try to submit unrelated commits to Gerrit.

Ideally someone should work with upstream to fix both of these issues asap.

Details

Reference
bz47633

Event Timeline

bzimport raised the priority of this task to Medium. (Nov 22 2014, 1:36 AM)
bzimport added projects: Gerrit, I18n, Upstream.
bzimport set Reference to bz47633.
bzimport added a subscriber: Unknown Object (MLST).

Hrm, well I'd actually kinda rather have it die than submit non-UTF8 text into the system...

Why is the commit message being done in an 8-bit encoding?

What would constitute a proper behaviour in this case? We need to somehow know (currently assuming UTF-8) how to interpret output from external commands (this includes git and ssh gerrit interface) as Unicode strings.
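One possible behaviour, as a rough sketch only: try UTF-8 first and, if that fails, decode again with replacement characters so the tool keeps running instead of dying with a traceback. The helper name `decode_output` is hypothetical and is not git-review's actual code; whether replacing bytes is acceptable for text that ends up in Gerrit is a separate question.

```python
def decode_output(raw, encoding="utf-8"):
    """Decode bytes from an external command without raising.

    Hypothetical helper for illustration, not part of git-review.
    """
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        # Bytes that are not valid UTF-8 become U+FFFD instead of aborting.
        return raw.decode(encoding, errors="replace")


# Subject from the log above, assuming the <F1> shown there is the raw byte 0xF1.
subject = b"Updates for supporting Vicu\xf1aUploader"
print(decode_output(subject))
```

The trade-off is that the replaced character is lost, so this only suits output that is displayed or parsed locally, not text that is sent back to the server.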

hashar set Security to None.
hashar claimed this task.
hashar subscribed.

The issue is actually in the git-review wrapper, which attempts to decode the commit message as UTF-8.

I have tried with a commit summary containing xF1 or ñ, and git-review 1.24.1 pushed it through.

So seems fine to me?
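For anyone retesting, the difference between the two inputs can be checked in isolation. This is a small sketch under the assumption that the original subject was written in Latin-1, where ñ is the single byte 0xF1; the function name `strict_decode_ok` is made up for the example.

```python
def strict_decode_ok(raw):
    """Return True if raw is valid UTF-8 under a strict decode."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Assumption: the failing commit subject was encoded as Latin-1.
latin1_subject = "Vicu\u00f1aUploader".encode("latin-1")  # ñ is the byte 0xF1
utf8_subject = "Vicu\u00f1aUploader".encode("utf-8")      # ñ is the bytes 0xC3 0xB1

print(strict_decode_ok(latin1_subject))
print(strict_decode_ok(utf8_subject))
```

A strict decode rejects the Latin-1 bytes and accepts the UTF-8 ones, so a test that only uses a UTF-8 ñ would not exercise the code path in the traceback.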