Page MenuHomePhabricator

Gerrit hooks need to accept unicode input
Closed, InvalidPublic

Description

Related to bug 35626. Since we've fixed that, I'm getting the following in gerrit's error log:

[2012-05-03 17:11:49,971] INFO com.google.gerrit.common.ChangeHookRunner : hook[patchset-created] output: Traceback (most recent call last):
[2012-05-03 17:11:49,971] INFO com.google.gerrit.common.ChangeHookRunner : hook[patchset-created] output: File "/var/lib/gerrit2/review_site/hooks/patchset-created", line 99, in <module>
[2012-05-03 17:11:49,971] INFO com.google.gerrit.common.ChangeHookRunner : hook[patchset-created] output: main()
[2012-05-03 17:11:49,971] INFO com.google.gerrit.common.ChangeHookRunner : hook[patchset-created] output: File "/var/lib/gerrit2/review_site/hooks/patchset-created", line 31, in main
[2012-05-03 17:11:49,971] INFO com.google.gerrit.common.ChangeHookRunner : hook[patchset-created] output: subject = helper.get_subject(options.change)
[2012-05-03 17:11:49,971] INFO com.google.gerrit.common.ChangeHookRunner : hook[patchset-created] output: File "/var/lib/gerrit2/review_site/hooks/hookhelper.py", line 55, in get_subject
[2012-05-03 17:11:49,971] INFO com.google.gerrit.common.ChangeHookRunner : hook[patchset-created] output: subject = str(self.patchsets[change]['subject'])
[2012-05-03 17:11:49,972] INFO com.google.gerrit.common.ChangeHookRunner : hook[patchset-created] output: UnicodeEncodeError: 'ascii' codec can't encode character u'\u2014' in position 24: ordinal not in range(128)
[2012-05-03 17:13:17,210] INFO com.google.gerrit.common.ChangeHookRunner : hook[change-merged] output: Traceback (most recent call last):
[2012-05-03 17:13:17,211] INFO com.google.gerrit.common.ChangeHookRunner : hook[change-merged] output: File "/var/lib/gerrit2/review_site/hooks/change-merged", line 25, in <module>
[2012-05-03 17:13:17,211] INFO com.google.gerrit.common.ChangeHookRunner : hook[change-merged] output: main()
[2012-05-03 17:13:17,211] INFO com.google.gerrit.common.ChangeHookRunner : hook[change-merged] output: File "/var/lib/gerrit2/review_site/hooks/change-merged", line 22, in main
[2012-05-03 17:13:17,211] INFO com.google.gerrit.common.ChangeHookRunner : hook[change-merged] output: helper.update_rt(options.change, options.changeurl)
[2012-05-03 17:13:17,211] INFO com.google.gerrit.common.ChangeHookRunner : hook[change-merged] output: File "/var/lib/gerrit2/review_site/hooks/hookhelper.py", line 137, in update_rt
[2012-05-03 17:13:17,211] INFO com.google.gerrit.common.ChangeHookRunner : hook[change-merged] output: messages.append(self.get_subject(change))
[2012-05-03 17:13:17,211] INFO com.google.gerrit.common.ChangeHookRunner : hook[change-merged] output: File "/var/lib/gerrit2/review_site/hooks/hookhelper.py", line 55, in get_subject
[2012-05-03 17:13:17,211] INFO com.google.gerrit.common.ChangeHookRunner : hook[change-merged] output: subject = str(self.patchsets[change]['subject'])
[2012-05-03 17:13:17,211] INFO com.google.gerrit.common.ChangeHookRunner : hook[change-merged] output: UnicodeEncodeError: 'ascii' codec can't encode character u'\u2014' in position 24: ordinal not in range(128)


Version: unspecified
Severity: normal
URL: https://gerrit.wikimedia.org/r/gitweb?p=operations/puppet.git;a=tree;f=files/gerrit/hooks

Details

Reference
bz36487

Event Timeline

bzimport raised the priority of this task from to Medium.Nov 22 2014, 12:20 AM
bzimport added a project: Gerrit.
bzimport set Reference to bz36487.
bzimport added a subscriber: Unknown Object (MLST).

Not an upstream bug, these hooks are in operations/puppet.git: /files/gerrit/hooks

Yes, all lines like

subject = str(this.patchsets[change]['subject'])

are incorrect - str() tries to convert string (in this case

<type 'unicode'> returned by json (for change 6596):

{u'status': u'MERGED', u'topic': u'master', u'url': u'https://gerrit.wikimedia.org/r/6596', u'createdOn': 1336127682, u'patchSets': [{u'createdOn': 1336127682, u'ref': u'refs/changes/96/6596/1', u'number': u'1', u'uploader': {u'name': u'Szymon \u015awierkosz', u'email': u'beau@adres.pl'}, u'revision': u'rMW6c27fef898ca'}, {u'createdOn': 1336127823, u'ref': u'refs/changes/96/6596/2', u'number': u'2', u'uploader': {u'name': u'Szymon \u015awierkosz', u'email': u'beau@adres.pl'}, u'revision': u'rMWcaa999a9f1b3'}, {u'createdOn': 1336313263, u'ref': u'refs/changes/96/6596/3', u'number': u'3', u'uploader': {u'name': u'Szymon \u015awierkosz', u'email': u'beau@adres.pl'}, u'revision': u'7cf5f38232111da6ec8834764628aa29422382c1'}], u'number': u'6596', u'lastUpdated': 1336313647, u'project': u'mediawiki/core', u'sortKey': u'001cdcf6000019c4', u'branch': u'master', u'owner': {u'name': u'Szymon \u015awierkosz', u'email': u'beau@adres.pl'}, u'open': False, u'id': u'Ie223930cfc313aff150e2dcfd70b74bf4360a8a8', u'subject': u'Move optionstoken from meta=userinfo to action=tokens.'}

and then we should consequently treat all strings as Unicode and converting to UTF-8 when logging to files or doing other I/O.

I wonder only, why did it work before?

Not a bug anymore since we got rid of these hooks.