Quarry cannot save queries with emojies
Open, Needs Triage, Public

Description

A query like SELECT '😂'; is truncated to SELECT ' after it is received by Quarry's internal database.
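To illustrate the truncation, here is a minimal sketch, assuming the internal database currently uses MySQL's 3-byte utf8 charset in non-strict mode (the description implies this but does not state it). The helper names are hypothetical and only mimic the server's behaviour; this is not Quarry code.

```python
def utf8_len(ch):
    """Number of bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def truncate_for_utf8mb3(text):
    """Cut text at the first character that needs more than 3 UTF-8 bytes.

    Assumption: this imitates what a non-strict MySQL server does when
    storing into a 3-byte utf8 column. It is an illustration only.
    """
    for i, ch in enumerate(text):
        if utf8_len(ch) > 3:
            return text[:i]
    return text


query = "SELECT '😂';"
print(utf8_len("😂"))               # the emoji needs 4 bytes
print(truncate_for_utf8mb3(query))  # everything from the emoji onwards is lost
```

Characters that fit in 3 bytes (for example "é" or "€") pass through unchanged, which would explain why only emoji and other 4-byte characters trigger the problem.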

Further investigation after https://gerrit.wikimedia.org/r/#/c/436576/ shows that definitions like VARCHAR(255) BINARY and TEXT BINARY are actually non-binary string types, but likely with a binary collation. To store emoji we could change the database charset to utf8mb4 or binary, but:

  • the former causes an error when creating the user table: ERROR 1071 (42000) at line 3: Specified key was too long; max key length is 767 bytes,
  • the latter makes SQLAlchemy return a Python binary str instead of unicode, causing UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 in position 8: ordinal not in range(128).
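The key-length error in the first bullet comes down to simple arithmetic. The sketch below assumes the indexed column in the user table is a VARCHAR(255), which matches the definitions mentioned above but is not confirmed for that specific column; the function names are hypothetical.

```python
# Limit quoted in the ERROR 1071 message above.
MAX_KEY_BYTES = 767

# Worst-case bytes MySQL reserves per character for each charset.
BYTES_PER_CHAR = {"utf8": 3, "utf8mb4": 4}


def index_key_bytes(n_chars, charset):
    """Worst-case index key size for a column of n_chars characters."""
    return n_chars * BYTES_PER_CHAR[charset]


def max_indexable_chars(charset, limit=MAX_KEY_BYTES):
    """Longest column (in characters) that still fits under the limit."""
    return limit // BYTES_PER_CHAR[charset]


for charset in ("utf8", "utf8mb4"):
    needed = index_key_bytes(255, charset)  # 255 is an assumed column length
    verdict = "fits" if needed <= MAX_KEY_BYTES else "too long"
    print(charset, needed, verdict, max_indexable_chars(charset))
```

Under these assumptions a 255-character key needs 765 bytes with utf8 but 1020 bytes with utf8mb4, so the index either has to be shortened to at most 191 characters or the server's key-length limit has to be raised.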

We could investigate using https://stackoverflow.com/a/43403017, but that probably needs some upgrades to our database server.

CC T194691 T192698

Event Timeline

CommunityTechBot renamed this task from mubaaaaaaa to Quarry cannot save queries with emojies. Jul 2 2018, 1:52 AM
CommunityTechBot raised the priority of this task from High to Needs Triage.
CommunityTechBot updated the task description.
CommunityTechBot added a subscriber: Aklapper.

Regarding the second error: binary strings are not text, so they must be explicitly converted to Python strings after the driver returns them.
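A rough sketch of that explicit conversion, written for Python 3, where the implicit ASCII conversion that Python 2 performs has to be reproduced with an explicit decode("ascii"). The to_text helper is hypothetical, and it assumes the stored bytes are UTF-8.

```python
# What a binary column would hand back for the example query (assumed UTF-8).
raw = "SELECT '😂';".encode("utf-8")

# Reproduce the failure: treating the bytes as ASCII breaks on the emoji.
try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    error = exc
    print(error)


def to_text(value, encoding="utf-8"):
    """Decode bytes returned by the driver; leave text values untouched."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


print(to_text(raw))
```

The failing byte sits at position 8 because SELECT ' is exactly eight ASCII characters, so the first byte of the emoji (0xf0) is the first one the ASCII codec cannot handle.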

For the first, I can help: most likely there is no need to upgrade, only to configure the database correctly. Recent versions already have innodb_large_prefix enabled by default.