Page MenuHomePhabricator

Wikimetrics crashes when cohort description has special characters {dove} [5pts]
Closed, ResolvedPublic

Description

When uploading a cohort with special characters in its description, the upload form works fine.
But from there on, all cohort/list requests fail because of encoding conversion problems. Ha!

Event Timeline

mforns raised the priority of this task from to Needs Triage.
mforns updated the task description. (Show Details)
mforns subscribed.
kevinator triaged this task as Unbreak Now! priority.May 29 2015, 5:34 PM
kevinator set Security to None.

Just to make it clear, "special characters" means most likely any non-Latin 1 character.

mforns renamed this task from Wikimetrics crashes when cohort description has special characters to Wikimetrics crashes when cohort description has special characters [5pts].Jun 1 2015, 3:21 PM
mforns claimed this task.
mforns moved this task from Next Up to Tasked_Hidden on the Analytics-Kanban board.
mforns renamed this task from Wikimetrics crashes when cohort description has special characters [5pts] to Wikimetrics crashes when cohort description has special characters {dove} [5pts].Jun 1 2015, 3:42 PM

Change 215200 had a related patch set uploaded (by Mforns):
Fix cohort description utf8 bug

https://gerrit.wikimedia.org/r/215200

Change 215200 merged by Milimetric:
Fix cohort description utf8 bug

https://gerrit.wikimedia.org/r/215200

So Kevin discovered that the cohorts that have special characters in its descriptions are still crashing Wikimetrics.
Namely, all reports on those kind of cohorts are not working. When launching them, Wikimetrics alerts with the following error:

Error! Wikimetrics is experiencing problems. Visit the Support page for help if this persists. You can also check the console for details.

Change 217315 had a related patch set uploaded (by Mforns):
Fix cohort description utf8 bug (2)

https://gerrit.wikimedia.org/r/217315

Change 217315 merged by Milimetric:
Fix cohort description utf8 bug (2)

https://gerrit.wikimedia.org/r/217315

I tested it it staging and then deployed to production... and boom! It crashed in report_list.

The problem is that db collations are different in vagrant/staging vs. production.
The swedish_latin1 collation in vagrant/staging was masking a problem and made it difficult to identify in our tests.
When deployed to production (utf8_general_ci) the issue becomes evident.

Next step: Change the column type of report.parameters to varbinary, via a migration.
This should fix the issue.

Change 219373 had a related patch set uploaded (by Mforns):
Fix cohort description utf8 bug (3)

https://gerrit.wikimedia.org/r/219373

Change 219373 merged by Madhuvishy:
Fix cohort description utf8 bug (3)

https://gerrit.wikimedia.org/r/219373

kevinator subscribed.

verified it works in production :-)