Based on the testing that will be done in T147495 and T147501, let's figure out how best to handle languages that don't have spaces between words, using ICU tokenization.
We're currently expecting the A/B test (done in T147495) to be unsuccessful and show that we need to do this task. So, I've marked T147495 as a parent task.
Stalled, waiting on the outcome of the A/B test analysis in T147500.
Still have this on our radar, but we have to figure out a few things first. Keeping it at the bottom of the backlog for now.
Closing this, as we have settled which languages to cover and finished the analysis. We'll create new tickets as we start working on the languages.