
Add ICU_folding filter for EN, FR and EL wiki projects
Closed, Resolved · Public


Alongside the work we're doing on T137830 and T102298 (though slightly different from what T41501 describes), we'll want to enable ICU folding (via a configuration flag) for the English and French wiki projects.

This will involve re-running a few tests to be sure we're not breaking anything.

Event Timeline

debt created this task. Sep 22 2016, 6:34 PM
Restricted Application added a subscriber: Aklapper. Sep 22 2016, 6:34 PM
debt triaged this task as Medium priority. Sep 22 2016, 6:34 PM

Looks like we should turn this on! My analysis is here.

A lot of pronunciations for words get mapped onto those words now—that's cool!
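As a rough illustration of why folding merges such variants, here is a minimal Python sketch. It is only an approximation built from the standard library (`unicodedata`), not ICU's actual folding algorithm, and the function name `approx_fold` is hypothetical. One plausible reading of the remark above is that accented respellings now fold to the same token as the plain word.

```python
import unicodedata


def approx_fold(text: str) -> str:
    """Approximate folding: decompose, drop combining marks, casefold.

    This is an assumption-laden stand-in, NOT the ICU folding filter.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    # Combining marks (accents, etc.) have a non-zero combining class.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


print(approx_fold("Café"))  # cafe
print(approx_fold("résumé") == approx_fold("RESUME"))  # True
```

The real filter covers far more cases than this sketch, so it should be read as a picture of the idea rather than a way to predict indexing results.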

A problem has been exposed: in rare cases, "modifier letter apostrophe" is used instead of a straight quote or curly right quote. Before ICU folding these were being indexed incorrectly; with this patch they will still be indexed incorrectly, but in a different way. I've opened T146804 to map them, but it may not get done before the re-index happens.
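To make the apostrophe issue concrete, the sketch below shows that the three characters are distinct code points and how a simple character mapping could unify them. The mapping target (the straight apostrophe) and the names `APOSTROPHE_MAP` and `normalize_apostrophes` are assumptions for illustration; the actual fix tracked in T146804 may differ.

```python
# Visually similar characters that are distinct code points:
#   U+0027 APOSTROPHE (straight quote)
#   U+2019 RIGHT SINGLE QUOTATION MARK (curly right quote)
#   U+02BC MODIFIER LETTER APOSTROPHE
# Assumption: map the variants onto the straight apostrophe.
APOSTROPHE_MAP = str.maketrans({
    "\u02bc": "'",
    "\u2019": "'",
})


def normalize_apostrophes(text: str) -> str:
    """Replace apostrophe look-alikes with U+0027 before indexing."""
    return text.translate(APOSTROPHE_MAP)


# Without mapping, the three spellings are three different strings.
variants = ["O'Brien", "O\u2019Brien", "O\u02bcBrien"]
print(len(set(variants)))                              # 3
print(len({normalize_apostrophes(v) for v in variants}))  # 1
```

In a search pipeline a mapping like this would normally run as a character filter ahead of tokenization, so that all three spellings produce the same tokens.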

debt assigned this task to TJones. Sep 29 2016, 5:32 PM
TJones reassigned this task from TJones to dcausse. Oct 3 2016, 3:30 PM
TJones added a comment. Oct 3 2016, 3:32 PM

I think this was and should be assigned to @dcausse. I probably should have created a separate task, but all I did was evaluate the effect of turning it on locally and analyze big blobs of text. David knows how to configure it properly for deployment. (Unless someone wants me to do it.)

Change 313838 had a related patch set uploaded (by DCausse):
Enable ICU folding for en, fr and greek by default

Added Greek to the list; I thought we agreed to enable it on full-text search as well. Currently it's only enabled on Greek Wikipedia with the completion suggester.

dcausse renamed this task from Add ICU_folding filter for EN and FR wiki projects to Add ICU_folding filter for EN, FR and EL wiki projects. Oct 11 2016, 5:37 PM

Change 313838 merged by jenkins-bot:
Enable ICU folding for en, fr and greek by default

Deskana closed this task as Resolved. Dec 9 2016, 3:29 PM