Unpack the Turkish Elastic analyzer and implement the apostrophe improvements in a plugin, for simplicity, performance, and reusability by others.
The new plugin will need to be built and deployed, the config changes deployed, and the Turkish-language wikis reindexed.
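As a rough illustration of what "unpacking" means here, the built-in `turkish` analyzer can be spelled out as a custom analyzer, so that individual filters (such as `apostrophe`) can later be swapped out. The sketch below writes the settings as a Python dict. It follows the filter chain the Elasticsearch documentation gives for rebuilding the Turkish analyzer, minus the optional `keyword_marker` step. The analyzer name `turkish_unpacked` is hypothetical, and the actual config for this task may well differ.

```python
import json

# Sketch of an unpacked Turkish analyzer (assumption: mirrors the chain the
# Elasticsearch docs list for the built-in "turkish" analyzer, without the
# optional keyword_marker filter). "turkish_unpacked" is a made-up name.
UNPACKED_TURKISH_SETTINGS = {
    "analysis": {
        "filter": {
            "turkish_lowercase": {"type": "lowercase", "language": "turkish"},
            "turkish_stop": {"type": "stop", "stopwords": "_turkish_"},
            "turkish_stemmer": {"type": "stemmer", "language": "turkish"},
        },
        "analyzer": {
            "turkish_unpacked": {
                "type": "custom",
                "tokenizer": "standard",
                "filter": [
                    "apostrophe",  # the step a plugin-provided filter would replace
                    "turkish_lowercase",
                    "turkish_stop",
                    "turkish_stemmer",
                ],
            }
        },
    }
}

if __name__ == "__main__":
    # Print the JSON body that would go under "settings" when creating an index.
    print(json.dumps(UNPACKED_TURKISH_SETTINGS, indent=2, ensure_ascii=False))
```

Once the analyzer is unpacked like this, replacing the stock `apostrophe` entry with a plugin-provided filter is a one-line config change.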
Notes from https://phabricator.wikimedia.org/T325091#8619132:
I found a problem with the apostrophe filter for Turkish: it strips everything from the first apostrophe onward, which is very aggressive and does bad things to French and Italian elisions (which are common in names, sources, etc.). For example, d'Onofrio'nun, d'administration, d'administration'dan, and d'Arthur'unda all get indexed as plain d. Not optimal.
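To make the failure mode concrete, here is a small Python emulation of the stock behavior. It assumes the filter simply truncates each token at the first apostrophe (straight `'` or curly `’`), which is how Lucene's `ApostropheFilter` is documented to behave. The function name is made up for illustration and is not part of any plugin.

```python
APOSTROPHES = ("'", "\u2019")  # straight and right-curly apostrophes


def stock_apostrophe_filter(token: str) -> str:
    """Emulate the stock filter: drop the first apostrophe and everything after it."""
    for i, ch in enumerate(token):
        if ch in APOSTROPHES:
            return token[:i]
    return token


if __name__ == "__main__":
    examples = [
        "Türkiye'de",             # intended case: Turkish suffix after a proper noun
        "d'Onofrio'nun",          # Italian name plus Turkish suffix
        "d'administration",       # French elision
        "d'administration'dan",   # French elision plus Turkish suffix
        "d'Arthur'unda",
    ]
    for token in examples:
        print(f"{token} -> {stock_apostrophe_filter(token)}")
```

The Turkish case works as intended (the suffix is stripped from Türkiye'de), but every token that starts with an elided d' collapses to the single letter d.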
I've come up with a bunch of heuristics that improve the apostrophe processing. Implementing them as a collection of existing filters is a mess, so making a plugin seems like a good approach; it also makes the logic more easily reusable by others.
I'm going to spin off Turkish as its own ticket and finish up the other two first.