**User Story:** As an on-wiki searcher, I want to search for words that contain apostrophes without having to know or worry about which apostrophe-like character is actually used. For example, at least seven different characters are used on various projects in the name of a city in Yemen: Ma'rib, Maʿrib, Maʾrib, Maʼrib, Ma`rib, Ma’rib, Ma‘rib.
**Notes:** We have a new character filter, `apostrophe_norm`, currently configured for use only on Nias Wikipedia, which converts the other six options to the straight apostrophe.
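As a rough illustration of what such a character filter does, here is a minimal Python sketch. It is not the actual `apostrophe_norm` configuration; the mapping is assumed from the seven characters listed in this task, and the names `APOSTROPHE_LIKE` and `normalize_apostrophes` are made up for the example.

```python
# Assumption: the filter maps exactly the six non-straight characters named
# in this task to the straight apostrophe (U+0027). The deployed filter may
# cover a different set.
APOSTROPHE_LIKE = "\u02BF\u02BE\u02BC\u0060\u2019\u2018"
APOSTROPHE_MAP = str.maketrans({ch: "'" for ch in APOSTROPHE_LIKE})


def normalize_apostrophes(text: str) -> str:
    """Replace apostrophe-like characters with the straight apostrophe."""
    return text.translate(APOSTROPHE_MAP)


# The seven spellings from the user story, written with escapes so the
# code points are unambiguous.
variants = [
    "Ma'rib",        # U+0027 straight apostrophe
    "Ma\u02BFrib",   # U+02BF left half ring
    "Ma\u02BErib",   # U+02BE right half ring
    "Ma\u02BCrib",   # U+02BC modifier letter apostrophe
    "Ma\u0060rib",   # U+0060 backtick
    "Ma\u2019rib",   # U+2019 right curly apostrophe
    "Ma\u2018rib",   # U+2018 left curly apostrophe
]

normalized = {normalize_apostrophes(v) for v in variants}
print(normalized)  # {"Ma'rib"}
```

Because a character filter runs before tokenization, the backtick is already a straight apostrophe by the time the tokenizer sees it.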
There is also a lot of cross-wiki inconsistency in how these characters are treated. The table below shows how each variant is analyzed on the English, Japanese, and French wikis. The standard tokenizer splits on the backtick (U+0060), so //Ma`rib// is always split into two tokens (//ma// is a stop word in French, so it gets dropped there, shown as "(ma,)" in the table).
English has the `aggressive_splitting` filter enabled, which splits on three of the other characters: the straight apostrophe and the left and right curly apostrophes. `icu_folding` removes the left and right half rings (U+02BF, U+02BE) in English and French; French uses the "preserve" variant, which keeps the original form as well. `icu_folding` also straightens the curly apostrophes in French, whereas in English `aggressive_splitting` has already split on them.
|**char**|**U+0027**|**U+02BF**|**U+02BE**|**U+02BC**|**U+0060**|**U+2019**|**U+2018**
|**input**|**Ma'rib**|**Maʿrib**|**Maʾrib**|**Maʼrib**|**Ma`rib**|**Ma’rib**|**Ma‘rib**
|**en**|ma, rib|marib|marib|marib|ma, rib|ma, rib|ma, rib
|**ja**|ma'rib|maʿrib|maʾrib|maʼrib|ma, rib|ma’rib|ma‘rib
|**fr**|ma'rib|marib/maʿrib|marib/maʾrib|ma'rib|(ma,) rib|ma'rib/ma’rib|ma'rib/ma‘rib
If we work on T219108, we should also consider removing apostrophes from `aggressive_splitting`.
**Acceptance Criteria:**
* `apostrophe_norm` is enabled everywhere (or at least by default, possibly with exceptions or customization for some languages for reasons as yet unknown)
* All of //Ma'rib, Maʿrib, Maʾrib, Maʼrib, Ma`rib, Ma’rib, Ma‘rib// index to the same form in all or almost all wikis (i.e., with intentional exceptions).
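The second criterion could be checked per wiki with something like the sketch below. It is only an outline: `analyze` stands in for whatever returns a wiki's analyzed tokens for a string (hypothetical here), and `lowercase_only` is a toy baseline, not a real analysis chain.

```python
from typing import Callable, Dict, List, Tuple

# The seven spellings from this task, as explicit code points.
VARIANTS = [
    "Ma'rib", "Ma\u02BFrib", "Ma\u02BErib", "Ma\u02BCrib",
    "Ma\u0060rib", "Ma\u2019rib", "Ma\u2018rib",
]

Analyzer = Callable[[str], List[str]]


def group_by_indexed_form(analyze: Analyzer) -> Dict[Tuple[str, ...], List[str]]:
    """Group the variants by the token sequence the analyzer produces."""
    groups: Dict[Tuple[str, ...], List[str]] = {}
    for word in VARIANTS:
        groups.setdefault(tuple(analyze(word)), []).append(word)
    return groups


def meets_criterion(analyze: Analyzer) -> bool:
    """True when all seven variants index to the same form."""
    return len(group_by_indexed_form(analyze)) == 1


def lowercase_only(text: str) -> List[str]:
    """Toy baseline (assumption): one token, lowercased, no normalization."""
    return [text.lower()]


print(meets_criterion(lowercase_only))  # False
```

Grouping rather than just returning a boolean makes it easy to see which variants diverge on a wiki with intentional exceptions.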
Note: this is a follow-up to T311654, which looked at this issue for just one language (Nias).