English-language wikis use `aggressive_splitting`, a language-analysis filter (a configuration of Elasticsearch's [[ https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-word-delimiter-tokenfilter.html#analysis-word-delimiter-tokenfilter | Word Delimiter Token Filter ]]) that splits words on case changes (the original issue in this ticket), among other transitions. Investigate applying it everywhere, or at least to many more languages.
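The case-change splitting at the heart of this ticket can be sketched with a short regex. This is an illustrative approximation, not the actual filter implementation — the real behavior lives in Elasticsearch's word_delimiter filter, and the function name here is made up:

```python
import re

def split_on_case_change(token: str) -> list[str]:
    """Rough sketch of the word_delimiter filter's case-change
    splitting: insert a break wherever a lowercase letter is
    immediately followed by an uppercase one, then split there."""
    return re.sub(r'(?<=[a-z])(?=[A-Z])', ' ', token).split()

# With this splitting applied at index time, the run-together query
# from the report matches the individually indexed words:
split_on_case_change("FilesystemHierarchyStandard")
# → ['Filesystem', 'Hierarchy', 'Standard']
```

A wiki whose analysis chain applies such a filter would index "FilesystemHierarchyStandard" as three tokens, which is presumably why the en.wp (cross-wiki) index finds the article while the fr.wp index, lacking the filter, does not.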
Original task title & description:
**Cross-wiki search tokenizer is better than local search one**
[[https://fr.wikipedia.org/wiki/Sp%C3%A9cial:Recherche?search=FilesystemHierarchyStandard&sourceid=Mozilla-search | Searching for “FilesystemHierarchyStandard” on fr.wp]] gives me no local results but several results from en.wp, including [en:Filesystem Hierarchy Standard], even though the equivalent [fr:Filesystem Hierarchy Standard] exists.
I’ve already encountered this strange issue: global search is sometimes better than local search, especially for phrase tokenization (when I omit spaces).
Maybe it’s because I use English phrasing on a French wiki?