This idea was submitted to the current Inspire Campaign focused on new readers detailing an issue related to an inability for users in Cambodia to search Wikipedia using their native written language, Khmer. Here in an excerpt from the idea page itself (bolded text is from me) describing the issue in more detail:
For example, a word meaning 'eat' is: ញ៉ាំ pronounced nyarm.
It can be written with the following keystrokes. Note that the script on the left looks the same regardless. Note also that uppercass is achieved by SHIFT + keystroke, so the symbol " is created by SHIFT + '
ញ៉ាំ J"am -- Note that only this first spelling gets any results on Wikipedia, for the definition in the sister project Wikitionary.
Even though the scripts on the left look the same, if they are pasted into Wikipedia Search as a search terms, each version will generate completely different results. I am guessing that this is because Wikipedia indexes and searches based on the unicode sequence, not the resulting script.
To provide some concrete examples with links to search results on on Khmer Wikipedia (km.wikipeida.org):
- Search using ញ៉ាំ (one way to write the word for "eat"): search 1, providing appropriate results
- Search using ញាុំ (another way to write the word for "eat"): search 2, providing one search result not relevant to the meaning of the search term
- Search using ញំាុ (another way to write the word for "eat"): search 3, a null result.