Normalizing Orthographic Re-Mapper (aka N.O.R.M.)
Build out the necessary infrastructure to support various kinds of text-mapping "second-try" searches, including "DWIM"-style wrong-keyboard searches (e.g., typing Russian text while the keyboard is accidentally set to the US/Latin layout instead of Russian/Cyrillic) and transliterated searches (e.g., typing Georgian or Hindi in Latin script).
A good place to start is replicating the Russian and Hebrew DWIM gadget's autocomplete results enhancement, and then extending that breadth-first to Georgian and Hindi transliteration in autocomplete, or depth-first into full-text results.
wrong-keyboard tickets:
- T138958: Detect "wrong keyboard" queries for Russian/American keyboards on EN/RU Wikipedias
- T155104: Detect "wrong keyboard" queries for Hebrew/American keyboards on EN/HE Wikipedias
transliteration tickets:
- T297761: Create a Latin-to-Devanagari transliteration second-chance search for Hindi wikis
- T127003: Transliterate Latin or Cyrillic script searches to Georgian script on Georgian wikis
Note: Naming is hard. DWIM ("do what I mean") is/was an on-wiki gadget that supported wrong-keyboard searches on Russian and Hebrew wikis. However, it sounds a little too much like DYM ("did you mean"), our query reformulation suggestion feature. We've used second-chance and second-try in the past to refer to a number of related approaches that are a superset of what is under consideration here. Hence "N.O.R.M.", the Normalizing Orthographic Re-Mapper, which would be a shared infrastructure that would allow us to convert both Fhbcnjntkm to Аристотель ("Aristotle") on Russian wikis and devanagari ka itihas to देवनागरी का इतिहास ("history of Devanagari") on Hindi wikis in a variety of useful ways.
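As a rough illustration of the wrong-keyboard half of that mapping, here is a minimal sketch in Python. It assumes the standard US QWERTY and Russian ЙЦУКЕН layouts and maps only the keys that produce Cyrillic letters. The names `to_cyrillic` and `to_latin` are hypothetical and do not come from the DWIM gadget or any existing search code. Transliteration (e.g., Latin to Devanagari) is not a one-key-to-one-character mapping and is not covered here.

```python
# Keys on a US QWERTY keyboard, and the letters the same physical keys
# produce on a standard Russian (ЙЦУКЕН) layout. Only keys that yield
# Cyrillic letters are included; other punctuation is left untouched.
LATIN_LOWER = "qwertyuiop[]asdfghjkl;'zxcvbnm,.`"
CYRILLIC_LOWER = "йцукенгшщзхъфывапролджэячсмитьбюё"
LATIN_UPPER = 'QWERTYUIOP{}ASDFGHJKL:"ZXCVBNM<>~'
CYRILLIC_UPPER = "ЙЦУКЕНГШЩЗХЪФЫВАПРОЛДЖЭЯЧСМИТЬБЮЁ"

# str.maketrans builds a per-character translation table from two
# equal-length strings.
_TO_CYRILLIC = str.maketrans(LATIN_LOWER + LATIN_UPPER,
                             CYRILLIC_LOWER + CYRILLIC_UPPER)
_TO_LATIN = str.maketrans(CYRILLIC_LOWER + CYRILLIC_UPPER,
                          LATIN_LOWER + LATIN_UPPER)


def to_cyrillic(query: str) -> str:
    """Reinterpret text typed on a US layout as if typed on a Russian one."""
    return query.translate(_TO_CYRILLIC)


def to_latin(query: str) -> str:
    """Reinterpret text typed on a Russian layout as if typed on a US one."""
    return query.translate(_TO_LATIN)


if __name__ == "__main__":
    print(to_cyrillic("Fhbcnjntkm"))
    print(to_latin("Фкшыещеду"))
```

A second-try search could run the remapped string as an alternative query when the original returns few or no results. Deciding when to do that is a separate question that this sketch does not address.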
Previous on-wiki write-ups:
- Typing on the Wrong Keyboard—Russian and English
- DWIM as API
- Hindi Wikipedia Zero Results Queries (includes unsuccessful transliterated queries)