
[EPIC] Support "second-try" transliteration or wrong-keyboard searches (aka N.O.R.M.)
Open, High, Public

Description

Normalizing Orthographic Re-Mapper (aka N.O.R.M.)

Build out the necessary infrastructure to support various kinds of text-mapping "second-try" searches, including "DWIM"-style wrong-keyboard searches (e.g., accidentally typing Russian on a keyboard still set to the US/Latin layout) and transliterated searches (e.g., typing Georgian or Hindi in Latin script).

A good place to start is replicating the Russian and Hebrew DWIM gadget's autocomplete results enhancement, and then extending that breadth-first to Georgian and Hindi transliteration in autocomplete, or depth-first into full-text results.

wrong keyboard tickets:

transliteration tickets:

Note: Naming is hard. DWIM ("do what I mean") is/was an on-wiki gadget that supported wrong-keyboard searches on Russian and Hebrew wikis. However, it sounds a little too much like DYM ("did you mean"), our query reformulation suggestion feature. We've used second-chance and second-try in the past to refer to a number of related approaches that are a superset of what is under consideration here. Hence "N.O.R.M.", the Normalizing Orthographic Re-Mapper, which would be a shared infrastructure that would allow us to convert both Fhbcnjntkm to Аристотель ("Aristotle") on Russian wikis and devanagari ka itihas to देवनागरी का इतिहास ("history of Devanagari") on Hindi wikis in a variety of useful ways.
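
As a rough illustration of the wrong-keyboard half of this, here is a minimal sketch that remaps text typed on a US QWERTY layout to the keys at the same positions on the standard Russian (ЙЦУКЕН) layout. The function name `remap_wrong_keyboard` and the table constants are hypothetical, chosen for this example only. This is not the DWIM gadget's code or a proposed N.O.R.M. design, and transliteration (e.g., Hindi in Latin script) would need a different, context-sensitive mapping.

```python
# Hypothetical sketch: position-for-position remapping from the US QWERTY
# layout to the standard Russian layout. Names here are illustrative only.

# Unshifted keys, in the same physical order on both layouts.
LATIN_LOWER = "`qwertyuiop[]asdfghjkl;'zxcvbnm,."
CYRILLIC_LOWER = "ёйцукенгшщзхъфывапролджэячсмитьбю"

# Shifted keys: Latin punctuation changes under Shift, Cyrillic just upcases.
LATIN_UPPER = '~QWERTYUIOP{}ASDFGHJKL:"ZXCVBNM<>'
CYRILLIC_UPPER = CYRILLIC_LOWER.upper()

# One translation table covering both the unshifted and shifted keys.
LATIN_TO_CYRILLIC = str.maketrans(
    LATIN_LOWER + LATIN_UPPER,
    CYRILLIC_LOWER + CYRILLIC_UPPER,
)


def remap_wrong_keyboard(query: str) -> str:
    """Return the query as if it had been typed on the Russian layout.

    Characters without an entry in the table (digits, spaces, text that
    is already Cyrillic) pass through unchanged.
    """
    return query.translate(LATIN_TO_CYRILLIC)


if __name__ == "__main__":
    print(remap_wrong_keyboard("Fhbcnjntkm"))
```

In a "second-try" setup, the remapped string would presumably only be searched when the original query returns few or no results, and skipped when the remapping leaves the query unchanged.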

Previous on-wiki write-ups: