Add support to English and/or Russian Wikipedia for detecting and converting queries typed in one language on the other language's keyboard.
- //пукьфт сгшышту// (Russian phonetic transliteration: "puk'ft sgshyshtu") looks like gibberish; converting from Russian to American keyboard gives //german cuisine.//
- //'qatktdf ,fiyz// looks like gibberish, but converting from American to Russian keyboard gives //эйфелева башня// ("Eiffel Tower").
More details and examples are [[ https://www.mediawiki.org/wiki/User:TJones_(WMF)/Notes/Typing_on_the_Wrong_Keyboard%E2%80%94Russian_and_English | here ]].
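The conversion itself is a character-for-character key mapping. A minimal Python sketch follows, assuming the standard US QWERTY and Russian ЙЦУКЕН layouts and covering only unshifted keys (no capitals or shifted punctuation). The names `EN_KEYS`, `RU_KEYS`, `en_to_ru`, and `ru_to_en` are illustrative and not part of any existing code.

```python
# Unshifted keys of a US QWERTY keyboard and the Russian ЙЦУКЕН characters
# on the same physical keys, in the same order (assumed standard layouts).
EN_KEYS = "`qwertyuiop[]asdfghjkl;'zxcvbnm,./"
RU_KEYS = "ёйцукенгшщзхъфывапролджэячсмитьбю."

# Translation tables in both directions; characters not in the tables
# (spaces, digits, capitals) pass through unchanged.
_EN_TO_RU = str.maketrans(EN_KEYS, RU_KEYS)
_RU_TO_EN = str.maketrans(RU_KEYS, EN_KEYS)


def en_to_ru(query: str) -> str:
    """Reinterpret text typed on a US layout as if typed on a Russian layout."""
    return query.translate(_EN_TO_RU)


def ru_to_en(query: str) -> str:
    """Reinterpret text typed on a Russian layout as if typed on a US layout."""
    return query.translate(_RU_TO_EN)


print(ru_to_en("пукьфт сгшышту"))  # german cuisine
print(en_to_ru("'qatktdf ,fiyz"))  # эйфелева башня (the apostrophe key types э)
```

A real implementation would also need the shifted keys, and would have to decide what to do with characters that are ordinary punctuation in one layout but letters in the other (for example `,` and `.` versus б and ю).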
We can use TextCat language detection to detect these transliterable gibberish strings.
Additional requirements for successful implementation include (but are not limited to):
- possibly more data analysis limited to poorly performing queries (the analysis above is on all queries, and so overestimates the cost).
- more complex interaction with language detection, including paying attention to "second place" language results, filtering results //after// language detection (see the notes linked above for more details and examples), and having different behaviors for different languages (e.g., showing cross-wiki results for some languages, and doing query rewrites or "did you mean" suggestions for others).
- coming up with a mechanism for dealing with multiple suggestions (e.g., this plus a spelling correction); possibilities include some sort of confidence score from each suggester, hard-coded ordering, or a nice display of multiple suggestions.
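
As a rough illustration of the last point, the Python sketch below combines two of the listed possibilities: each suggester reports a confidence score, and a hard-coded ordering breaks ties. Everything here is hypothetical (the `Suggestion` type, the source names, and the assumption that confidence scores from different suggesters are comparable); it is one possible shape for the mechanism, not a proposal for the actual design.

```python
from dataclasses import dataclass

# Hypothetical hard-coded ordering, used only to break confidence ties.
SOURCE_PRIORITY = ["wrong_keyboard", "spelling"]


@dataclass(frozen=True)
class Suggestion:
    text: str          # the suggested query
    source: str        # which suggester produced it
    confidence: float  # 0.0 to 1.0, assumed comparable across suggesters


def rank_suggestions(suggestions):
    """Sort by descending confidence, then by the hard-coded source ordering."""
    def sort_key(s):
        if s.source in SOURCE_PRIORITY:
            priority = SOURCE_PRIORITY.index(s.source)
        else:
            priority = len(SOURCE_PRIORITY)  # unknown sources go last
        return (-s.confidence, priority)

    return sorted(suggestions, key=sort_key)


candidates = [
    Suggestion("germane cuisine", "spelling", 0.5),
    Suggestion("german cuisine", "wrong_keyboard", 0.5),
]
best = rank_suggestions(candidates)[0]
print(best.text)  # german cuisine
```

The ranked list could feed either a single "best" suggestion or a display of several suggestions, which leaves the third possibility open.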