Update TextCat with wrong-keyboard models
Closed, Resolved · Public

Description

Add wrong-keyboard–transformed language models for "Cyrillic English" (en_cyr, English typed on a Russian keyboard layout) and "Latin Russian" (ru_lat, Russian typed on a Latin keyboard layout) to TextCat, in both the query-based (LM-query/) and wikitext-based (LM/) model sets. Also add a Windows-1251 wrong-encoding model (ru_win1251) to the wikitext-based models.
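For illustration, here is a minimal Python sketch of the kind of text these models are meant to recognize. It is not the code used to build the TextCat models. The key mapping assumes the standard US QWERTY and Russian ЙЦУКЕН layouts and covers lowercase keys only. The function names are hypothetical, and reading the Windows-1251 bytes as Latin-1 is an assumption about what "wrong encoding" looks like in practice.

```python
# Illustrative sketch only; not how the TextCat models were generated.
# Assumption: US QWERTY and standard Russian (ЙЦУКЕН) layouts, lowercase keys only.
LATIN_KEYS = "qwertyuiop[]asdfghjkl;'zxcvbnm,.`"
CYRILLIC_KEYS = "йцукенгшщзхъфывапролджэячсмитьбюё"

# Character-for-character translation tables between the two layouts.
LAT_TO_CYR = str.maketrans(LATIN_KEYS, CYRILLIC_KEYS)
CYR_TO_LAT = str.maketrans(CYRILLIC_KEYS, LATIN_KEYS)


def english_on_russian_keyboard(text: str) -> str:
    """Hypothetical helper: English typed with the Russian layout active (en_cyr-style text)."""
    return text.lower().translate(LAT_TO_CYR)


def russian_on_latin_keyboard(text: str) -> str:
    """Hypothetical helper: Russian typed with the Latin layout active (ru_lat-style text)."""
    return text.lower().translate(CYR_TO_LAT)


def russian_misread_as_latin1(text: str) -> str:
    """Hypothetical helper: Windows-1251 bytes decoded as Latin-1 (an assumed wrong-encoding case)."""
    return text.encode("cp1251").decode("latin-1")


if __name__ == "__main__":
    print(english_on_russian_keyboard("hello"))
    print(russian_on_latin_keyboard("привет"))
    print(russian_misread_as_latin1("привет"))
```

Text transformed like this has character n-gram statistics unlike those of ordinary English or Russian, so a separate model is needed for each case.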

Event Timeline

TJones triaged this task as High priority. Jan 16 2019, 3:14 PM
TJones created this task.

Change 484752 had a related patch set uploaded (by Tjones; owner: Tjones):
[wikimedia/textcat@master] Add Wrong-Keyboard and Wrong-Encoding Models to TextCat

https://gerrit.wikimedia.org/r/484752

Updated Perl models on GitHub. PHP models are awaiting review in the patch above.

Change 484752 merged by jenkins-bot:
[wikimedia/textcat@master] Add Wrong-Keyboard and Wrong-Encoding Models to TextCat

https://gerrit.wikimedia.org/r/484752