Evaluate need for Myanmar Zawgyi encoding detection/transliteration in search
Open, Low, Public

Description

Before considering some form of Zawgyi detection and transliteration for Myanmar-language wikis, we should:

  • get a sense of the frequency of Zawgyi-encoded queries
  • get a sense of the accuracy of Google’s detection library on short (i.e., query-length) strings
  • evaluate available transliteration tools and transliteration complexity
  • maybe evaluate other detection tools that would be more convenient to implement (like TextCat)
  • evaluate detection and transliteration on non-Myanmar text, too
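As a rough illustration of the first and last points, the sketch below estimates what fraction of Myanmar-script queries a detector flags as Zawgyi, skipping queries with no Myanmar-block characters. The `detector` callable, the `zawgyi_rate` helper, and the 0.95 threshold are all hypothetical placeholders and not the API of any particular library; a real detector returning a probability between 0 and 1 would be wrapped and passed in. It also assumes that both Zawgyi and Unicode-encoded Myanmar text fall in the U+1000–U+109F block.

```python
from typing import Callable, Dict, Iterable

# Unicode "Myanmar" block; assumed to cover Zawgyi-encoded text as well.
MYANMAR_START, MYANMAR_END = 0x1000, 0x109F


def has_myanmar(text: str) -> bool:
    """Return True if any character falls in the Myanmar block."""
    return any(MYANMAR_START <= ord(ch) <= MYANMAR_END for ch in text)


def zawgyi_rate(
    queries: Iterable[str],
    detector: Callable[[str], float],  # hypothetical: returns P(Zawgyi) in [0, 1]
    threshold: float = 0.95,           # hypothetical cutoff, to be tuned
) -> Dict[str, float]:
    """Count queries flagged as Zawgyi among those containing Myanmar script."""
    total = myanmar = flagged = 0
    for query in queries:
        total += 1
        if not has_myanmar(query):
            continue  # non-Myanmar text is never sent to the detector
        myanmar += 1
        if detector(query) >= threshold:
            flagged += 1
    rate = flagged / myanmar if myanmar else 0.0
    return {"total": total, "myanmar": myanmar, "flagged": flagged, "rate": rate}
```

Running the same counts over a hand-labelled sample of short queries would also give a first read on detector accuracy at query length.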

I've also written up more details, adapted from a previous email conversation about this, in my notes on MediaWiki.