https://bn.wikipedia.org/wiki/%E0%A6%AC%E0%A6%BF%E0%A6%B6%E0%A7%87%E0%A6%B7:%E0%A6%AC%E0%A6%BF%E0%A6%B7%E0%A7%9F%E0%A6%AC%E0%A6%B8%E0%A7%8D%E0%A6%A4%E0%A7%81_%E0%A6%85%E0%A6%A8%E0%A7%81%E0%A6%AC%E0%A6%BE%E0%A6%A6 shows a message saying that the special page does not exist. The same happens for Special:CXStats.
- mediawiki/extensions/ContentTranslation (master): Normalize special page aliases (bn) to a form MediaWiki can understand
- mediawiki/extensions/ContentTranslation (master): Revert "ContentTranslation.alias.php translations for Bengali"
Copying from chat to permanent storage.
Basically, the UtfNormal\Validator::cleanUp normalization call forces everything in MediaWiki into NFC. The exception is content from i18n files, which is trusted and not normalized. When we got most of these translations via translatewiki.net, the normalization was applied automatically there, but manual submissions bypass it.
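The mismatch is easy to reproduce. Here is a minimal sketch, assuming the standalone wikimedia/utfnormal Composer package (the library that provides UtfNormal\Validator); inside MediaWiki the class is already loaded:

```php
<?php
// Minimal sketch, assuming wikimedia/utfnormal is installed via Composer.
use UtfNormal\Validator;

require_once __DIR__ . '/vendor/autoload.php';

$precomposed = "\u{09DF}";         // য় as one code point, as in the i18n file
$decomposed  = "\u{09AF}\u{09BC}"; // য followed by nukta

// U+09DF is a Unicode composition exclusion, so NFC *decomposes* it.
// cleanUp() is what MediaWiki applies to user input such as page titles:
var_dump( Validator::cleanUp( $precomposed ) === $decomposed ); // bool(true)

// The raw i18n value is trusted and never normalized, so the byte-level
// comparison with the normalized title fails:
var_dump( $precomposed === $decomposed ); // bool(false)
```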
- one possible fix is to call $wgContLang->normalize() on input from PHP files (see the first sketch after this list)
- a bigger question is whether NFC is the right thing for MediaWiki to use
- apply mine or Amir's patch in the short term (remember to rebuild the l10n cache, e.g. with maintenance/rebuildLocalisationCache.php)
- consider adding a safeguard for input from i18n files to avoid or catch this type of error
- could also add a unit test that verifies all i18n file contents (see the second sketch after this list)... maybe a better tradeoff than slowing down runtime performance
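Here is a minimal sketch of the first option above, normalizing aliases as they are loaded from a PHP file. The normalizeAliases() helper and where it would be hooked in are hypothetical; only $wgContLang->normalize() is existing MediaWiki API:

```php
<?php
// Minimal sketch: NFC-normalize alias data read from a *.alias.php file.
// normalizeAliases() is a hypothetical helper.

/**
 * @param array $aliases Map of canonical special page name => list of
 *  localized aliases, as found in e.g. $specialPageAliases['bn'].
 * @return array The same structure with every alias normalized.
 */
function normalizeAliases( array $aliases ) {
	global $wgContLang;
	foreach ( $aliases as $canonical => $localized ) {
		$aliases[$canonical] = array_map(
			[ $wgContLang, 'normalize' ],
			$localized
		);
	}
	return $aliases;
}

// Hypothetical usage at load time:
// $specialPageAliases['bn'] = normalizeAliases( $specialPageAliases['bn'] );
```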
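And a sketch of the unit test idea, assuming a plain PHPUnit structure test can require the alias file directly; the class name and relative path are hypothetical:

```php
<?php
// Sketch of a PHPUnit structure test: every alias shipped in the alias
// file must already equal its NFC-normalized form, so that loading it
// unnormalized is safe.

use PHPUnit\Framework\TestCase;
use UtfNormal\Validator;

class AliasNormalizationTest extends TestCase {

	public function testAliasesAreNfcNormalized() {
		$specialPageAliases = [];
		require __DIR__ . '/../ContentTranslation.alias.php';

		foreach ( $specialPageAliases as $lang => $pages ) {
			foreach ( $pages as $canonical => $aliases ) {
				foreach ( $aliases as $alias ) {
					$this->assertSame(
						Validator::cleanUp( $alias ),
						$alias,
						"Alias '$alias' for $canonical ($lang) is not in NFC"
					);
				}
			}
		}
	}
}
```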
Additional information: MediaWiki interpreted the character য় (U+09DF) as য (U+09AF) followed by nukta (U+09BC). This is because U+09DF is on the Unicode composition exclusion list, so NFC normalization decomposes it instead of keeping the precomposed form.
Using Amir's patch will revert the translation, but the problem will recur if the strings are translated manually again.