
zh falls back to zh-hans, why?
Open, Needs Triage, Public

Description

Recently I've been working on a MediaWiki site with the ULS and Translate extensions installed. The default content language of the site is en. I created a page in English, let's say with the title Project:About. Then I translated the page into zh, hoping that the language variant converter would do its job. That works well.

However, there is one thing that still bugs me. If I use the MyLanguage redirect in a wikilink, i.e. Special:MyLanguage/Project:About, it takes me to either the en or the zh subpage, depending on my UI language. My current language setting is zh-cn, so I guessed it would ultimately fall back to zh (especially after looking at T50292). But I kept being taken to the English page.

After looking into File:MediaWiki_fallback_chains.svg, I was surprised to find that zh falls back to zh-hans, instead of the other way around, and that the fallback chains are

  • zh-CN -> zh-Hans -> en
  • zh -> zh-Hans -> en
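To make the effect of these chains concrete, here is a minimal sketch of the lookup. It is a simplified model, not MediaWiki's actual code: the function name `resolve_my_language`, the `FALLBACKS` table, and the assumption that the redirect tries the UI language first and then each fallback in order are all illustrative.

```python
# Simplified model of a Special:MyLanguage-style lookup.
# Assumption: the redirect tries the UI language, then each fallback in order.
FALLBACKS = {
    "zh-cn": ["zh-hans", "en"],
    "zh": ["zh-hans", "en"],
}


def resolve_my_language(base, ui_lang, existing_pages, fallbacks=FALLBACKS):
    """Return the first existing language subpage of `base` for `ui_lang`."""
    for code in [ui_lang] + fallbacks.get(ui_lang, ["en"]):
        candidate = f"{base}/{code}"
        if candidate in existing_pages:
            return candidate
    # No subpage matched any language in the chain: use the base page.
    return base


# Pages on the wiki described above: the source page plus en and zh subpages.
pages = {"Project:About", "Project:About/en", "Project:About/zh"}
print(resolve_my_language("Project:About", "zh-cn", pages))
```

Under this model, zh-cn never reaches the zh subpage because zh does not appear anywhere in its chain, which matches the behaviour described above.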

I'm not sure, but I guess the reason MW uses the zh -> zh-hans fallback is that translations of the MediaWiki software are provided as zh-hans, zh-hant (or zh-xxxx), but never as zh alone. With this fallback, those who choose zh as their UI language effectively see the UI in zh-hans. However, zh is a macrolanguage, and IMO it should never be offered as a UI language (I'm not talking about content language, which supports variant conversion).

So, back to my problem: I'm wondering if there is something I can do to fix this. Do I need to patch my MediaWiki code locally? And out of curiosity, why should zh fall back to zh-hans?

Event Timeline

CXuesong created this task. Dec 9 2017, 6:09 AM

My guess is that this is because in CLDR, the Simplified Chinese content just uses zh instead of zh_hans.

CXuesong added a comment. Edited Dec 10 2017, 7:04 AM

My guess is that this is because in CLDR, the Simplified Chinese content just uses zh instead of zh_hans.

Thanks for your explanation. But while CLDR distinguishes between zh_hans and zh_hant, there seems to be no dedicated zh localized data. See the file tree here: https://github.com/wikimedia/mediawiki-extensions-cldr/tree/master/CldrNames

I ended up simply changing the fallback order, i.e. the values of the $fallback variables:

  • In /languages/messages/MessagesZh_cn.php from "zh-hans" to "zh-hans, zh"
  • In /languages/messages/MessagesZh_tw.php from "zh-hant, zh-hans" to "zh-hant, zh-hans, zh"
  • and so on

I've also removed the $fallback variable in /languages/messages/MessagesZh.php, so that zh falls back directly to en.

Now the Special:MyLanguage redirect works as I expected. There is one side effect, though: users whose UI language is exactly zh will see the English UI, but I think that's okay for me. I will probably need to re-apply the patch to these files after each version update.
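The effect of the patch, including the side effect for zh, can be sketched as follows. This is a simplified model under stated assumptions, not MediaWiki's own logic: the helper names (`parse_fallback`, `chain`, `first_available`) are hypothetical, each $fallback value is treated as an explicit comma-separated list, and en is assumed to be the implicit final fallback.

```python
def parse_fallback(value):
    """Split a comma-separated $fallback value into a list of language codes."""
    return [code.strip() for code in value.split(",") if code.strip()]


def chain(lang, fallback_values):
    """Full lookup order for `lang`: itself, its listed fallbacks, then en."""
    codes = [lang] + parse_fallback(fallback_values.get(lang, ""))
    if "en" not in codes:
        codes.append("en")  # assumption: en is always the last resort
    return codes


def first_available(lang, fallback_values, available):
    """First code in the chain for which content exists, or None."""
    for code in chain(lang, fallback_values):
        if code in available:
            return code
    return None


# $fallback values before and after the local patch described above.
ORIGINAL = {"zh": "zh-hans", "zh-cn": "zh-hans", "zh-tw": "zh-hant, zh-hans"}
PATCHED = {"zh-cn": "zh-hans, zh", "zh-tw": "zh-hant, zh-hans, zh"}  # zh entry removed

page_langs = {"en", "zh"}            # translations of Project:About
ui_langs = {"en", "zh-hans", "zh-hant"}  # languages with UI messages

print(first_available("zh-cn", ORIGINAL, page_langs))  # page picked before the patch
print(first_available("zh-cn", PATCHED, page_langs))   # page picked after the patch
print(first_available("zh", PATCHED, ui_langs))        # UI language for zh after the patch
```

In this model the patched chains reach the zh translation for zh-cn and zh-tw, while a user whose UI language is exactly zh lands on en, because zh no longer lists zh-hans as a fallback.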