Wikipedia portal: adjust the languages used for Chinese translations
Open, Stalled, Low, Public

Description

It looks like we might be handling the Chinese translations incorrectly: we currently only look at the primary part of the language code from translatewiki (zh instead of zh-{{variant}}), so we're not showing all possible translations, such as zh-yue for Cantonese.

Let's take a look and see what we can do about this.
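
To make the problem concrete, here is a minimal sketch of matching on the full language code and falling back to the primary part only when no variant translation exists. The function names and the list of available codes are hypothetical illustrations, not the portal's actual code.

```javascript
// Hypothetical helper: list the l10n codes to try for a browser language,
// most specific first, e.g. "zh-Hant-TW" -> zh-hant-tw, zh-hant, zh.
function candidateCodes(browserLang) {
  const parts = browserLang.toLowerCase().split('-');
  const candidates = [];
  for (let i = parts.length; i > 0; i--) {
    candidates.push(parts.slice(0, i).join('-'));
  }
  return candidates;
}

// Hypothetical helper: return the first candidate that has a translation,
// or null when none of them do.
function pickTranslation(browserLang, availableCodes) {
  const match = candidateCodes(browserLang).find((code) => availableCodes.includes(code));
  return match || null;
}
```

With an assumed list of available codes such as ['zh', 'zh-yue'], pickTranslation('zh-yue', ...) selects zh-yue, whereas looking only at the primary part would always select zh.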

debt created this task. Jul 25 2017, 8:08 PM
Restricted Application added a subscriber: Aklapper. Jul 25 2017, 8:08 PM

Change 374535 had a related patch set uploaded (by Jdrewniak; owner: Jdrewniak):
[wikimedia/portals@master] Verifying l10n file exists before ajax request.

https://gerrit.wikimedia.org/r/374535

Change 383337 had a related patch set uploaded (by Jdrewniak; owner: Jdrewniak):
[wikimedia/portals@master] Exposing available translations in JS variable

https://gerrit.wikimedia.org/r/383337

Change 383338 had a related patch set uploaded (by Jdrewniak; owner: Jdrewniak):
[wikimedia/portals@master] Checking if l10n available before translating page

https://gerrit.wikimedia.org/r/383338

Change 383339 had a related patch set uploaded (by Jdrewniak; owner: Jdrewniak):
[wikimedia/portals@master] Use the browsers full language codes for translation

https://gerrit.wikimedia.org/r/383339

Change 383340 had a related patch set uploaded (by Jdrewniak; owner: Jdrewniak):
[wikimedia/portals@master] Exposing zh variant translations

https://gerrit.wikimedia.org/r/383340

Change 374535 abandoned by Jdrewniak:
Verifying l10n file exists before ajax request.

Reason:
abandoning this patch in favor of the patches on this topic https://gerrit.wikimedia.org/r/#/q/topic:T171647-exposing-chinese-variants

https://gerrit.wikimedia.org/r/374535

mxn added a subscriber: mxn. Oct 11 2017, 4:56 AM
mxn added a comment. Edited Oct 11 2017, 5:12 AM

wm-portal.js handled the Chinese localization by baking the Traditional Chinese strings into the page, along with Simplified Chinese versions in data-convert-hans and data-converttitle-hans attributes. convertChinese() would then swap in the Simplified Chinese strings if the browser’s language was zh-hans, zh-cn, zh-sg, or zh-my.
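
The swap described above could look roughly like the following. This is a sketch only, not the real convertChinese(): the plain object standing in for a DOM element, the function names, and the prefix matching on longer tags such as zh-Hans-CN are all assumptions.

```javascript
// Browser languages that receive Simplified Chinese, as listed above.
const SIMPLIFIED_LANGS = ['zh-hans', 'zh-cn', 'zh-sg', 'zh-my'];

// Assumption: longer tags such as "zh-Hans-CN" are matched by prefix.
function prefersSimplified(browserLang) {
  const lang = browserLang.toLowerCase();
  return SIMPLIFIED_LANGS.some((code) => lang === code || lang.startsWith(code + '-'));
}

// `item` is a hypothetical stand-in for an element: `text` holds the
// baked-in Traditional string, `attributes` holds data-convert-hans.
function convertString(item, browserLang) {
  const simplified = item.attributes['data-convert-hans'];
  return prefersSimplified(browserLang) && simplified ? simplified : item.text;
}
```

Any language outside the list, including plain zh, keeps the baked-in Traditional string in this sketch.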

None of the major browsers have an official translation in Min Nan, so I’d expect few users to have configured their browsers to prefer Min Nan independently of the UI language. That said, I think it’s important to set lang="nan": lang="zh-min-nan" causes browsers to choose CJK fonts, whereas the Min Nan Wikipedia uses the Latin-based Pe̍h-ōe-jī alphabet exclusively. Someday browsers may recognize standard three-letter language codes for font substitution, but that certainly won’t happen for ad-hoc codes like zh-min-nan.

mxn added a comment. Oct 11 2017, 5:47 AM

We created the JSON files in l10n from Module:Project portal/wikis, naming the files based on wiki subdomains (e.g., zh-yue.json). Meanwhile, translatewiki.net lays down JSON files named for ISO 639 codes (hence yue.json). So for some languages, we have some properties in one file and some in another. Had anyone translated the strings into Literary Chinese (lzh) at translatewiki.net, we would’ve had both zh-classical.json and lzh.json for the same wiki.
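
One way to reconcile the two naming schemes would be to normalize the subdomain-based file names to ISO 639 codes before merging. The sketch below covers only the pairs mentioned in this task; the function name is hypothetical and the table is not a complete list of wikis.

```javascript
// Wiki subdomain -> ISO 639 code, limited to the pairs mentioned in this task.
const SUBDOMAIN_TO_ISO = new Map([
  ['zh-yue', 'yue'],
  ['zh-classical', 'lzh'],
  ['zh-min-nan', 'nan'],
]);

// Hypothetical helper: map a subdomain-based l10n file name to the
// ISO-639-based name; unknown codes are returned unchanged.
function isoFileName(subdomainFileName) {
  const code = subdomainFileName.replace(/\.json$/, '');
  return (SUBDOMAIN_TO_ISO.get(code) || code) + '.json';
}
```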

@mxn wrote:

... if the browser’s language was zh-hans, zh-cn, zh-sg, or zh-my.

I don't think zh-my is needed as no actual users will use it.

@mxn, good point on lang="nan"; I'll revert that in the patch here.
Also, to your point on yue.json and zh-yue.json: in this specific instance the translation strings don't collide, so these two files can be merged without much conflict. If such a conflict did arise in the future, as it could with lzh, I suppose the newer file could take precedence?
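
A minimal sketch of that "newer file takes precedence" merge, assuming the translation files are flat key/value objects (nested structures would need a deep merge). The function names are illustrative only.

```javascript
// Shallow merge: keys from `newer` override keys from `older`.
function mergeTranslations(older, newer) {
  return Object.assign({}, older, newer);
}

// List keys present in both files with different values, so that
// collisions can be reviewed before the merge.
function findCollisions(a, b) {
  return Object.keys(a).filter(
    (key) => Object.prototype.hasOwnProperty.call(b, key) && a[key] !== b[key]
  );
}
```

Running findCollisions() on the two files first would confirm whether the strings really don't collide before relying on the precedence rule.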

As for the language codes, I apologize for my unfamiliarity with the correct mappings. In Chrome, for example, zh-tw is described as "Chinese (traditional)", which I assumed could be mapped to the zh-classical wiki. As it stands, though, I'm still not sure for which browser language codes we should be showing the zh-classical wiki in the top ten.

For the rest of the Chinese language codes, would the following mapping be correct?

browser code            translation file
zh, zh-hant             zh-hant.json
zh-hans, zh-cn, zh-sg   zh-hans.json
zh-hk                   zh-yue.json + yue.json
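
Expressed as a lookup, the proposed mapping in the table above would be something like the sketch below. The mapping is still an open question in this task, and the names are hypothetical.

```javascript
// Proposed (unconfirmed) browser code -> translation files mapping.
const PROPOSED_FILES = new Map([
  ['zh', ['zh-hant.json']],
  ['zh-hant', ['zh-hant.json']],
  ['zh-hans', ['zh-hans.json']],
  ['zh-cn', ['zh-hans.json']],
  ['zh-sg', ['zh-hans.json']],
  ['zh-hk', ['zh-yue.json', 'yue.json']],
]);

// Returns the files to load for a browser language, or an empty array
// when the code is not covered by the proposal (e.g. zh-tw).
function translationFiles(browserLang) {
  return PROPOSED_FILES.get(browserLang.toLowerCase()) || [];
}
```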

Currently, we're only serving the zh.json file to anyone who has a zh-* browser language. The zh-hant.json file seems to be a more complete translation than the zh.json file. Would it be preferable to serve that in place of zh.json, or should we combine the two?

debt changed the task status from Open to Stalled. Nov 2 2017, 3:52 PM

If there is anyone that can help with this, we'd really appreciate the feedback! :)

debt lowered the priority of this task from Normal to Low.

Moving to the backlog, as we don't have a clear path forward at this time on how to deal with all the various Chinese language variants.

Moving to the backlog board for review at a future date.

debt removed Jdrewniak as the assignee of this task. Dec 12 2017, 4:16 PM
debt added a subscriber: Jdrewniak.