
Update our list of languages and variants.
Open, Medium · Public · 4 Estimated Story Points

Description

This is something that hasn't sat right with me for a while:

If we look at any article in Chinese (on desktop or mobile web), we are given the option to select which variant of Chinese to read the article in. There are no fewer than six of these variants. So why do we offer only two variants in the app? Should we not offer all of them?

The API that we currently use for building our list of languages is the sitematrix API:

https://en.wikipedia.org/w/api.php?action=sitematrix&format=json&smtype=language&smstate=all&smlangprop=code%7Cname%7Csite%7Cdir%7Clocalname&smsiteprop=url%7Cdbname%7Ccode%7Csitename%7Clang&formatversion=2

That API returns a list of sites (i.e. wikis), each with the base language for that wiki. On top of that, we have special-case code that inserts the variants zh-hans and zh-hant for zhwiki.

It might be cleaner, and much more complete, to use the siteinfo API:

https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&format=json&meta=siteinfo&formatversion=2&siprop=languages%7Clanguagevariants

This gives us a list of language codes (instead of site codes) and their corresponding names (which are actually more correct than the names given by sitematrix). The names are the local language names by default, but can be given in any language, e.g. English:

https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&format=json&meta=siteinfo&formatversion=2&siprop=languages%7Clanguagevariants&siinlanguagecode=en

Furthermore, this API gives us the full list of variants, not just for Chinese but for a large number of other languages. We could combine this API with the current sitematrix call to see which languages map to which wikis.
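
To illustrate the idea of combining the two calls, here is a minimal Python sketch (not the app's actual code). It assumes the siteinfo response exposes `query.languages` as a list of `code`/`name` objects and `query.languagevariants` as an object keyed by base language and then by variant code, and that sitematrix entries sit under numeric string keys with a `site` list. The sample data is hand-written and abridged, and `variants_by_wiki` is a hypothetical helper name.

```python
# Hand-written, abridged samples in the assumed response shapes
# (illustrative values, not real API output; fallback lists left empty).
SAMPLE_SITEINFO = {
    "query": {
        "languages": [
            {"code": "en", "name": "English"},
            {"code": "zh", "name": "Chinese"},
            {"code": "zh-hans", "name": "Simplified Chinese"},
            {"code": "zh-hant", "name": "Traditional Chinese"},
        ],
        "languagevariants": {
            "zh": {
                "zh": {"fallbacks": []},
                "zh-hans": {"fallbacks": []},
                "zh-hant": {"fallbacks": []},
            },
        },
    }
}

SAMPLE_SITEMATRIX = {
    "sitematrix": {
        "count": 2,
        "0": {"code": "en", "site": [{"dbname": "enwiki", "code": "wiki"}]},
        "1": {"code": "zh", "site": [{"dbname": "zhwiki", "code": "wiki"}]},
        "specials": [],
    }
}


def variants_by_wiki(siteinfo, sitematrix):
    """Map each Wikipedia dbname to a list of (variant code, display name)."""
    query = siteinfo["query"]
    names = {lang["code"]: lang["name"] for lang in query.get("languages", [])}
    variants = query.get("languagevariants", {})
    result = {}
    for key, entry in sitematrix["sitematrix"].items():
        if not key.isdigit():
            continue  # skip non-language keys such as "count" and "specials"
        language = entry["code"]
        for site in entry.get("site", []):
            if site.get("code") != "wiki":
                continue  # only Wikipedia sites, not Wiktionary etc.
            # Leave out the base code itself; fall back to the raw code
            # when siteinfo has no display name for a variant.
            codes = [v for v in variants.get(language, {}) if v != language]
            result[site["dbname"]] = [(v, names.get(v, v)) for v in codes]
    return result


APP_LANGUAGES = variants_by_wiki(SAMPLE_SITEINFO, SAMPLE_SITEMATRIX)
```

With this approach the zh-hans/zh-hant special case would no longer need to be hard-coded: any language that siteinfo lists variants for would pick them up from the same lookup.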

Note:
Useful deep links collection for testing language variants.
https://etherpad.wikimedia.org/p/DeepLinkForTests

Test wiki for testing in-app links
https://test.wikipedia.org/wiki/Language_Variants

Event Timeline

Dbrant created this task. Sep 21 2020, 3:31 PM
Restricted Application added subscribers: Cosine02, Aklapper. Sep 21 2020, 3:31 PM
Dbrant updated the task description. Sep 21 2020, 3:32 PM
Charlotte triaged this task as Medium priority. Sep 22 2020, 4:25 PM
Charlotte set the point value for this task to 4.
cooltey updated the task description. Oct 13 2020, 8:47 PM
cooltey updated the task description.

It looks like Kazakh (kk) articles are not shown in the correct language variant; this may be a mobile-html or MediaWiki API issue.
https://kk.wikipedia.org/wiki/Үйде_жалғыз_қалғанда

https://kk.wikipedia.org/api/rest_v1/page/mobile-html/Үйде_жалғыз_қалғанда (default: kk-kz)
https://kk.wikipedia.org/api/rest_v1/page/mobile-html/Үйде_жалғыз_қалғанда (with Accept-Language: kk-latn; only the article title changes)
https://kk.wikipedia.org/api/rest_v1/page/mobile-html/Үйде_жалғыз_қалғанда (with Accept-Language: kk-arab; again, only the article title changes)
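
For anyone wanting to reproduce this outside the app, a minimal Python sketch of building the same request with the standard library is below. The helper name `mobile_html_request` is hypothetical; the sketch only constructs the request and does not send it.

```python
from urllib.parse import quote
from urllib.request import Request


def mobile_html_request(domain, title, variant):
    """Build a mobile-html request asking for a specific language variant."""
    # Percent-encode the (non-ASCII) title so it is safe in the URL path.
    url = f"https://{domain}/api/rest_v1/page/mobile-html/{quote(title, safe='')}"
    return Request(url, headers={"Accept-Language": variant})


req = mobile_html_request("kk.wikipedia.org", "Үйде_жалғыз_қалғанда", "kk-latn")
```

Passing `req` to `urllib.request.urlopen` and comparing the returned HTML for kk-latn, kk-arab and the default should show whether the body text changes or only the title.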

@Mholloway @MSantos @bearND Any thoughts? Please let me know if I need to create a ticket for this issue, thanks.

LGoto reassigned this task from cooltey to MSantos. Oct 14 2020, 3:42 PM
LGoto added a subscriber: cooltey.
Mholloway added a comment. Edited Oct 14 2020, 3:48 PM

PCS language variant handling currently appears to cover only zh (Chinese) and sr (Serbian) variants (and, for Chinese, only the general zh-hans and zh-hant variants). See lib/wikiLanguage.js and lib/wikiLanguageMapping.json. We'd be better off grabbing the full language variant info from siteinfo (just as @Dbrant suggests doing here in the Android app) rather than working from a hard-coded list.
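
As a rough illustration of what a data-driven lookup could look like (in Python for brevity; this is not PCS code, and `pick_variant` is a hypothetical name), the sketch below matches an Accept-Language header against a variant list in the assumed siteinfo `languagevariants` shape. The sample variant list is hand-written and abridged.

```python
# Hand-written, abridged sample in the assumed siteinfo shape.
SAMPLE_VARIANTS = {
    "kk": {"kk": {}, "kk-kz": {}, "kk-latn": {}, "kk-arab": {}},
}


def pick_variant(accept_language, base_language, languagevariants):
    """Return the best supported variant, or the base language if none match."""
    supported = {code.lower() for code in languagevariants.get(base_language, {})}
    candidates = []
    for position, item in enumerate(accept_language.split(",")):
        tag, _, params = item.strip().partition(";")
        weight = 1.0  # a tag without a q-value has the highest preference
        params = params.strip()
        if params.startswith("q="):
            try:
                weight = float(params[2:])
            except ValueError:
                weight = 0.0  # treat a malformed q-value as lowest preference
        candidates.append((-weight, position, tag.strip().lower()))
    # Highest weight first; header order breaks ties.
    for _, _, tag in sorted(candidates):
        if tag in supported:
            return tag
    return base_language


CHOSEN = pick_variant("kk-arab;q=0.5, kk-latn;q=0.9", "kk", SAMPLE_VARIANTS)
```

Because the supported set comes from the data rather than from a fixed mapping file, adding a variant on the wiki side would not require a code change in this scheme.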

Let's open a separate task for the proposed Page Content Service update.

cooltey updated the task description. Oct 15 2020, 9:48 PM
cooltey claimed this task. Oct 15 2020, 9:58 PM

"Let's open a separate task for the proposed Page Content Service update."

Sure! Thanks @Mholloway

Hi @MSantos, I assigned you to this separate task for the proposed update: T265687

ABorbaWMF added a subscriber: ABorbaWMF.

Tested all the links on the etherpad above and they all worked except for the ones that were noted as 'not working'. Tested on 2.7.50333-alpha-2020-10-22