
Update mobileapps to use new Language variant codes
Closed, ResolvedPublic

Description

Test failures in the restbase test suite revealed an issue:
language variant handling does not appear to work properly in the wikitext to mobile-html conversion.

This is the patch with the failing tests:
https://github.com/wikimedia/restbase/pull/1314

Event Timeline

Aklapper renamed this task from Languave variant handling in mobileapps is inconsistent with mw core variants to Language variant handling in mobileapps is inconsistent with mw core variants. Jan 19 2023, 9:07 AM

Change 853380 had a related patch set uploaded (by C. Scott Ananian; author: C. Scott Ananian):

[mediawiki/core@master] WIP: Use Bcp47Code when interfacing with Parsoid

https://gerrit.wikimedia.org/r/853380

Change 853380 merged by jenkins-bot:

[mediawiki/core@master] Use Bcp47Code when interfacing with Parsoid

https://gerrit.wikimedia.org/r/853380

cscott renamed this task from Language variant handling in mobileapps is inconsistent with mw core variants to Update mobileapps to use new Language variant codes. Apr 17 2023, 3:44 PM

@Jgiannelos , @cscott considering https://phabricator.wikimedia.org/T117845#8576044 - is this a blocker for replacing the sr-el / sr-ec variant codes with sr-Latn / sr-Cyrl in the mobileapps service?

The external interfaces (HTTP headers) already support sr-Latn and sr-Cyrl so T117845 is not a blocker -- T117845 is only for migrating the internal mediawiki codes.
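To illustrate the distinction, here is a minimal sketch of translating the legacy internal codes mentioned above into their BCP 47 equivalents while passing everything else through unchanged. The table and the function name `to_bcp47` are hypothetical and contain only the two Serbian pairs named in this discussion; this is not the actual content of the mobileapps mapping file.

```python
# Hypothetical lookup table: only the two pairs named in this task.
LEGACY_TO_BCP47 = {
    "sr-ec": "sr-Cyrl",  # Serbian, Cyrillic script
    "sr-el": "sr-Latn",  # Serbian, Latin script
}


def to_bcp47(code: str) -> str:
    """Return the BCP 47 form of a legacy variant code.

    Codes without an entry in the table are returned unchanged, so
    callers that already send sr-Latn or sr-Cyrl keep working.
    """
    return LEGACY_TO_BCP47.get(code.lower(), code)


print(to_bcp47("sr-el"))
print(to_bcp47("sr-Cyrl"))
```

Falling back to the input for unknown codes is one way to keep old and new clients working at the same time.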

Change 918550 had a related patch set uploaded (by Vadim Kovalenko; author: Vadim Kovalenko):

[mediawiki/services/mobileapps@master] Mobileapps: Update mobileapps to use new Language variant codes

https://gerrit.wikimedia.org/r/918550

I updated the language codes listed in lib/wikiLanguageMapping.json. As a guideline, I used ISO 15924 and ISO 3166-1.

  1. ISO 15924 - Subtags whose 'Type' field is 'script' (in other words, subtags defined by ISO 15924) MUST use titlecase. Example: Cyrl, Latn, Arab.
  2. ISO 3166-1 - Subtags whose 'Type' field is 'region' (in other words, the non-numeric region subtags defined by ISO 3166-1) MUST use all uppercase. Example: BR (for Brazil), CA (for Canada).

See RFC5646#2.1.1 and RFC5646#3.1.4 for more details.
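For reference, the casing conventions from the RFC can be sketched as a small helper. This is an illustration only, not the code from the patch, and the function name `normalize_bcp47_case` is made up. It assumes a well-formed tag and only adjusts letter case: two-letter subtags become uppercase and four-letter alphabetic subtags become titlecase, unless they are the first subtag or follow a singleton such as `x`; everything else is lowercased.

```python
def normalize_bcp47_case(tag: str) -> str:
    """Apply the RFC 5646 section 2.1.1 casing conventions to a tag."""
    out = []
    after_singleton = False
    for index, subtag in enumerate(tag.split("-")):
        eligible = index > 0 and not after_singleton and subtag.isalpha()
        if eligible and len(subtag) == 4:
            out.append(subtag.title())   # script, e.g. Latn, Cyrl
        elif eligible and len(subtag) == 2:
            out.append(subtag.upper())   # region, e.g. BR, CA
        else:
            out.append(subtag.lower())   # language and all other subtags
        if len(subtag) == 1:
            # Subtags after a singleton (e.g. "x") stay lowercase.
            after_singleton = True
    return "-".join(out)


for example in ("sr-latn", "SR-CYRL", "pt-br", "zh-hans", "en-ca-x-ca"):
    print(example, "->", normalize_bcp47_case(example))
```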

Change 918550 merged by jenkins-bot:

[mediawiki/services/mobileapps@master] Mobileapps: Update mobileapps to use new Language variant codes

https://gerrit.wikimedia.org/r/918550

Update from a quick test in the simulator against the beta cluster: backwards compatibility appears to be working.

https://sr.wikipedia.beta.wmflabs.org/api/rest_v1/page/mobile-html/BigPage

Visited this page on SR Wikipedia, sending sr-el in the Accept-Language header, and it correctly displays Latin characters. Sending sr-ec in the header correctly displays Cyrillic characters.

[Screenshot: sr-el (1×575 px, 332 KB)]

[Screenshot: sr-ec (1×575 px, 328 KB)]

https://zh.wikipedia.beta.wmflabs.org/api/rest_v1/page/mobile-html/%E6%B8%AF%E9%90%B5%E5%B8%82%E5%8D%80%E7%B6%AB%E9%9F%93%E8%A3%BD%E5%88%97%E8%BB%8A

Went to this page with both zh-tw and zh-hans in the Accept-Language header. A screenshot of both article views indicates the characters are changing in response.

[Screenshot: zh-tw (1×575 px, 305 KB)]

[Screenshot: zh-hans (1×575 px, 302 KB)]
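The manual checks above could be reproduced outside the apps with a request along these lines. This is a rough sketch using only the Python standard library; the helper name `build_variant_request` is made up for illustration, and the block only builds the request object without sending it.

```python
import urllib.request

# Beta cluster URL taken from the test described above.
SR_PAGE = "https://sr.wikipedia.beta.wmflabs.org/api/rest_v1/page/mobile-html/BigPage"


def build_variant_request(url: str, variant: str) -> urllib.request.Request:
    """Build a GET request that asks for a specific language variant."""
    request = urllib.request.Request(url)
    request.add_header("Accept-Language", variant)
    return request


# One request per variant; passing either to urllib.request.urlopen()
# would actually send it.
latin = build_variant_request(SR_PAGE, "sr-el")
cyrillic = build_variant_request(SR_PAGE, "sr-ec")
print(latin.header_items())
print(cyrillic.header_items())
```

Comparing the two response bodies should show the same article rendered in Latin and Cyrillic script respectively, as in the screenshots.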

Confirmed that it's working as expected for Android, too.

Change #1017420 had a related patch set uploaded (by C. Scott Ananian; author: C. Scott Ananian):

[mediawiki/core@master] Message::inLanguage(): accept Bcp47Code as well as string and Language

https://gerrit.wikimedia.org/r/1017420

Change #1017420 merged by jenkins-bot:

[mediawiki/core@master] Message::inLanguage(): accept Bcp47Code as well as string and Language

https://gerrit.wikimedia.org/r/1017420