Rename the language codes sr-ec and sr-el to the BCP 47 conform codes sr-Cyrl and sr-Latn
Open, Needs Triage, Public

Description

The language codes sr-ec (Serbian in Cyrillic script) and sr-el (Serbian in Latin script) do not conform to BCP 47. BCP 47 explicitly declares sr-Cyrl and sr-Latn as language codes.

The internal language codes can be in lower case: sr-cyrl and sr-latn. LanguageCode::bcp47() adds the capital letters when it produces the BCP 47 form.
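
To illustrate the casing rule, here is a minimal sketch in PHP. It is not the implementation of LanguageCode::bcp47(); the function name sketchBcp47Case is hypothetical, and it only covers the common language-script-region shape. One stated assumption: everything after a single-character subtag (such as x) is left in lower case.

```php
<?php
// Hypothetical helper for illustration only; not MediaWiki's LanguageCode::bcp47().
function sketchBcp47Case( string $code ): string {
	$parts = explode( '-', strtolower( $code ) );
	foreach ( $parts as $i => $part ) {
		if ( $i === 0 ) {
			continue; // primary language subtag stays in lower case
		}
		if ( strlen( $part ) === 1 ) {
			break; // assumption: leave everything after a singleton in lower case
		}
		if ( strlen( $part ) === 4 && ctype_alpha( $part ) ) {
			$parts[$i] = ucfirst( $part ); // script subtag: title case
		} elseif ( strlen( $part ) === 2 && ctype_alpha( $part ) ) {
			$parts[$i] = strtoupper( $part ); // region subtag: upper case
		}
	}
	return implode( '-', $parts );
}

echo sketchBcp47Case( 'sr-cyrl' ), "\n";
echo sketchBcp47Case( 'sr-latn' ), "\n";
```

With the internal codes as input, the sketch yields sr-Cyrl and sr-Latn, which is the form the task title asks for.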

The current language codes should remain available as aliases for compatibility.

Details

Related Changes in Gerrit:
Repo                                            Branch      Lines +/-
mediawiki/core                                  master      +954 -893
mediawiki/core                                  master      +0 -0
mediawiki/core                                  master      +104 -47
mediawiki/core                                  master      +902 -1
mediawiki/core                                  master      +45 -89
mediawiki/core                                  master      +12 -913
mediawiki/extensions/Translate                  master      +6 -6
mediawiki/extensions/UniversalLanguageSelector  master      +12 -12
mediawiki/extensions/DiscussionTools            master      +16 -16
mediawiki/extensions/Cite                       master      +0 -0
mediawiki/core                                  master      +194 -44
mediawiki/extensions/InputBox                   master      +0 -0
mediawiki/extensions/ParserFunctions            master      +2 -2
mediawiki/extensions/MobileFrontend             master      +8 -8
mediawiki/extensions/WikimediaMessages          master      +4 -2
mediawiki/extensions/Wikibase                   master      +3 -3
mediawiki/extensions/Gadgets                    master      +4 -4
mediawiki/extensions/CategoryTree               master      +4 -4
mediawiki/extensions/Echo                       master      +2 -2
mediawiki/extensions/LiquidThreads              master      +4 -4
mediawiki/extensions/FlaggedRevs                master      +2 -2
mediawiki/extensions/Thanks                     master      +2 -2
mediawiki/extensions/cldr                       master      +13 -13
operations/deployment-charts                    master      +2 -0
mediawiki/extensions/InputBox                   master      +4 -4
mediawiki/extensions/Wikibase                   master      +16 -0
mediawiki/core                                  master      +21 -10
operations/puppet                               production  +6 -0
operations/mediawiki-config                     master      +2 -0
translatewiki                                   master      +25 -21
mediawiki/extensions/ExtJSBase                  master      +2 -2
operations/mediawiki-config                     master      +14 -0
mediawiki/extensions/Wikibase                   master      +10 -6

Event Timeline

Fomafix renamed this task from Rename the language codes sr-el and sr-ec to the BCP 47 conform codes sr-Latn and sr-Cyrl to Rename the language codes sr-ec and sr-el to the BCP 47 conform codes sr-Cyrl and sr-Latn. (Jul 25 2019, 5:37 AM)
Fomafix updated the task description.

Change 368248 had a related patch set uploaded (by Fomafix; owner: Fomafix):
[operations/puppet@production] Add additional aliases for sr-cyrl and sr-latn next to sr-ec and sr-el

https://gerrit.wikimedia.org/r/368248

As part of T174601 (Change the language codes of Sakizaya from "ais" (retired by SIL) to "szy" everywhere, and add it to Names.php), we now have tooling to move messages from one language code to another, both in translatewiki.net and in the repositories.

This is not sufficient, however. Great care should be taken to set up proper language fallbacks, make sure the variant URLs still work, be clear about which language codes will be used, figure out migrations in Wikidata, etc.

All of this should be explored, and a plan should be written in the task summary. Feedback should be requested from relevant parties; it might be worth going through an RfC or similar process to make sure that happens.

Change 522144 abandoned by Nikerabbit:
[translatewiki@master] Rename language codes sr-ec and sr-el to sr-cyrl and sr-latn

Reason:
This requires a migration plan.

https://gerrit.wikimedia.org/r/522144

Is there a way to define sr-cyrl and sr-latn as aliases for sr-ec and sr-el?

Change 521053 abandoned by Addshore:

[mediawiki/extensions/Wikibase@master] Restore tests with language codes 'sr-cyrl' and 'sr-latn'

Reason:

Abandoning all Wikibase.git patches that have not been touched since 2019.
If you want to revive this patch, or keep it around for your reference, please reopen it!
We just got to below 100 open changes for Wikibase.git!

https://gerrit.wikimedia.org/r/521053

Change 287141 abandoned by Addshore:

[mediawiki/extensions/Wikibase@master] Rename language codes sr-ec and sr-el to sr-cyrl and sr-latn

Reason:

Abandoning all Wikibase.git patches that have not been touched since 2019.
If you want to revive this patch, or keep it around for your reference, please reopen it!
We just got to below 100 open changes for Wikibase.git!

https://gerrit.wikimedia.org/r/287141

Change 375616 had a related patch set uploaded (by Fomafix; author: Fomafix):

[operations/mediawiki-config@master] Add language codes sr-cyrl and sr-latn next to sr-ec and sr-el

https://gerrit.wikimedia.org/r/375616

Change 522354 abandoned by Jdlrobson:

[mediawiki/extensions/MobileFrontend@master] Rename language codes sr-ec and sr-el to sr-cyrl and sr-latn

Reason:

Clearing out the review queue for 2022. Please restore if the core patch gets renewed traction.

https://gerrit.wikimedia.org/r/522354

Change 790357 had a related patch set uploaded (by Fomafix; author: Fomafix):

[operations/deployment-charts@master] Add additional aliases for sr-cyrl and sr-latn next to sr-ec and sr-el

https://gerrit.wikimedia.org/r/790357

Change 467770 merged by jenkins-bot:

[mediawiki/core@master] Step 1 of renaming sr-ec and sr-el to sr-cyrl and sr-latn

https://gerrit.wikimedia.org/r/467770

Change 522354 restored by Fomafix:

[mediawiki/extensions/MobileFrontend@master] Rename language codes sr-ec and sr-el to sr-cyrl and sr-latn

https://gerrit.wikimedia.org/r/522354

Change 287141 restored by Fomafix:

[mediawiki/extensions/Wikibase@master] Rename language codes sr-ec and sr-el to sr-cyrl and sr-latn

https://gerrit.wikimedia.org/r/287141

Change 816224 had a related patch set uploaded (by Fomafix; author: Fomafix):

[mediawiki/extensions/Cite@master] Rename language codes sr-ec and sr-el to sr-cyrl and sr-latn

https://gerrit.wikimedia.org/r/816224

Change 824729 had a related patch set uploaded (by Fomafix; author: Fomafix):

[mediawiki/core@master] Add 'sr-cyrl' and 'sr-latn' to Names.php

https://gerrit.wikimedia.org/r/824729

Change 521053 restored by Fomafix:

[mediawiki/extensions/Wikibase@master] Restore tests with language codes 'sr-cyrl' and 'sr-latn'

https://gerrit.wikimedia.org/r/521053

Change 827520 had a related patch set uploaded (by Fomafix; author: Fomafix):

[mediawiki/extensions/Translate@master] Rename language codes sr-ec and sr-el to sr-cyrl and sr-latn

https://gerrit.wikimedia.org/r/827520

Change 828511 had a related patch set uploaded (by Fomafix; author: Fomafix):

[mediawiki/core@master] Remove support for messages with the language code sr-ec and sr-el

https://gerrit.wikimedia.org/r/828511

Change 829259 had a related patch set uploaded (by Fomafix; author: Fomafix):

[mediawiki/core@master] Rename sr-ec.json to sr-cyrl.json and sr-el.json to sr-latn.json

https://gerrit.wikimedia.org/r/829259

Change 830658 had a related patch set uploaded (by Fomafix; author: Fomafix):

[mediawiki/core@master] Support sr-cyrl and sr-latn in language converter

https://gerrit.wikimedia.org/r/830658

+1 for making the language codes conform to the BCP 47 syntax for language, script, and region subtags: script subtags are 4 letters, conventionally written in title case; region subtags are 2 letters, conventionally written in capitals, or 3 digits.

Examples for Chinese (Mandarin, Min Nan, and Cantonese (Yue)) in Traditional and Romanized scripts:
zh-nan-Hant-TW (Chinese spoken in Min Nan, written in Traditional script in Taiwan)
zh-Hant-HK (Chinese written in Traditional script in Hong Kong, assuming Mandarin)
yue-HK (Chinese spoken in Cantonese in Hong Kong)
cmn-Latn (Mandarin Chinese written in Romanized form, not necessarily Pinyin; e.g. a Romanization of the Qianlong Emperor's name: "Tchien Lung Whang Tee", from a 1797 publication)

validator: https://schneegans.de/lv/

Thank you.

zh-nan-Hant-TW (Chinese spoken in Min Nan, written in Traditional script in Taiwan)

Based on the IETF BCP 47 language tag specification, it's recommended to use nan-Hant-TW instead of zh-nan-Hant-TW (nan-* instead of zh-nan-*; the same applies to yue-Hant-HK vs. zh-yue-Hant-HK).

Perfect. Thanks for the updates.

Change 521000 merged by jenkins-bot:

[mediawiki/extensions/InputBox@master] Rename language codes sr-ec and sr-el to sr-Cyrl and sr-Latn in tests

https://gerrit.wikimedia.org/r/521000

I think we were waiting for TranslateWiki to rename the codes? How is that going?

Is there a task for that? It's not on our radar currently. We haven't renamed language codes in many years, so the process is unclear. We have some docs at https://translatewiki.net/wiki/Renaming_language_codes

Change 421793 merged by jenkins-bot:

[mediawiki/extensions/cldr@master] Rename language codes sr-ec and sr-el to sr-cyrl and sr-latn

https://gerrit.wikimedia.org/r/421793

I assume removing sr-ec and sr-el here is why sr-ec and sr-el no longer have an English name - T348366

Change 323667 merged by jenkins-bot:

[mediawiki/extensions/CategoryTree@master] Rename language codes sr-ec and sr-el to sr-cyrl and sr-latn

https://gerrit.wikimedia.org/r/323667

Translation pages have been renamed on translatewiki.net.

The change should also be made on translatewiki; messages are still created with the old codes by default.

I'm afraid that there's no way to partially disable language codes by projects on translatewiki for now.

Any thoughts about this? Like disabling sr-* language codes on translatewiki until complete the JSON file renaming process (but I think that won't be a good idea, though).

@Nikerabbit
@jhsoby

I'm afraid that there's no way to partially disable language codes by projects on translatewiki for now.

Actually, there is:

$wgTranslateDisabledTargetLanguages = [
	'*' => [...], // languages disabled in all groups
	'ext-categorytree' => ['sr-el' => 'Translate to sr-latn please.', 'sr-ec' => 'Translate to sr-cyrl please.'],
	'ext-categorytree-api' =>  ['sr-el' => 'Translate to sr-latn please.', 'sr-ec' => 'Translate to sr-cyrl please.'],
	'ext-categorytree-user' => ['sr-el' => 'Translate to sr-latn please.', 'sr-ec' => 'Translate to sr-cyrl please.'],
];

By the way, there is no i18n system for these messages, but I’d probably include both English and Serbian text in these strings to make them easier to understand for Serbian translators.

Serbian version: Молимо преводите на sr-cyrl. & Molimo prevodite na sr-latn.
Also, messages translated through Special:Translate should be generated with the new codes automatically.
I hope I helped.
Regards

Oh, good to know.

How would it work?

  • ['*'] + ['ext-categorytree'] for 'ext-categorytree' => how to enable sr-cyrl and sr-latn ?
  • ['ext-categorytree'] for 'ext-categorytree' => duplicate other disabled language codes?

By the way, there is no i18n system for these messages, but I’d probably include both English and Serbian text in these strings to make them easier to understand for Serbian translators.

Yeah, pretty sad.

How would it work?

  • ['*'] + ['ext-categorytree'] for 'ext-categorytree' => how to enable sr-cyrl and sr-latn ?

Oh, I didn’t realize these cannot be translated to. It turns out that although they’re not disabled (disabled languages are defined here: rGTWN mw-config/TranslateSettings.php:163-227 (at 11774909d725)), MediaWiki core “helpfully” translates sr-cyrl to sr-ec and sr-latn to sr-el, making them unavailable in Special:Translate because it normalizes language codes using the LanguageFactory:

$languageFactory = MediaWiki\MediaWikiServices::getInstance()->getLanguageFactory();
// Both comparisons below evaluate to true:
$languageFactory->getLanguage( 'sr-cyrl' )->getCode() === 'sr-ec';
$languageFactory->getLanguage( 'sr-latn' )->getCode() === 'sr-el';

Editing a single message, on the other hand, doesn’t do such normalization, and thus seems to work (I successfully saved a null edit; I didn’t want to experiment with real changes). Truly disabled languages, like als, give a permission error when trying to edit a single message.

  • ['ext-categorytree'] for 'ext-categorytree' => duplicate other disabled language codes?

There is no need to duplicate other language codes. Languages can be disabled at three levels:

  • Globally, on all of translatewiki.net
  • For a given root message group (the root is the part before the first - in the message group ID)
  • For a given single message group

The different levels are merged automatically. So while ext-categorytree, ext-categorytree-api and ext-categorytree-user all need to be specified because ext-categorytree doesn’t count as a root message group, languages disabled at a global or root ext level don’t need to be repeated.
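
As a rough illustration of that merging, the following PHP sketch combines the three levels for one message group. It is not the Translate extension's actual code: the function name sketchDisabledLanguages is hypothetical, the reason text for als is a made-up example, and the lookup order (global, then root, then single group) is an assumption based on the description above.

```php
<?php
// Hypothetical illustration of the three-level merge; not the Translate extension's code.
function sketchDisabledLanguages( array $config, string $groupId ): array {
	// The root is the part before the first "-", e.g. "ext" for "ext-categorytree".
	$root = explode( '-', $groupId, 2 )[0];
	$disabled = [];
	foreach ( [ '*', $root, $groupId ] as $key ) {
		// Later levels add to (or override) the entries of earlier levels.
		$disabled = array_merge( $disabled, $config[$key] ?? [] );
	}
	return $disabled;
}

$config = [
	// Example entry only; the reason text is invented for this sketch.
	'*' => [ 'als' => 'Example reason for a globally disabled language.' ],
	'ext-categorytree' => [
		'sr-ec' => 'Translate to sr-cyrl please.',
		'sr-el' => 'Translate to sr-latn please.',
	],
];

print_r( array_keys( sketchDisabledLanguages( $config, 'ext-categorytree' ) ) );
```

In this sketch, ext-categorytree ends up with als, sr-ec and sr-el disabled, while a group that has no entry of its own inherits only the global list, so nothing needs to be repeated per group.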

Serbian version: Молимо преводите на sr-cyrl. & Molimo prevodite na sr-latn.

Thanks!

We got complaints from translatewiki.net: https://translatewiki.net/wiki/Support#c-Milicevic01-20231212102100-MediaWiki:Cldr-desc/sr-cyrl

Why was https://gerrit.wikimedia.org/r/c/mediawiki/extensions/cldr/+/421793 merged? It seems premature to me.

See https://codesearch.wmcloud.org/search/?q=%5C%2Bes-419&files=&excludeFiles=&repos= for the preferred way of configuring languages per group. But note that sr-cyrl is not enabled on translatewiki.net so nobody can translate using that code.

When will this be fully implemented and available on Wikipedia, for instance? Will URLs also be affected? I hope so. I'm asking because of another project that uses Wikidata as its source, which, I believe, still uses sr-ec and sr-el.

Change #522354 abandoned by Jdlrobson:

[mediawiki/extensions/MobileFrontend@master] Rename language codes sr-ec and sr-el to sr-cyrl and sr-latn

Reason:

Hello this is an automated message.
I am abandoning this patch as it is over a year old, and is not currently in a mergeable state. This has nothing to do with the quality of the patch.

If you still care about this patch, please feel free to restore it and rebase it, and we can happily continue the conversation to help you get it merged.

https://gerrit.wikimedia.org/r/522354

Change #522354 restored by Fomafix:

[mediawiki/extensions/MobileFrontend@master] Rename language codes sr-ec and sr-el to sr-cyrl and sr-latn

https://gerrit.wikimedia.org/r/522354

Change #481318 merged by jenkins-bot:

[mediawiki/extensions/WikimediaMessages@master] Step 1 of renaming sr-ec and sr-el to sr-cyrl and sr-latn

https://gerrit.wikimedia.org/r/481318

Change #522354 merged by jenkins-bot:

[mediawiki/extensions/MobileFrontend@master] Rename language codes sr-ec and sr-el to sr-cyrl and sr-latn

https://gerrit.wikimedia.org/r/522354

Change #323683 merged by jenkins-bot:

[mediawiki/extensions/ParserFunctions@master] Rename language codes sr-ec and sr-el to sr-cyrl and sr-latn

https://gerrit.wikimedia.org/r/323683

Change #1051680 had a related patch set uploaded (by Winston Sung; author: Winston Sung):

[mediawiki/core@master] Update jquery.i18n from 1.0.7 to 1.0.10

https://gerrit.wikimedia.org/r/1051680

Change #1051680 merged by jenkins-bot:

[mediawiki/core@master] Update jquery.i18n from 1.0.7 to 1.0.10

https://gerrit.wikimedia.org/r/1051680

Change #816224 merged by jenkins-bot:

[mediawiki/extensions/Cite@master] Rename language codes sr-ec and sr-el to sr-cyrl and sr-latn

https://gerrit.wikimedia.org/r/816224

Change #1239301 had a related patch set uploaded (by Fomafix; author: Fomafix):

[mediawiki/extensions/DiscussionTools@master] Rename language codes 'sr-ec' and 'sr-el' to 'sr-cyrl' and 'sr-latn'

https://gerrit.wikimedia.org/r/1239301

Change #1239301 merged by jenkins-bot:

[mediawiki/extensions/DiscussionTools@master] Rename language codes 'sr-ec' and 'sr-el' to 'sr-cyrl' and 'sr-latn'

https://gerrit.wikimedia.org/r/1239301

Change #522186 abandoned by Nikerabbit:

[mediawiki/extensions/UniversalLanguageSelector@master] Rename language codes sr-ec and sr-el to sr-cyrl and sr-latn

Reason:

We have a process for renaming language codes. Better to use that one.

https://gerrit.wikimedia.org/r/522186

Change #827520 abandoned by Nikerabbit:

[mediawiki/extensions/Translate@master] Rename language codes sr-ec and sr-el to sr-cyrl and sr-latn

https://gerrit.wikimedia.org/r/827520

Based on the recent changes, does this mean the task won't be implemented, or will it be covered as part of another issue?

I think and hope that the codes are going to be renamed, in the present task, just via other means.

Change #415161 abandoned by Hashar:

[mediawiki/core@master] Add language codes for Serbian with Ekavian pronunciation

https://gerrit.wikimedia.org/r/415161

Change #824729 abandoned by Hashar:

[mediawiki/core@master] Add 'sr-cyrl' and 'sr-latn' to additional places

https://gerrit.wikimedia.org/r/824729

Change #828511 abandoned by Hashar:

[mediawiki/core@master] Remove support for messages with the language codes sr-ec and sr-el

https://gerrit.wikimedia.org/r/828511

Change #829259 abandoned by Hashar:

[mediawiki/core@master] Rename sr-ec.json to sr-cyrl.json and sr-el.json to sr-latn.json

https://gerrit.wikimedia.org/r/829259

Change #830658 abandoned by Hashar:

[mediawiki/core@master] Support sr-cyrl and sr-latn in language converter

https://gerrit.wikimedia.org/r/830658

Change #251312 abandoned by Hashar:

[mediawiki/core@master] Rename language codes sr-ec and sr-el to sr-cyrl and sr-latn

https://gerrit.wikimedia.org/r/251312

Change #828511 restored by Thcipriani:

[mediawiki/core@master] Remove support for messages with the language codes sr-ec and sr-el

https://gerrit.wikimedia.org/r/828511

Change #251312 restored by Thcipriani:

[mediawiki/core@master] Rename language codes sr-ec and sr-el to sr-cyrl and sr-latn

https://gerrit.wikimedia.org/r/251312

Change #824729 restored by Thcipriani:

[mediawiki/core@master] Add 'sr-cyrl' and 'sr-latn' to additional places

https://gerrit.wikimedia.org/r/824729

Change #830658 restored by Thcipriani:

[mediawiki/core@master] Support sr-cyrl and sr-latn in language converter

https://gerrit.wikimedia.org/r/830658

Change #829259 restored by Thcipriani:

[mediawiki/core@master] Rename sr-ec.json to sr-cyrl.json and sr-el.json to sr-latn.json

https://gerrit.wikimedia.org/r/829259

Change #415161 restored by Thcipriani:

[mediawiki/core@master] Add language codes for Serbian with Ekavian pronunciation

https://gerrit.wikimedia.org/r/415161