Move content of $wgExtraLanguageNames on Wikidata to default Terms languages
Closed, Resolved · Public

Description

This means that the current "custom" extra languages we have on Wikidata and Commons will be available on every Wikibase as a Term language.

This is convenient for us because it means we should be able to render Terms (e.g. in Lua) in languages that are not known by the rendering wiki.

This could also be useful in a federated properties context: if both wikis are on the same version of Wikibase, then we know that these extra languages will be available on both.

Event Timeline

Tarrow created this task. Aug 11 2020, 11:06 AM
Restricted Application added a subscriber: Aklapper. Aug 11 2020, 11:06 AM

Change 619456 had a related patch set uploaded (by Tarrow; owner: Tarrow):
[mediawiki/extensions/Wikibase@master] Add Wikidata's $wgExtraLanguageNames to Terms WikibaseContentLanguages

https://gerrit.wikimedia.org/r/619456

Change 619768 had a related patch set uploaded (by Tarrow; owner: Tarrow):
[operations/mediawiki-config@master] Remove wgExtraLanguageNames from beta wikidata

https://gerrit.wikimedia.org/r/619768

Change 619768 merged by jenkins-bot:
[operations/mediawiki-config@master] Remove wgExtraLanguageNames from beta wikidata

https://gerrit.wikimedia.org/r/619768

@Samantha_Alipio_WMDE had a chat with @guergana.tzatchkova and me about this, and from a 3rd party Wikibase perspective we're OK with generally making languages less configurable for the user and instead having a curated list within, since it eases federation between Wikibases and importing data from one to the other.

Tarrow updated the task description. Aug 13 2020, 9:53 AM

Change 619456 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Add Wikidata's $wgExtraLanguageNames to Terms WikibaseContentLanguages

https://gerrit.wikimedia.org/r/619456

guergana.tzatchkova added a comment. Edited · Aug 13 2020, 2:52 PM

Doing this move results in a slight change in behaviour for users trying to view the page in the extra language. Since the language is not known by MediaWiki, the interface falls back to English, which in the case of fkv is 'Kvensk' rather than the autonym 'kvääni'.

test (status quo):

https://wikidata.beta.wmflabs.org/wiki/Q563207?uselang=fkv
beta (future state):
https://test.wikidata.org/wiki/Q212690?uselang=fkv

We spoke to @Lydia_Pintscher and she said this is probably acceptable since we don't support fkv as an interface language but just the Terms.

Change 620050 had a related patch set uploaded (by Guergana Tzatchkova; owner: Guergana Tzatchkova):
[operations/mediawiki-config@master] Remove $wgExtraLanguageNames from Wikidata and Commons

https://gerrit.wikimedia.org/r/620050

This will be unstalled once 1.36.0-wmf.5 is on Wikidata and Commons. Then the config patch can be merged.

Michael added a subscriber: Michael.

As far as I can tell, this should now be good to go.

Summary of this task for people who write patches to add new language codes somewhere on Wikidata (Language codes):

  • Language codes for terms (labels, descriptions, aliases) should now be added in WikibaseContentLanguages.php, getDefaultTermsLanguages() method. (They used to be added in InitialiseSettings.php, $wmgExtraLanguageNames variable.)
  • Language codes for monolingual text values should be added in WikibaseContentLanguages.php, getDefaultMonolingualTextLanguages() method. (No change.)
  • Language codes for lexicographical terms (lexeme lemmas or form representations) should be added in WikibaseLexeme.mediawiki-services.php, $additionalLanguages variable. (No change.)

@Mbch331 Please see Lucas' comment for a change relevant to you.

Change 620050 merged by jenkins-bot:
[operations/mediawiki-config@master] Remove $wgExtraLanguageNames from Wikidata and Commons

https://gerrit.wikimedia.org/r/620050

Mentioned in SAL (#wikimedia-operations) [2020-09-23T09:55:44Z] <lucaswerkmeister-wmde@deploy1001> Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:620050|Remove $wgExtraLanguageNames from Wikidata and Commons (T260118)]], part 1/2 (duration: 01m 16s)

Mentioned in SAL (#wikimedia-operations) [2020-09-23T09:57:04Z] <lucaswerkmeister-wmde@deploy1001> Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:620050|Remove $wgExtraLanguageNames from Wikidata and Commons (T260118)]], part 2/2 (production no-op) (duration: 01m 04s)

@Mbch331 Please see Lucas' comment for a change relevant to you.

Thx for the heads up. So far I've done nothing with Language codes for terms, and the other 2 don't change. So nothing really changing for me.

Zache added a subscriber: Zache. Wed, Sep 30, 6:48 AM
Zache added a comment. Edited · Wed, Sep 30, 9:00 AM

On Wikimedia Commons, the language selector for the old wgExtraLanguageNames languages is now broken: those languages can't be selected from the user interface. If I understand the code correctly, the languages are currently defined in wbTermsLanguages.json for CaptionsPanel.js.

Because the languages are missing there, they aren't included in the Commons caption language selector. The same issue affects the SDC statements monolingual text language selector.

Example code for testing the content of wbTermsLanguages on Commons:

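$( function () {
	// Load the MediaInfo statements module, then read its bundled language map.
	mw.loader.using( 'wikibase.mediainfo.statements', function ( require ) {
		// Declared with var so the snippet does not leak a global variable.
		var wbTermsLanguages = require( 'wikibase.mediainfo.statements' ).config.wbTermsLanguages;
		alert( JSON.stringify( wbTermsLanguages ) );
	} );
} );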

And the languages currently in the Wikimedia Commons wbTermsLanguages variable:

{
	"aa": "Qafár af",
	"ab": "Аҧсшәа",
	"abs": "bahasa ambon",
	"ace": "Acèh",
	"ady": "адыгабзэ",
	"ady-cyrl": "адыгабзэ",
	"aeb": "تونسي/Tûnsî",
	"aeb-arab": "تونسي",
	"aeb-latn": "Tûnsî",
	"af": "Afrikaans",
	"ak": "Akan",
	"aln": "Gegë",
	"als": "Alemannisch",
	"alt": "тÿштÿк алтай тил",
	"am": "አማርኛ",
	"ami": "Pangcah",
	"an": "aragonés",
	"ang": "Ænglisc",
	"anp": "अङ्गिका",
	"ar": "العربية",
	"arc": "ܐܪܡܝܐ",
	"arn": "mapudungun",
	"arq": "جازايرية",
	"ary": "الدارجة",
	"arz": "مصرى",
	"as": "অসমীয়া",
	"ase": "American sign language",
	"ast": "asturianu",
	"atj": "Atikamekw",
	"av": "авар",
	"avk": "Kotava",
	"awa": "अवधी",
	"ay": "Aymar aru",
	"az": "azərbaycanca",
	"azb": "تۆرکجه",
	"ba": "башҡортса",
	"ban": "Basa Bali",
	"ban-bali": "ᬩᬲᬩᬮᬶ",
	"bar": "Boarisch",
	"bat-smg": "žemaitėška",
	"bbc": "Batak Toba",
	"bbc-latn": "Batak Toba",
	"bcc": "جهلسری بلوچی",
	"bcl": "Bikol Central",
	"be": "беларуская",
	"be-tarask": "беларуская (тарашкевіца)‎",
	"be-x-old": "беларуская (тарашкевіца)‎",
	"bg": "български",
	"bgn": "روچ کپتین بلوچی",
	"bh": "भोजपुरी",
	"bho": "भोजपुरी",
	"bi": "Bislama",
	"bjn": "Banjar",
	"bm": "bamanankan",
	"bn": "বাংলা",
	"bo": "བོད་ཡིག",
	"bpy": "বিষ্ণুপ্রিয়া মণিপুরী",
	"bqi": "بختیاری",
	"br": "brezhoneg",
	"brh": "Bráhuí",
	"bs": "bosanski",
	"btm": "Batak Mandailing",
	"bto": "Iriga Bicolano",
	"bug": "ᨅᨔ ᨕᨘᨁᨗ",
	"bxr": "буряад",
	"ca": "català",
	"cbk-zam": "Chavacano de Zamboanga",
	"cdo": "Mìng-dĕ̤ng-ngṳ̄",
	"ce": "нохчийн",
	"ceb": "Cebuano",
	"ch": "Chamoru",
	"cho": "Choctaw",
	"chr": "ᏣᎳᎩ",
	"chy": "Tsetsêhestâhese",
	"ckb": "کوردی",
	"co": "corsu",
	"cps": "Capiceño",
	"cr": "Nēhiyawēwin / ᓀᐦᐃᔭᐍᐏᐣ",
	"crh": "qırımtatarca",
	"crh-cyrl": "къырымтатарджа (Кирилл)‎",
	"crh-latn": "qırımtatarca (Latin)‎",
	"cs": "čeština",
	"csb": "kaszëbsczi",
	"cu": "словѣньскъ / ⰔⰎⰑⰂⰡⰐⰠⰔⰍⰟ",
	"cv": "Чӑвашла",
	"cy": "Cymraeg",
	"da": "dansk",
	"de": "Deutsch",
	"de-at": "Österreichisches Deutsch",
	"de-ch": "Schweizer Hochdeutsch",
	"de-formal": "Deutsch (Sie-Form)‎",
	"din": "Thuɔŋjäŋ",
	"diq": "Zazaki",
	"dsb": "dolnoserbski",
	"dtp": "Dusun Bundu-liwan",
	"dty": "डोटेली",
	"dv": "ދިވެހިބަސް",
	"dz": "ཇོང་ཁ",
	"ee": "eʋegbe",
	"egl": "Emiliàn",
	"el": "Ελληνικά",
	"eml": "emiliàn e rumagnòl",
	"en": "English",
	"en-ca": "Canadian English",
	"en-gb": "British English",
	"eo": "Esperanto",
	"es": "español",
	"es-419": "español de América Latina",
	"es-formal": "español (formal)‎",
	"et": "eesti",
	"eu": "euskara",
	"ext": "estremeñu",
	"fa": "فارسی",
	"ff": "Fulfulde",
	"fi": "suomi",
	"fit": "meänkieli",
	"fiu-vro": "Võro",
	"fj": "Na Vosa Vakaviti",
	"fo": "føroyskt",
	"fr": "français",
	"frc": "français cadien",
	"frp": "arpetan",
	"frr": "Nordfriisk",
	"fur": "furlan",
	"fy": "Frysk",
	"ga": "Gaeilge",
	"gag": "Gagauz",
	"gan": "贛語",
	"gan-hans": "赣语(简体)‎",
	"gan-hant": "贛語(繁體)‎",
	"gcr": "kriyòl gwiyannen",
	"gd": "Gàidhlig",
	"gl": "galego",
	"glk": "گیلکی",
	"gn": "Avañe'ẽ",
	"gom": "गोंयची कोंकणी / Gõychi Konknni",
	"gom-deva": "गोंयची कोंकणी",
	"gom-latn": "Gõychi Konknni",
	"gor": "Bahasa Hulontalo",
	"got": "𐌲𐌿𐍄𐌹𐍃𐌺",
	"grc": "Ἀρχαία ἑλληνικὴ",
	"gsw": "Alemannisch",
	"gu": "ગુજરાતી",
	"gv": "Gaelg",
	"ha": "Hausa",
	"hak": "客家語/Hak-kâ-ngî",
	"haw": "Hawaiʻi",
	"he": "עברית",
	"hi": "हिन्दी",
	"hif": "Fiji Hindi",
	"hif-latn": "Fiji Hindi",
	"hil": "Ilonggo",
	"ho": "Hiri Motu",
	"hr": "hrvatski",
	"hrx": "Hunsrik",
	"hsb": "hornjoserbsce",
	"ht": "Kreyòl ayisyen",
	"hu": "magyar",
	"hu-formal": "magyar (formal)‎",
	"hy": "հայերեն",
	"hyw": "Արեւմտահայերէն",
	"hz": "Otsiherero",
	"ia": "interlingua",
	"id": "Bahasa Indonesia",
	"ie": "Interlingue",
	"ig": "Igbo",
	"ii": "ꆇꉙ",
	"ik": "Iñupiak",
	"ike-cans": "ᐃᓄᒃᑎᑐᑦ",
	"ike-latn": "inuktitut",
	"ilo": "Ilokano",
	"inh": "ГӀалгӀай",
	"io": "Ido",
	"is": "íslenska",
	"it": "italiano",
	"iu": "ᐃᓄᒃᑎᑐᑦ/inuktitut",
	"ja": "日本語",
	"jam": "Patois",
	"jbo": "la .lojban.",
	"jut": "jysk",
	"jv": "Jawa",
	"ka": "ქართული",
	"kaa": "Qaraqalpaqsha",
	"kab": "Taqbaylit",
	"kbd": "Адыгэбзэ",
	"kbd-cyrl": "Адыгэбзэ",
	"kbp": "Kabɩyɛ",
	"kg": "Kongo",
	"khw": "کھوار",
	"ki": "Gĩkũyũ",
	"kiu": "Kırmancki",
	"kj": "Kwanyama",
	"kjp": "ဖၠုံလိက်",
	"kk": "қазақша",
	"kk-arab": "قازاقشا (تٴوتە)‏",
	"kk-cn": "قازاقشا (جۇنگو)‏",
	"kk-cyrl": "қазақша (кирил)‎",
	"kk-kz": "қазақша (Қазақстан)‎",
	"kk-latn": "qazaqşa (latın)‎",
	"kk-tr": "qazaqşa (Türkïya)‎",
	"kl": "kalaallisut",
	"km": "ភាសាខ្មែរ",
	"kn": "ಕನ್ನಡ",
	"ko": "한국어",
	"ko-kp": "조선말",
	"koi": "Перем Коми",
	"kr": "Kanuri",
	"krc": "къарачай-малкъар",
	"kri": "Krio",
	"krj": "Kinaray-a",
	"krl": "karjal",
	"ks": "कॉशुर / کٲشُر",
	"ks-arab": "کٲشُر",
	"ks-deva": "कॉशुर",
	"ksh": "Ripoarisch",
	"ku": "kurdî",
	"ku-arab": "كوردي (عەرەبی)‏",
	"ku-latn": "kurdî (latînî)‎",
	"kum": "къумукъ",
	"kv": "коми",
	"kw": "kernowek",
	"ky": "Кыргызча",
	"la": "Latina",
	"lad": "Ladino",
	"lb": "Lëtzebuergesch",
	"lbe": "лакку",
	"lez": "лезги",
	"lfn": "Lingua Franca Nova",
	"lg": "Luganda",
	"li": "Limburgs",
	"lij": "Ligure",
	"liv": "Līvõ kēļ",
	"lki": "لەکی",
	"lld": "Ladin",
	"lmo": "lumbaart",
	"ln": "lingála",
	"lo": "ລາວ",
	"loz": "Silozi",
	"lrc": "لۊری شومالی",
	"lt": "lietuvių",
	"ltg": "latgaļu",
	"lus": "Mizo ţawng",
	"luz": "لئری دوٙمینی",
	"lv": "latviešu",
	"lzh": "文言",
	"lzz": "Lazuri",
	"mai": "मैथिली",
	"map-bms": "Basa Banyumasan",
	"mdf": "мокшень",
	"mg": "Malagasy",
	"mh": "Ebon",
	"mhr": "олык марий",
	"mi": "Māori",
	"min": "Minangkabau",
	"mk": "македонски",
	"ml": "മലയാളം",
	"mn": "монгол",
	"mni": "ꯃꯤꯇꯩ ꯂꯣꯟ",
	"mnw": "ဘာသာ မန်",
	"mo": "молдовеняскэ",
	"mr": "मराठी",
	"mrh": "Mara",
	"mrj": "кырык мары",
	"ms": "Bahasa Melayu",
	"mt": "Malti",
	"mus": "Mvskoke",
	"mwl": "Mirandés",
	"my": "မြန်မာဘာသာ",
	"myv": "эрзянь",
	"mzn": "مازِرونی",
	"na": "Dorerin Naoero",
	"nah": "Nāhuatl",
	"nan": "Bân-lâm-gú",
	"nap": "Napulitano",
	"nb": "norsk bokmål",
	"nds": "Plattdüütsch",
	"nds-nl": "Nedersaksies",
	"ne": "नेपाली",
	"new": "नेपाल भाषा",
	"ng": "Oshiwambo",
	"niu": "Niuē",
	"nl": "Nederlands",
	"nl-informal": "Nederlands (informeel)‎",
	"nn": "norsk nynorsk",
	"no": "norsk",
	"nov": "Novial",
	"nqo": "ߒߞߏ",
	"nrm": "Nouormand",
	"nso": "Sesotho sa Leboa",
	"nv": "Diné bizaad",
	"ny": "Chi-Chewa",
	"nys": "Nyunga",
	"oc": "occitan",
	"olo": "Livvinkarjala",
	"om": "Oromoo",
	"or": "ଓଡ଼ିଆ",
	"os": "Ирон",
	"pa": "ਪੰਜਾਬੀ",
	"pag": "Pangasinan",
	"pam": "Kapampangan",
	"pap": "Papiamentu",
	"pcd": "Picard",
	"pdc": "Deitsch",
	"pdt": "Plautdietsch",
	"pfl": "Pälzisch",
	"pi": "पालि",
	"pih": "Norfuk / Pitkern",
	"pl": "polski",
	"pms": "Piemontèis",
	"pnb": "پنجابی",
	"pnt": "Ποντιακά",
	"prg": "Prūsiskan",
	"ps": "پښتو",
	"pt": "português",
	"pt-br": "português do Brasil",
	"qu": "Runa Simi",
	"qug": "Runa shimi",
	"rgn": "Rumagnôl",
	"rif": "Tarifit",
	"rm": "rumantsch",
	"rmy": "romani čhib",
	"rn": "Kirundi",
	"ro": "română",
	"roa-rup": "armãneashti",
	"roa-tara": "tarandíne",
	"ru": "русский",
	"rue": "русиньскый",
	"rup": "armãneashti",
	"ruq": "Vlăheşte",
	"ruq-cyrl": "Влахесте",
	"ruq-latn": "Vlăheşte",
	"rw": "Kinyarwanda",
	"sa": "संस्कृतम्",
	"sah": "саха тыла",
	"sat": "ᱥᱟᱱᱛᱟᱲᱤ",
	"sc": "sardu",
	"scn": "sicilianu",
	"sco": "Scots",
	"sd": "سنڌي",
	"sdc": "Sassaresu",
	"sdh": "کوردی خوارگ",
	"se": "davvisámegiella",
	"sei": "Cmique Itom",
	"ses": "Koyraboro Senni",
	"sg": "Sängö",
	"sgs": "žemaitėška",
	"sh": "srpskohrvatski / српскохрватски",
	"shi": "Tašlḥiyt/ⵜⴰⵛⵍⵃⵉⵜ",
	"shi-latn": "Tašlḥiyt",
	"shi-tfng": "ⵜⴰⵛⵍⵃⵉⵜ",
	"shn": "ၽႃႇသႃႇတႆး ",
	"shy-latn": "tacawit",
	"si": "සිංහල",
	"simple": "Simple English",
	"sk": "slovenčina",
	"skr": "سرائیکی",
	"skr-arab": "سرائیکی",
	"sl": "slovenščina",
	"sli": "Schläsch",
	"sm": "Gagana Samoa",
	"sma": "åarjelsaemien",
	"smn": "anarâškielâ",
	"sn": "chiShona",
	"so": "Soomaaliga",
	"sq": "shqip",
	"sr": "српски / srpski",
	"sr-ec": "српски (ћирилица)‎",
	"sr-el": "srpski (latinica)‎",
	"srn": "Sranantongo",
	"ss": "SiSwati",
	"st": "Sesotho",
	"stq": "Seeltersk",
	"sty": "себертатар",
	"su": "Sunda",
	"sv": "svenska",
	"sw": "Kiswahili",
	"szl": "ślůnski",
	"szy": "Sakizaya",
	"ta": "தமிழ்",
	"tay": "Tayal",
	"tcy": "ತುಳು",
	"te": "తెలుగు",
	"tet": "tetun",
	"tg": "тоҷикӣ",
	"tg-cyrl": "тоҷикӣ",
	"tg-latn": "tojikī",
	"th": "ไทย",
	"ti": "ትግርኛ",
	"tk": "Türkmençe",
	"tl": "Tagalog",
	"tly": "толышә зывон",
	"tn": "Setswana",
	"to": "lea faka-Tonga",
	"tpi": "Tok Pisin",
	"tr": "Türkçe",
	"tru": "Ṫuroyo",
	"trv": "Seediq",
	"ts": "Xitsonga",
	"tt": "татарча/tatarça",
	"tt-cyrl": "татарча",
	"tt-latn": "tatarça",
	"tum": "chiTumbuka",
	"tw": "Twi",
	"ty": "reo tahiti",
	"tyv": "тыва дыл",
	"tzm": "ⵜⴰⵎⴰⵣⵉⵖⵜ",
	"udm": "удмурт",
	"ug": "ئۇيغۇرچە / Uyghurche",
	"ug-arab": "ئۇيغۇرچە",
	"ug-latn": "Uyghurche",
	"uk": "українська",
	"ur": "اردو",
	"uz": "oʻzbekcha/ўзбекча",
	"uz-cyrl": "ўзбекча",
	"uz-latn": "oʻzbekcha",
	"ve": "Tshivenda",
	"vec": "vèneto",
	"vep": "vepsän kel’",
	"vi": "Tiếng Việt",
	"vls": "West-Vlams",
	"vmf": "Mainfränkisch",
	"vo": "Volapük",
	"vot": "Vaďďa",
	"vro": "Võro",
	"wa": "walon",
	"war": "Winaray",
	"wo": "Wolof",
	"wuu": "吴语",
	"xal": "хальмг",
	"xh": "isiXhosa",
	"xmf": "მარგალური",
	"xsy": "saisiyat",
	"yi": "ייִדיש",
	"yo": "Yorùbá",
	"yue": "粵語",
	"za": "Vahcuengh",
	"zea": "Zeêuws",
	"zgh": "ⵜⴰⵎⴰⵣⵉⵖⵜ ⵜⴰⵏⴰⵡⴰⵢⵜ",
	"zh": "中文",
	"zh-classical": "文言",
	"zh-cn": "中文(中国大陆)‎",
	"zh-hans": "中文(简体)‎",
	"zh-hant": "中文(繁體)‎",
	"zh-hk": "中文(香港)‎",
	"zh-min-nan": "Bân-lâm-gú",
	"zh-mo": "中文(澳門)‎",
	"zh-my": "中文(马来西亚)‎",
	"zh-sg": "中文(新加坡)‎",
	"zh-tw": "中文(台灣)‎",
	"zh-yue": "粵語",
	"zu": "isiZulu"
}
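
The check described above can be sketched as a small helper. This is only an illustration, assuming a plain code-to-name object like the one listed; the function name findMissingLanguageCodes and the four-entry excerpt are hypothetical and not part of MediaInfo or Wikibase.

```javascript
// Sketch: report which expected language codes are absent from a
// code -> name map such as the wbTermsLanguages object listed above.
// findMissingLanguageCodes is a hypothetical helper, not a MediaInfo API.
function findMissingLanguageCodes( languageMap, expectedCodes ) {
	return expectedCodes.filter( function ( code ) {
		// Only own keys count; inherited properties are ignored.
		return !Object.prototype.hasOwnProperty.call( languageMap, code );
	} );
}

// Small excerpt of the map listed above (not the full list).
var sampleTermsLanguages = {
	fi: 'suomi',
	fit: 'meänkieli',
	'fiu-vro': 'Võro',
	fj: 'Na Vosa Vakaviti'
};

// 'fkv' is the extra language discussed earlier in this task.
var missing = findMissingLanguageCodes( sampleTermsLanguages, [ 'fi', 'fkv' ] );
console.log( missing );
```

On Commons, the same helper could be given the real wbTermsLanguages object obtained by the snippet above, together with whatever list of codes one expects to find.
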
jhsoby added a subscriber: jhsoby. Wed, Sep 30, 10:33 AM

Change 631165 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/Wikibase@master] Make sure monolingual text languages extend term languages

https://gerrit.wikimedia.org/r/631165

(Moved to dedicated task T264294: Cannot add monolingual text in Ottoman Turkish (ota) and some other languages.)

WMDE-leszek added a subscriber: WMDE-leszek. Edited · Wed, Oct 7, 3:46 PM

While this seems fairly confusing (I mean the unexpected consequences that have led to T264295 and T264294), it looks like the task as defined is done. Also, the bug reported in T260118#6504302 has been fixed, I assume by the fix actually made for T264295.