Many more languages need to be added to Multilingual Wikisource (mul.ws)
Open, Needs Triage, Public

Description

mul.ws (OldWikisource) now has translation administrator capabilities, and we host many languages, but we cannot define pages as being in those languages using page info. E.g. I cannot define https://wikisource.org/w/index.php?title=Main_Page/Dagbanli&action=info as being in dag. Please add these language codes, including mis, mul, und, and zxx.

Event Timeline

How to "define"? Where to add this?

jhsoby added subscribers: Amire80, Nikerabbit.

Well, changing page language is part of the Translate extension, so I think someone from Language-Team could tell us that. I'm not sure if changing it in language-data is enough – I added a few languages there six months ago, but those are still not available to choose as page languages, so there must be something more to do. Maybe @Nikerabbit or @Amire80 could shed some light on the matter?

Looked some more into it, and it seems that the list of languages you can choose corresponds directly to the list of languages in Names.php.

If there is a standard Names.php deployed to most WMF projects, it would need to be greatly expanded at mul.ws: that is the only project intended to host multilingual content (with Commons, Data, and Species presumably being language-agnostic repositories), and it also includes very small languages that will never have content elsewhere. In fact, some of the content there may not be linguistic at all, so we need the four special codes listed above, and possibly even the private use area, if that's an option.

Koavf updated the task description.
Amire80 renamed this task from "Many more languages need to be added to mul.ws" to "Many more languages need to be added to Multilingual Wikisource (mul.ws)". Aug 26 2018, 8:51 AM

See T190129: Consolidate language metadata into a 'language-data' library and use in MediaWiki.

In the meantime, these could possibly be added with local wiki configuration using $wgExtraLanguageNames.
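As a rough sketch of what that local configuration might look like, here is a LocalSettings.php-style fragment. The four codes and native names are taken from the "Not in Names.php" list in this task, purely for illustration. How this would actually be expressed in the WMF configuration repository is not covered here, and whether names added this way show up as selectable page languages is an assumption, not something confirmed in this task.

```php
<?php
// Sketch only: a LocalSettings.php-style fragment, not the actual WMF config.
// $wgExtraLanguageNames maps a language code to the name displayed for it.
// The selection of codes below is illustrative.
$wgExtraLanguageNames = [
	'cnr' => 'Crnogorski', // Montenegrin
	'lld' => 'Ladin',      // Ladin
	'non' => 'Norrœnt',    // Old Norse
	'wym' => 'Wymysöryś',  // Wymysorys
];
```

The special codes (mis, mul, und, zxx) would presumably need the same kind of entry, with whatever display names are agreed on.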

List of languages with valid language codes and at least some content in oldwikisource:

Not in Names.php:

  • aat Arvanitika – Αρbε̰ρίσ̈τε
  • ae Avestan – (no native name)
  • akk Akkadian – 𒀝𒅗𒁺𒌑
  • alq Algonquin – Anicinapemiȣin
  • arp Arapaho – Hinónoʼeitíít
  • bal Balochi – بلوچی
  • bem Bemba – Chibemba
  • cnr Montenegrin – Crnogorski
  • cop Coptic – Ϯⲁⲥⲡⲓ ̀ⲛⲣⲉⲙ̀ⲛⲬⲏⲙⲓ
  • cpx Puxian Min – Pó-sing-gṳ̂
  • egy Egyptian – (no native name in Unicode?)
  • gmh Middle High German – Diutsch
  • jje Jeju – 제주말
  • kok Konkani – (several scripts)
    • kok-deva Konkani – कोंकणी
    • kok-latn Konkani – Kōṅkaṇī
    • kok-knda Konkani – ಕೊಂಕಣಿ
  • ist Istriot – (no native name?)
  • kca Khanty – ханты ясаң
  • lld Ladin – Ladin
  • lra Laraʼ – Laraʼ
  • mas Maasai – Ɔl Maa
  • mfe Mauritian Creole – kreol morisien
  • mnc Manchu – ᠮᠠᠨᠵᡠ ᡤᡳᠰᡠᠨ
  • mnp Northern Min – Mâing-bă̤-ngṳ̌
  • non Old Norse – Norrœnt
  • nrn Norn – Norn
  • osx Old Saxon – Sahsisk
  • ota Ottoman Turkish – عثمانليجه
  • pau Palauan – Palau
  • peo Old Persian – (no native name?)
  • pox Polabian – Wenska rec
  • rml Baltic Romani – Романы
  • ruo Istro-Romanian – Rumârește
  • ryu Okinawan – うちなーぐち
  • see Seneca – Onödowága
  • sjk Kemi Sami – (no native name?)
  • slr Salar – Salırça
  • suk Sukuma – Kɪsukuma
  • syc Syriac – ܠܫܢܐ ܣܘܪܝܝܐ‎
  • tpn Tupi – Tupynã'mbá
  • txg Tangut – 𗼇𗟲
  • wym Wymysorys – Wymysöryś
  • yrk Nenets – Ненэцяʼ

Already in Names.php:

  • ab Abkhaz
  • ady Adyghe
  • af Afrikaans
  • an Aragonese
  • arn Mapudungun
  • ast Asturian
  • azb South Azerbaijani
  • ba Bashkir
  • bm Bambara
  • bo Standard Tibetan
  • cdo Min Dong Chinese
  • chr Cherokee
  • co Corsican
  • csb Kashubian
  • cu Church Slavonic
  • cv Chuvash
  • diq Zazaki
  • dsb Lower Sorbian
  • ext Extremaduran
  • fy West Frisian
  • ga Irish
  • gag Gagauz
  • gd Scottish Gaelic
  • gn Guaraní
  • got Gothic
  • grc Ancient Greek
  • hak Hakka Chinese
  • haw Hawaiian
  • hi Hindi
  • hsb Upper Sorbian
  • ht Haitian
  • ia Interlingua
  • io Ido
  • iu Inuktitut
  • jam Jamaican Creole
  • jbo Lojban
  • jv Javanese
  • ka Georgian
  • kk Kazakh
  • km Khmer
  • koi Komi-Permyak
  • krl Karelian
  • ku Kurdish
  • kw Cornish
  • ky Kyrgyz
  • lad Judeo-Spanish
  • lb Luxembourgish
  • lg Luganda
  • lij Ligurian
  • liv Livonian
  • lmo Lombard
  • ln Lingala
  • lv Latvian
  • lzh Literary Chinese
  • mdf Moksha
  • mg Malagasy
  • mh Marshallese
  • mhr Meadow Mari
  • mi Maori
  • mn Mongolian
  • mrj Hill Mari
  • ms Malay
  • mwl Mirandese
  • my Burmese
  • myv Erzya
  • nah Nahuatl
  • nap Neapolitan
  • nds Low German
  • ne Nepali
  • ng Ndonga
  • nv Navajo
  • oc Occitan
  • olo Livvi-Karelian
  • om Oromo
  • pa Punjabi
  • pcd Picard
  • pdt Plautdietsch
  • pi Pali
  • pnb Western Punjabi
  • pnt Pontic
  • qu Quechua
  • rm Romansh
  • rup Aromanian
  • ruq Megleno-Romanian
  • sc Sardinian
  • scn Sicilian
  • se Northern Sami
  • si Sinhala
  • sm Samoan
  • sn Shona
  • sq Albanian
  • stq Saterland Frisian
  • su Sundanese
  • sw Swahili
  • tet Tetun
  • tg Tajik
  • tk Turkmen
  • tl Tagalog
  • tt Tatar
  • ty Tahitian
  • tyv Tuvan
  • udm Udmurt
  • ug Uyghur
  • ur Urdu
  • uz Uzbek
  • vep Veps
  • vo Volapük
  • wa Walloon
  • xh Xhosa
  • xmf Mingrelian
  • yue Yue
  • zu Zulu

Additionally, we have Chono and Bolak, which would be under mis, and there are several multilingual sources, which are mul. This is why we need the four special codes as well as the private use area to define... Thanks for this list.

@jhsoby:

  • nah Nahuatl

I strongly object to calling it a "valid language", because per SIL, nah is a collective code for a group of Nahuatl languages.

Yes, I'm aware of that. However, for MediaWiki it is technically valid even though it shouldn't be.

This query currently finds 17. It's based on my additions at https://www.wikidata.org/wiki/Q18198097#P407 (which may be incomplete). Maybe it's worth going through them and checking which ones could be added.

This seems to be a never-ending task, so I added Tracking-Neverending. For codes that are actually needed, separate tickets (grouped or individual) would be preferable.

The problem with that list is that it includes languages without any actual content (maybe arp has been deleted since), and by the time we get to the end of it, https://wikisource.org/wiki/Category:Languages could include others. The 17 from the query might be a worthwhile sub-task.

Tacsipacsi subscribed.

Since the request is to add the languages to Special:PageLanguage, adding them doesn’t make sense unless the form actually displays them, i.e. until T320884 is resolved.