Add lexeme language codes sat-olck, sat-latn, sat-beng, sat-orya
Closed, Resolved (Public)

Description

This ticket is to add language codes for the representations of Santali lexemes and forms, using lowercased ISO 15924 script codes for the Ol Chiki, Latin, Bengali and Oriya scripts, the four scripts used to write the Santali language.
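
As a rough illustration of the naming scheme described above, the requested codes can be derived by appending the lowercased ISO 15924 script code to the language code. This is only a sketch: the `SCRIPTS` table and the `lexeme_codes` helper are hypothetical names used for illustration and are not part of WikibaseLexeme.

```python
# ISO 15924 script codes for the four scripts named in the description.
# The table and helper below are illustrative, not taken from WikibaseLexeme.
SCRIPTS = {
    "Olck": "Ol Chiki",
    "Latn": "Latin",
    "Beng": "Bengali",
    "Orya": "Oriya",
}


def lexeme_codes(language, scripts):
    """Build '<language>-<script>' codes with the script code lowercased."""
    return [f"{language}-{script.lower()}" for script in scripts]


print(lexeme_codes("sat", SCRIPTS))
# → ['sat-olck', 'sat-latn', 'sat-beng', 'sat-orya']
```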

Details

	Subject	Repo	Branch	Lines +/-
	Add lexeme language codes sat-latn, sat-beng, sat-orya (Santali in Latin, Bengali and Oriya scripts)	mediawiki/extensions/WikibaseLexeme	master	+9 -0

Event Timeline

Bodhisattwa created this task. Sep 15 2020, 6:36 PM

Restricted Application added a project: Wikidata. Sep 15 2020, 6:36 PM

Bodhisattwa moved this task from Backlog to Wikidata (lexemes) on the Language codes board. Sep 15 2020, 6:37 PM

According to https://en.wikipedia.org/wiki/Santali_language, Ol Chiki is the official script, so sat-olck would be redundant (sat would always mean sat-olck), but IANA doesn't say to suppress any script.
@Amire80 @jhsoby What's your opinion about adding these codes, and should sat-olck be added or not?

While Ol Chiki is the script used in India, the Santali community in Bangladesh uses the Bengali script, and earlier books were written in the Latin script by British missionaries. https://en.wikipedia.org/wiki/Santali_Latin_alphabet

With the exception of sat-Olck (for the reasons @Mbch331 mentions) these all make sense to me, as long as there is use for them (and I trust @Bodhisattwa knows that best). So that's a "go" from me.

I'll add the codes except for sat-Olck. (@Bodhisattwa: Default script only means that omitting the script from the language code means the default script applies, so sat = sat-Olck.)

Change 633556 had a related patch set uploaded (by Mbch331; owner: Mbch331):
[mediawiki/extensions/WikibaseLexeme@master] Add lexeme language codes sat-latn, sat-beng, sat-orya (Santali in Latin, Bengali and Oriya scripts)

https://gerrit.wikimedia.org/r/633556

gerritbot added a project: Patch-For-Review. Oct 12 2020, 3:04 PM

I must agree with Bodhi here that having a code for sat-olck is still necessary as it is not guaranteed that Santali speakers outside of India will be able to read it. "Official" in India need not mean "official" in the other countries in which it is spoken, as a closer read of the article on the language should indicate. Besides, we already have separate language codes for a particular language and the scripts in which it is written, including the "default" (such as kk and kk-arab, kk-cyrl, kk-latn, or iu and ike-cans, ike-latn, and similarly for ks, ku, tg, and ug) so I don't see a problem with continuing this trend in the interest of preventing ambiguity.

@jhsoby Can you respond to Mahir256?

Mbch331 added a project: Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)). Oct 13 2020, 7:53 AM

Lydia_Pintscher moved this task from To Do (prioritised from top to bottom) to Peer Review on the Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)) board. Oct 13 2020, 7:54 AM

In T262967#6536922, @Mahir256 wrote:

I must agree with Bodhi here that having a code for sat-olck is still necessary as it is not guaranteed that Santali speakers outside of India will be able to read it. "Official" in India need not mean "official" in the other countries in which it is spoken, as a closer read of the article on the language should indicate. Besides, we already have separate language codes for a particular language and the scripts in which it is written, including the "default" (such as kk and kk-arab, kk-cyrl, kk-latn, or iu and ike-cans, ike-latn, and similarly for ks, ku, tg, and ug) so I don't see a problem with continuing this trend in the interest of preventing ambiguity.

If I understand correctly, Ol Chiki is used as the default script for anything Santali (in Wikimedia projects). So if we add sat-olck we will essentially have two different language codes (sat and sat-olck) that cover the exact same thing. The rest make sense since they're different from the default, but as long as there is a default script for a language (in our context), it doesn't make sense to me to add the language code with the script specified.
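
The redundancy argument can be sketched as follows, assuming (as this comment does) that Ol Chiki is treated as the default script for `sat`. The `DEFAULT_SCRIPT` table and the `canonical` helper are hypothetical and only illustrate the reasoning; they do not reflect how MediaWiki or WikibaseLexeme actually handle language codes.

```python
# Assumption from the discussion: Ol Chiki is the default script for Santali.
# This table is illustrative and not drawn from any registry or codebase.
DEFAULT_SCRIPT = {"sat": "olck"}


def canonical(code):
    """Drop the script part when it equals the language's assumed default.

    Only handles simple 'language' or 'language-script' codes.
    """
    lowered = code.lower()
    language, _, script = lowered.partition("-")
    if script and DEFAULT_SCRIPT.get(language) == script:
        return language
    return lowered


print(canonical("sat-Olck"))  # → sat
print(canonical("sat-beng"))  # → sat-beng
```

Under that assumption, `sat` and `sat-olck` collapse to the same code, while `sat-latn`, `sat-beng` and `sat-orya` stay distinct. If no default script is assumed, as argued earlier in the thread, the table would be empty and all four script codes would remain separate.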

Change 633556 merged by jenkins-bot:
[mediawiki/extensions/WikibaseLexeme@master] Add lexeme language codes sat-latn, sat-beng, sat-orya (Santali in Latin, Bengali and Oriya scripts)

https://gerrit.wikimedia.org/r/633556

Maintenance_bot removed a project: Patch-For-Review. Oct 13 2020, 10:10 AM

ItamarWMDE moved this task from Peer Review to Test (Verification) on the Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)) board. Oct 13 2020, 10:16 AM

ReleaseTaggerBot added a project: MW-1.36-notes (1.36.0-wmf.14; 2020-10-20). Oct 13 2020, 11:00 AM

Thanks everyone! This should go out next week :)
