Add monolingual language code en-in (Indian English)
Closed, Declined
Public

Description

Please add en-in as "Indian English".

For context on this dialect see
https://www.wikidata.org/wiki/Q1348800
https://en.wikipedia.org/wiki/Indian_English

For comparison consider that Phabricator (or whatever controls the Wikidata language process) has already logged
British English (en-gb)
Canadian English (en-ca)

Indian English is, of course, more different and spoken by many more people than these other two. I expect that adding en-in is noncontroversial in the context of these other two existing already.

My objective for this is to get Indian English text noted as such in Wikidata language property fields. I asked about this at the Wikidata Project Chat at
https://www.wikidata.org/w/index.php?title=Wikidata:Project_chat&oldid=815344802#Which_property_is_the_origin_of_automated_language_checks?
and the answer was to request here on Phabricator.

Event Timeline

Bluerasberry updated the task description.

Hi, please explain where / to which software to add this (as this task has no project associated). Is this about MediaWiki?

Is this about https://translatewiki.net/wiki/Translatewiki.net_languages#Language_variants ?

@Aklapper

Here is a screenshot showing where this language list should appear in Wikidata.
https://commons.wikimedia.org/wiki/File:Language_list_for_Wikidata.png

My objective is making "Indian English" an option in Wikidata drop down menus which ask "what language is this text". I asked at Wikidata project chat and was directed that Phabricator manages this process.

Here are some Phabricator tickets which seem to be trying to manage the same issue
https://phabricator.wikimedia.org/T137808
https://phabricator.wikimedia.org/T155417
https://phabricator.wikimedia.org/T155425

Am I in the correct place to address this issue?

Ammarpad changed the task status from Open to Stalled. Sep 23 2019, 6:27 AM
Ammarpad removed Ammarpad as the assignee of this task.
Ammarpad triaged this task as Lowest priority.
Ammarpad added a subscriber: Ammarpad.

How to get feedback from LangCom on this task? Are there any contact workflows that the task author should follow?

@jhsoby & @Amire80 are members of LangCom and can speak on behalf of LangCom. Jon/Amir: Can you respond to the request?

Are there any examples of where the values in "English" and in "Indian English" would be different?

While referring to a movie theatre, in Indian English it normally would be called a cinema hall.
In elections, Americans will mention voting blocs while Indians talk about vote banks.

Then you have British-like spellings for things like "metre", "organisation", or "colour".

With its unique terms and separate spelling conventions, I would certainly say this is more than justified.

What system message would be different in Indian English compared to British English?

Hoi,
No problems with Indian English, we also support Australian English. I can
imagine that there are differences in Indian English but that takes nothing
away from the legitimacy of this request.
Thanks,

GerardM

Are there any examples of where the values in "English" and in "Indian English" would be different?

While referring to a movie theatre, in Indian English it normally would be called a cinema hall.

This sounds more like something for labels and not for monolingual values, although I wonder whether it's really necessary to define a whole language for this.

In elections, Americans will mention voting blocs while Indians talk about vote banks.

There are two English Wikipedia articles, and two corresponding Wikidata items. It's not a thing for which I'd define a different language.

Then you have British-like spellings for things like "metre", "organisation", or "colour".

With its unique terms and separate spelling conventions, I would certainly say this is more than justified.

Sure, but none of these are specific examples of something that can't be done on Wikidata or another Wikimedia project at the moment without defining a new language.

To be clear, I'm not opposed to adding it in principle, but it is relevant to understand what exactly this is useful for.

If MediaWiki messages can be different in a way that can't be covered by en (or en-gb), then we can have it as a user interface language similarly to en-gb, en-ca, etc., and if I'm not mistaken, this will automatically apply also to monolingual codes.

If different MediaWiki messages are not necessary, but monolingual values are, that's fine, but I'd like to see specific examples.

If MediaWiki messages can be different in a way that can't be covered by en (or en-gb), then we can have it as a user interface language similarly to en-gb, en-ca, etc., and if I'm not mistaken, this will automatically apply also to monolingual codes.

I would prefer to see that outcome in all honesty. Indian English has millions of speakers worldwide and is much more popular than British or Canadian English. I never really understood why users shouldn't be allowed to use it for their interface language setting.

Sure, I am happy to do it. If we have en-ca, en-in is legitimate, too. But as I said, I'd first love to see examples of messages that can't be covered by other variants.

Wikidata has lexemes in addition to items. Lexemes need language codes to express which language a lexeme belongs to. To record that "vote bank" is a term in en-in that corresponds to "vote block" in en-us, we need lexemes.

This ticket appears to be for monolingual text, not lexemes.

And for lexemes, I wouldn't use en-us or en-in for your example at all, because they are not spelling variants. I would enter both as English with the properties location of sense usage and synonym on the senses.
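
To make the suggestion above concrete, here is a minimal sketch in plain Python. It is not Wikibase's actual data model or API: the class names, field names, glosses, and location values are all hypothetical and only mirror the idea of entering both terms as English and putting the regional information on the senses.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Sense:
    # Hypothetical stand-in for a lexeme sense; glosses are illustrative.
    gloss: str
    location_of_sense_usage: List[str] = field(default_factory=list)
    synonyms: List[str] = field(default_factory=list)


@dataclass
class Lexeme:
    lemma: str
    language: str  # both entries use plain "en", not en-in or en-us
    senses: List[Sense] = field(default_factory=list)


vote_bank = Lexeme(
    lemma="vote bank",
    language="en",
    senses=[Sense(
        gloss="group of voters who reliably support one side",
        location_of_sense_usage=["India"],
        synonyms=["voting bloc"],
    )],
)

voting_bloc = Lexeme(
    lemma="voting bloc",
    language="en",
    senses=[Sense(
        gloss="group of voters who reliably support one side",
        location_of_sense_usage=["United States"],
        synonyms=["vote bank"],
    )],
)


def lemmas_used_in(lexemes: List[Lexeme], place: str) -> List[str]:
    """Return lemmas that have at least one sense marked as used in `place`."""
    return [
        lx.lemma
        for lx in lexemes
        if any(place in s.location_of_sense_usage for s in lx.senses)
    ]
```

With this shape, the regional difference is recoverable by querying the senses (for example `lemmas_used_in([vote_bank, voting_bloc], "India")`), without a separate language code for either term.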

For monolingual text, are there any reasons why this shouldn't move ahead? If later someone translates the interface as well, this shouldn't create any problems.

@Bluerasberry as you requested this, what's your view on this?

What system message would be different in Indian English compared to British English?

Can we have en-in for labels/descriptions without having it as an interface language?

Yes. It's possible to add languages for labels/aliases/descriptions that aren't used for the interface. But there needs to be clear distinction between en and en-in. We don't want en-in so we can copy en.
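
As a rough illustration of the "we don't want en-in so we can copy en" concern, the sketch below flags entries whose en-in term is identical to the en term. It works on a plain dictionary, not on Wikidata itself; the item identifiers, the function name, and the sample terms are hypothetical.

```python
from typing import Dict, List


def redundant_variant_terms(
    terms: Dict[str, Dict[str, str]],
    base: str = "en",
    variant: str = "en-in",
) -> List[str]:
    """Return ids whose `variant` term is just a copy of the `base` term."""
    return [
        item_id
        for item_id, by_lang in terms.items()
        if variant in by_lang and by_lang[variant] == by_lang.get(base)
    ]


# Hypothetical sample data; the ids are placeholders, not real Wikidata items.
sample_terms = {
    "item-A": {"en": "movie theater", "en-in": "cinema hall"},
    "item-B": {"en": "water", "en-in": "water"},
    "item-C": {"en": "river"},
}

copies = redundant_variant_terms(sample_terms)
```

Here only `item-B` would be reported, since its en-in term adds nothing beyond the en term.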

I have said before at T201509#4488401: all setting of language codes should be unified to one place outside Wikibase.

Also, we need to distinguish between "interface languages" and "languages known by MediaWiki": the former includes only languages with localization enabled, and the latter (on which Wikidata term/monolingual/lexeme languages should be based) would ideally include all 7000+ languages in the IANA registry.
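
The distinction can be sketched as two sets, one a subset of the other. This is only an illustration under assumptions: the set contents are a tiny hand-picked sample rather than the real registry, the function names are hypothetical, and listing en-in as a known language reflects what this task proposes, not the current state of MediaWiki.

```python
# Languages the software knows about at all (ideally the full IANA registry).
# Tiny illustrative sample; en-in is included only as the proposed addition.
KNOWN_LANGUAGES = {"en", "en-gb", "en-ca", "en-in"}

# Languages with localization enabled, selectable as an interface language.
INTERFACE_LANGUAGES = {"en", "en-gb", "en-ca"}


def can_tag_text(code: str) -> bool:
    """True if the code could label a term, monolingual text, or lexeme."""
    return code in KNOWN_LANGUAGES


def can_use_as_interface(code: str) -> bool:
    """True if the code has localization and can be an interface language."""
    return code in INTERFACE_LANGUAGES


# Every interface language must also be a known language, never the reverse.
is_consistent = INTERFACE_LANGUAGES <= KNOWN_LANGUAGES
```

Under this split, a code such as en-in could be accepted for tagging text while remaining unavailable as an interface language until someone provides localization.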