
evaluate creation of en-us for Wikidata monolingual strings
Open, Low · Public


Hello @GerardM,
I am passing on below the answer from the requester, MPF (who can't answer on Phabricator for technical reasons):
"American (en-us) is absolutely not the same as standard English (en). To say so is disgusting cultural imperialism, and highly objectionable and insulting to English people; as such it is wholly contrary to the ethos and standards of wikis. The language name and code must follow the original language, i.e., English (en) is the language of English people in England, and not the often very different variant spoken in the USA. Compare for example Portuguese (pt), which is used for the original European Portuguese (10.5 million speakers), while Brazilian Portuguese is pt-br despite its much greater population size (200 million) - a very comparable case to en and en-us.
As with the example cited earlier by Fralambert for fr-ca, many plants and animals have different names in English and American, and to insist that the standard English name is the American name - and that the actual English name is therefore incorrect - is highly insulting to English people.
To have an 'en-gb' is also inaccurate, as - although generally extremely similar - there are slight differences between English English, and Scottish English, Welsh English, Manx English, etc."

(request copied from T151186)

See also:
T33874: Preferences and lang codes should distinguish "English" from "American English"/"U.S. English"

Event Timeline

Esc3300 created this task. Jan 4 2017, 4:08 PM
Restricted Application added a subscriber: Aklapper. Jan 4 2017, 4:08 PM
Izno added a subscriber: Izno. Edited Jan 4 2017, 5:35 PM

en is en-ambiguous. en-gb is "standard English" in that it is English spoken by persons living in England and the other countries in Great Britain. User appears to have a battleground mentality on what he believes English is and what the rest of the world thinks English is.

And in fact, appears to be edit-warring on wiki, which suggests to me that administrative action may be necessary therein.

This discussion may be relevant to the request to add en-us; I like the suggestion to make it so that we can "tag" the aliases of relevance with a particular English variation. (I might suggest that would be low priority work.)

Esc3300 added a comment. Edited Jan 4 2017, 6:52 PM

It might be worth keeping in mind the difference (or the absence of a difference) between the following:

  1. language code for monolingual string
  2. language code for labels/descriptions
  3. language code for interface
  4. language code with fallback to another variant of the same language
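To make item 4 concrete, here is a minimal sketch of how a fallback between variants of the same language could behave. The `FALLBACKS` table and the `resolve_label` helper are hypothetical names made up for this illustration and are not taken from Wikibase or MediaWiki; the chains shown are assumptions, not the configured ones.

```python
# Hypothetical fallback table: each variant code lists the codes to try next.
# The entries are illustrative assumptions, not the real configuration.
FALLBACKS = {
    "en-us": ["en"],
    "en-gb": ["en"],
    "pt-br": ["pt"],
    "fr-ca": ["fr"],
}


def resolve_label(labels, requested):
    """Return (code, text) for the first code in the chain that has a label.

    `labels` maps language codes to label text. Returns None when neither the
    requested code nor any of its fallbacks has a label.
    """
    for code in [requested, *FALLBACKS.get(requested, [])]:
        if code in labels:
            return code, labels[code]
    return None


# Placeholder label texts, chosen only to show which code was picked.
labels = {"en": "label stored under en", "en-us": "label stored under en-us"}
print(resolve_label(labels, "en-us"))  # the variant has its own label
print(resolve_label(labels, "en-gb"))  # no en-gb label, so the chain falls back to en
```

With a scheme like this, adding en-us would not change what readers of en or en-gb see; it would only give en-us its own slot ahead of en in the chain.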

Hmm, why is this assigned to Lea_Lacroix_WMDE ?
This sounds like it has already been answered in T151186#2916448. If not, it sounds like something for the language committee instead.

Esc3300 added a comment. Edited Jan 4 2017, 7:18 PM

She does community communications on Wikidata dev stuff. There is an outline of the process at . Apparently she doesn't think it's answered by that and needs some feedback for the community. In any case, it's not about fr-ca.

Indeed, I passed on the request, but the language committee should give advice on this.

Esc3300 renamed this task from create en-us to evalute creation of en-us for Wikidata monolingual strings. Jan 16 2017, 12:06 PM
Esc3300 updated the task description.
Esc3300 renamed this task from evalute creation of en-us for Wikidata monolingual strings to evaluate creation of en-us for Wikidata monolingual strings. Jan 16 2017, 12:16 PM
Lydia_Pintscher triaged this task as Low priority. Jun 11 2017, 5:23 PM

@GerardM @jhsoby Can one of you give the official opinion of the language committee on this request?

Restricted Application added a subscriber: PokestarFan. Aug 3 2017, 5:56 PM
Psychoslave added a comment. Edited Nov 12 2017, 10:08 AM

I replied to which is linking here as follows:

While I share the concern about the symptom, I'm not sure it's really a technical problem; it seems more an editorial one about the accepted language granularity. That is, the problem surely goes far beyond en-US, as I would expect that even within the US you will find regional linguistic variations. Actually, even in a small town you might find notable dialectal variations. So recognizing en-US would surely be fine, but this wouldn't solve the underlying problem.

To give a more concrete example and take some distance from the American English question, which heavily mixes the linguistic granularity concern with the hegemony concern, I propose to look at the case of Alemannic. It brings together many of the problems that arise when trying to categorise discourse practices under the abstract notion of language.

First, the English Wikipedia article presents it as a group of dialects of the Upper German branch of the Germanic language family. ISO 639-3 distinguishes four languages for this group: gsw (Swiss German), swg (Swabian German), wae (Walser German) and gct (Alemán Coloniero, spoken since 1843 in Venezuela).

Additionally, the Alsatian dialect is clearly a part of this group. Under the ISO 639-3 segmentation it is coded under gsw. There is an Alemannic version of Wikipedia, but it is hosted under the als subdomain, which obviously comes from Alsatian.

Finally, a dialect of Alsatian German is spoken in Amish communities of the United States and Canada.

I'm curious to know how the language committee deals with that kind of problem, and eager to read any documentation you might point me to regarding this topic, and processes used by the language committee.

The Alsatian issue predates the language policy, and as such it is a "fait accompli". We have to live with it. A similar situation will no longer come into being because it will not be approved.


@GerardM So this task should also be declined?

Jc86035 added a subscriber: Jc86035. Dec 4 2018, 3:15 PM

@Liuxinyu970226 I don't think so; GerardM didn't clearly indicate that, and presumably "similar situation" would refer to new wikis for macrolanguages, of which English isn't one.

The situation with en-us is different to the situation for Alemannic, and the granularity issue is different for Wikidata because it's a multilingual project to begin with. (I also think the LangCom decision – wherever it is, because I can't find it – was wrong, but that's not why I think the task shouldn't be declined.)

en-US is perfectly legitimate. The stuff about Alsatian is not.