
Enable all ISO 639-3 codes on Wikidata
Open, Needs Triage, Public, Feature

Description

Feature summary (what you would like to be able to do):
Enable all ISO 639-3 codes on Wikidata, and also Glottolog codes if possible. Preferably, we can do this for all of Wikidata, but Wikidata:Lexicographical data is a good place to start.
Wikidata currently supports only a few hundred language codes, whereas there are about 7,000 languages spoken in the world.
This creates a situation where editors from regions such as Africa and many parts of Asia cannot add their languages to Wikidata without having to resort to the wildcard code [mis].

A list of ISO 639-3 codes imported from Ethnologue (22nd edition) can be found at:
https://en.wiktionary.org/wiki/Appendix:ISO_639-3_codes

This list of codes can be imported.
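
As a rough illustration of what such an import might involve, the sketch below compares a candidate list of codes against the codes already supported and reports the ones still missing. The function name `missing_codes` and the sample lists are hypothetical and not part of Wikidata or MediaWiki; the only assumptions about the standard are that ISO 639-3 codes are three lowercase ASCII letters and that [mis], [mul], [und] and [zxx] are special-purpose codes rather than individual languages.

```python
import re

# ISO 639-3 identifiers are exactly three lowercase ASCII letters.
ISO_639_3_PATTERN = re.compile(r"[a-z]{3}")

# Special-purpose codes that do not name an individual language.
SPECIAL_CODES = {"mis", "mul", "und", "zxx"}


def missing_codes(candidate_codes, supported_codes):
    """Return the well-formed, non-special candidate codes that are
    not in supported_codes, sorted alphabetically.

    Hypothetical helper for illustration only.
    """
    supported = set(supported_codes)
    return sorted(
        code
        for code in set(candidate_codes)
        if ISO_639_3_PATTERN.fullmatch(code)
        and code not in SPECIAL_CODES
        and code not in supported
    )


if __name__ == "__main__":
    # Made-up sample data: "NUP" and "en" are malformed, "mis" is special.
    candidates = ["nup", "yor", "eng", "mis", "NUP", "en"]
    supported = ["eng", "fra"]
    print(missing_codes(candidates, supported))
```

In practice the candidate list would come from the appendix linked above and the supported list from whatever Wikidata currently accepts; how either list is actually obtained is left open here.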

Glottolog codes can be imported from:
https://glottolog.org/glottolog/language

Glottolog codes include dialects, which ISO does not support, and include many recently described languages that have no ISO codes.

Steps to reproduce (a list of clear steps to create the situation that made you report this, including full links if applicable):

See below.

Use case(s) (describe the actual underlying problem which you want to solve, and not only a solution):

Creating Wikidata items for the world's 6,000+ little-known languages is currently very clunky and discourages users from contributing data in those languages.

I tried creating a lexeme for bàbò (the Nupe (Q36720) word for Lagenaria siceraria (Q1277255)), but technical restrictions prevented me from doing so at first, since the ISO code [nup] is not currently supported by Wikidata. Initially, I also could not add the Nupe name to Lagenaria siceraria (Q1277255), since Wikidata items cannot be linked to Incubator pages. In the future, I would like to add lexemes for dozens of African languages that do not yet have any officially launched wikis, but it appears that Wikidata cannot yet support this.

At Wikidata talk:Lexicographical data, So9q said that lexemes for which we do not yet have selectable language codes can be given "mis" as the language code. He created bàbò (L585993) as a test. The template for the "create new lexeme" page could be improved.

For example, see a list of fish and plant species names in the Day language. The goal is to enable Day, or any other language with an ISO code, to be added without having to resort to [mis].
https://en.wikipedia.org/wiki/Day_language