Include "special" languages in language selector for monolingual text
Closed, Duplicate (Public)


The combobox selector shown when selecting a language for a monolingual text value does not list special language codes such as "mis", although entering the code blindly works. The selector should at least show the language codes, and ideally also a name for these special languages (doesn't CLDR have names for these?).

Special language codes, according to

  • und: For content whose language is not yet determined (undetermined)
  • mis: For content whose language is known, but has no language code (uncoded languages); we also use it for content whose language has a language code, but is not yet available on
  • mul: For content in multiple languages (multiple languages)
  • zxx: For content that is not linguistic (no linguistic content, not applicable)
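As a rough illustration of what the selector could display for these entries, here is a minimal sketch that pairs each special code with the short name given in parentheses in the list above and falls back to the bare code when no name is known. The names `SPECIAL_LANGUAGE_NAMES` and `selector_label` are hypothetical and are not part of Wikibase or CLDR.

```python
# Hypothetical fallback names, taken from the parenthesised descriptions above.
SPECIAL_LANGUAGE_NAMES = {
    "und": "undetermined",
    "mis": "uncoded languages",
    "mul": "multiple languages",
    "zxx": "no linguistic content",
}


def selector_label(code, names=None):
    """Return the text a selector entry could show for a language code.

    Shows "name (code)" when a name is known, otherwise just the code,
    so that a code without a name is still listed.
    """
    names = SPECIAL_LANGUAGE_NAMES if names is None else names
    name = names.get(code)
    return f"{name} ({code})" if name else code
```

With this sketch, `selector_label("mis")` gives a named entry, while an unnamed code is still listed under its code alone.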

Event Timeline

@Micru that would be nice, but we currently don't have a place to put that information. We could "glue" it to the language code, e.g. mis-Q555555. But that would not be a valid ISO code, so we would need to strip the suffix before using the code in RDF, etc. We would also need to create new UI widgets for picking the item.
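To make the "glue and strip" idea concrete, here is a minimal sketch of how such a suffix could be separated again before the code is passed on to RDF or similar consumers. It assumes the glued form is exactly `<code>-Q<number>` as in the example mis-Q555555; the function name `split_extended_code` is hypothetical, and nothing like it is confirmed to exist in Wikibase.

```python
import re

# Assumption: the item ID is glued on as a trailing "-Q<number>" suffix.
_ITEM_SUFFIX = re.compile(r"^(?P<code>.+)-(?P<item>Q[1-9][0-9]*)$")


def split_extended_code(value):
    """Split a glued code such as "mis-Q555555" into (code, item_id).

    Codes without an item suffix are returned unchanged with item_id None,
    so ordinary codes pass through untouched.
    """
    match = _ITEM_SUFFIX.match(value)
    if match is None:
        return value, None
    return match.group("code"), match.group("item")
```

Only the first element of the returned pair would be used where a plain language code is expected; the second carries the item reference for the UI.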

We are discussing the same issue for the representation of Lexicographical data, see T152019. I suppose we will have such "extended" language codes at some point, but it will take a while to get that right.

It's not just those special codes which are missing: any code which is only available for monolingual text is not shown (the list seems to be defined here).

CLDR only has names for some of the codes, but in T151269 I have already requested local English names for the ones which don't display a name at all or are clearly using the native name.

I think adding a second way to require a language would not be addressing the cause of the problem: Monolingual text already requires a language code. Almost all uses of mis are to represent languages which do have valid codes but are not in the list of languages, i.e. people are using mis because we don't allow them to use the correct code. In my opinion, we don't need a way to require an item, we need a way to allow people to use valid codes.

@Micru: Do you have some examples? I haven't seen many missing qualifiers; this query only finds a small number missing a P2439 qualifier. 4 of those are languages/dialects (so the qualifier would link to the current item), 1 is on the sandbox item, 1 appears to be completely wrong (it's clearly English), and the last 2 are using a different property as the qualifier. That doesn't cover all monolingual text properties (I can't find a way to query all of them that doesn't time out :( ), but these are the ones I think are most likely to use mis.

It seems that codes added for both labels and monolingual text (the ones defined here) also don't show up in the selector.