
[Story] Add a new datatype for multilingual text
Open, MediumPublic


We need to provide a way to save the same statement in multiple languages. Example uses are the motto of a country that has several primary languages, and usage notes and comments (Property:P2315) that should be supplied in multiple languages with fallback.
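As a rough illustration of "multiple languages and fallback", here is a minimal sketch of how a single multilingual value might be resolved for a reader. Everything in it is an assumption made for illustration: the `FALLBACKS` chains, the `resolve` helper and the sample texts are hypothetical and are not part of Wikibase.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical fallback chains; real chains would come from the wiki's
# language configuration, not from a hard-coded table like this one.
FALLBACKS: Dict[str, List[str]] = {
    "de-at": ["de", "en"],
    "de": ["en"],
    "fr": ["en"],
}


def resolve(texts: Dict[str, str], lang: str) -> Optional[Tuple[str, str]]:
    """Return (language, text) for lang, or for its first available fallback."""
    for candidate in [lang, *FALLBACKS.get(lang, [])]:
        if candidate in texts:
            return candidate, texts[candidate]
    return None  # no text in the requested language or any of its fallbacks


# One multilingual value: at most one text per language (placeholder texts).
usage_note = {
    "en": "Use this property for ...",
    "de": "Diese Eigenschaft verwenden für ...",
}

print(resolve(usage_note, "de-at"))  # falls back from de-at to de
print(resolve(usage_note, "fr"))     # falls back from fr to en
print(resolve(usage_note, "ja"))     # no chain defined here, so nothing is found
```

The point of the sketch is only that fallback is well defined when the value is known to be one text in several languages; with unrelated monolingual strings there is no guarantee that the fallback text says the same thing.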

Event Timeline

Lydia_Pintscher raised the priority of this task from to Medium.
Lydia_Pintscher updated the task description.
Lydia_Pintscher added a project: Wikidata.
Lydia_Pintscher set Security to None.
Lydia_Pintscher renamed this task from new datatype multilingual text to add a new datatype for multilingual text. Mar 5 2015, 11:57 AM
Jonas renamed this task from add a new datatype for multilingual text to [Story] Add a new datatype for multilingual text. Aug 13 2015, 2:54 PM
Jonas removed a project: Epic.

Is there still a point in having this? We could merely improve display of various languages for properties with multiple monolingual strings.

Clearly yes. If we have two sets of monolingual strings in different languages, we want to know which set an individual string belongs to.
Multilingual text should be used if:

  • The texts in different languages are translations of each other, and only one of them (any one) is needed, like labels and descriptions (e.g. P2559)
  • The text naturally exists in different languages with at most one text per language, or, if there is more than one per language, the values can easily be split into sets (e.g. P2275)

Multilingual text should not be used if:

  • The text is in only one language (e.g. P1922)
  • There may be more than one text in one language, and the texts in different languages have no direct relationship to each other (e.g. P1843)

I agree about P1922 and P1843.

P2559 (Wikidata usage instructions) seems to be mainly a GUI problem with descriptions. I suppose that could be solved with a new datatype.

Not sure about P2275.

Perhaps the set problem could be solved with qualifiers, too.

Another possible use case: multiple sitelinks to the same site. For example, Wikidata may have a project page for postal codes in English, and someone wants a similar page in German. But the community does not want a direct translation, because the English page may want to cover the whole world, while the corresponding German page may want to concentrate only on the aspects specific to Germany.

In short, we end up with multiple pages about the same subject, just like Wikipedia, that should have "sitelinks", but all pages live on the same wiki, and all would like to share the same wikibase item.

P.S. (tangent) TBH, I am not a big fan of language-based content division because languages do not map well to territories, but this may be the reality of that community, e.g. if postal codes are similar in Austria, Switzerland, ...

Note that another example of a workaround is (inverse label item), which uses the labels of some items to store multilingual text. This is a really bad hack, as it is prone to misuse.

Is there still a point in having this? We could merely improve display of various languages for properties with multiple monolingual strings.

The most important point against the current widely used workaround is: a set of monolingual texts is semantically different from a multilingual text.
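To make the semantic difference concrete, here is a small sketch contrasting the two shapes of data. The class names, the `translations_of` helper and the sample notes are hypothetical and only illustrate the argument; they do not describe the actual Wikibase data model.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class MonolingualText:
    language: str
    text: str


@dataclass
class MultilingualText:
    texts: Dict[str, str]  # at most one text per language


# Workaround: a flat set of independent monolingual statements.
# Nothing records which German string goes with which English string.
flat: List[MonolingualText] = [
    MonolingualText("en", "first note"),
    MonolingualText("de", "erste Anmerkung"),
    MonolingualText("en", "second note"),
    MonolingualText("de", "zweite Anmerkung"),
]

# Multilingual values: each value groups the texts that belong together.
grouped: List[MultilingualText] = [
    MultilingualText({"en": "first note", "de": "erste Anmerkung"}),
    MultilingualText({"en": "second note", "de": "zweite Anmerkung"}),
]


def translations_of(values: List[MultilingualText], language: str, text: str) -> Dict[str, str]:
    """Return all language versions of the value that contains the given text."""
    for value in values:
        if value.texts.get(language) == text:
            return value.texts
    return {}


# With grouped values the German counterpart of a note can be looked up directly.
print(translations_of(grouped, "en", "second note")["de"])

# With the flat set, filtering by language yields two candidates and no way to choose.
print([m.text for m in flat if m.language == "de"])
```

Qualifiers could carry the grouping in the flat representation, as suggested earlier in the thread, but then every consumer has to know and follow that convention.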