[Story] Mono-lingual text datatype should support "no linguistic content" and "undetermined language"
Closed, Resolved · Public

Description

It should be possible to set the language to "no linguistic content" (ISO 639-2: "zxx") and to "undetermined" (ISO 639-2: "und").

Use case: property P1684, for inscriptions on artworks or ancient artefacts.

Details

Reference
bz70205
bzimport raised the priority of this task from to Normal.
bzimport set Reference to bz70205.
bzimport added a subscriber: Unknown Object (MLST).
bzimport created this task. Aug 30 2014, 9:40 AM

Can you please link to a specific example? Thanks!

Zoloinwiki wrote:

Yes, for "unknown language", not that many, but for example:
https://www.wikidata.org/wiki/Q13534364 inscription: 'ΒΟΥΗΛΑ·ΖΟΑΠΑΝ·ΤΕΣΗ·ΔΥΓΕΤΟΙΓΗ·ΒΟΥΤΑΟΥΛ·ΖΩΑΠΑΝ·ΤΑΓΡΟΓΗ·ΗΤΖΙΓΗ·ΤΑΙΣΗ', language: unknown. (qualifier:writing system: Greek alphabet)

For "no linguistic content", that would be short texts like signatures or inventory numbers that may have significance for art historians but are not really in any language, like https://commons.wikimedia.org/wiki/File:Egon_Schiele_060.jpg, inscription: 'S10' (writing system: Latin alphabet)

Snipre added a subscriber: Snipre. Nov 29 2014, 11:59 AM
Lydia_Pintscher removed a subscriber: Unknown Object (MLST).
daniel added a comment. Dec 5 2014, 2:32 PM

For "no linguistic content", the "string" data type seems like a better match. But I suppose the intention here is to always use the same property to represent the inscription text, no matter whether it's natural language or not (what about symbols though? QR codes? pictograms?..)

In general, it should be possible to configure any additional languages the admin of a wikibase install likes. We have to think about how the names for such languages get localized, and how we handle things like directionality marks. Also, handling the language list as such is not a trivial challenge if it gets big.

While we can debate the necessity of "no linguistic content" (ISO 639-2: "zxx"), which can be described with the string datatype, we do need "undetermined" (ISO 639-2: "und").

Could you add the two other special ISO 639-2 codes: "mul" (when a string contains multiple languages) and maybe "mis" too?
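
For reference, the four special-purpose ISO 639-2 codes mentioned in this thread can be written down as a small lookup table. This is only an illustrative Python sketch: the names `SPECIAL_LANGUAGE_CODES` and `describe_special_code` are made up here and are not part of Wikibase.

```python
from typing import Optional

# ISO 639-2 special-purpose codes with their standard English names.
SPECIAL_LANGUAGE_CODES = {
    "mis": "uncoded languages",
    "mul": "multiple languages",
    "und": "undetermined",
    "zxx": "no linguistic content; not applicable",
}


def describe_special_code(code: str) -> Optional[str]:
    """Return the meaning of a special code, or None for any other code."""
    return SPECIAL_LANGUAGE_CODES.get(code.lower())


if __name__ == "__main__":
    for code in ("und", "zxx", "mul", "mis", "en"):
        print(code, "->", describe_special_code(code))
```

An ordinary language code such as "en" is deliberately not in the table, so the helper returns None for it.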

Restricted Application added a subscriber: Aklapper. Jul 27 2015, 5:51 AM
Jonas renamed this task from Mono-lingual text datatype should support "no linguistic content" and "undetermined language" to [Story] Mono-lingual text datatype should support "no linguistic content" and "undetermined language". Sep 10 2015, 7:39 PM
Jonas set Security to None.
adrianheine updated the task description. Dec 7 2015, 1:47 PM
adrianheine claimed this task.
adrianheine removed a project: Need-volunteer.

Change 257316 had a related patch set uploaded (by Adrian Lang):
Allow special language codes in monolingual text values

https://gerrit.wikimedia.org/r/257316

Change 257316 merged by jenkins-bot:
Allow special language codes in monolingual text values

https://gerrit.wikimedia.org/r/257316

adrianheine closed this task as Resolved. Jan 26 2016, 9:44 AM

und, mis, mul and zxx are now available at Wikidata.org. They are not present in the suggester, but if you enter them in the input field and save, they are recognized.

Apparently, a fix around this did not get deployed, so the suggester might not even allow these four language codes yet. Should get better next week.
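
To illustrate what accepting these codes means for a value, here is a minimal Python sketch that builds a monolingual text value for the 'S10' inscription mentioned earlier in the thread. It is not the code from Gerrit change 257316. The helper `make_monolingual_text` and the short `regular_codes` list are hypothetical, and the text/language layout follows the Wikibase JSON data value format as I understand it.

```python
import json

# The four special codes named in this task.
SPECIAL_CODES = frozenset({"und", "mis", "mul", "zxx"})


def make_monolingual_text(text, language, regular_codes=frozenset({"en", "de", "fr"})):
    """Build a monolingual text value, accepting regular or special codes.

    `regular_codes` is a stand-in for whatever language list an
    installation is configured with (an assumption for this sketch).
    """
    if language not in regular_codes and language not in SPECIAL_CODES:
        raise ValueError(f"unsupported language code: {language!r}")
    return {
        "type": "monolingualtext",
        "value": {"text": text, "language": language},
    }


# Inventory number on a painting: text with no linguistic content.
inscription = make_monolingual_text("S10", "zxx")

if __name__ == "__main__":
    print(json.dumps(inscription, ensure_ascii=False))
```

Keeping the special codes in a separate set from the regular language list reflects the point raised above that an installation may want to configure additional languages independently.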