Add a key to Z60 for the Wikidata item for a language
Open, Low, Public, Feature

Description

Feature summary (what you would like to be able to do and where):

An instance of Z60 should contain not only a language code (Z60K1, most likely compliant with BCP 47) and aliases thereof (Z60K2), but also an identifier for the Wikidata item for that language (most likely as Z60K3).
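As a rough illustration of the proposed shape, a Z60 instance could be modelled as below. This is a simplified sketch, not the canonical ZObject serialization: the plain Python list for Z60K2 and the `has_valid_item` helper are assumptions made for the example, and Z60K3 is only the key proposed in this task. Z1003/es comes from the examples further down; Q1321 is the Wikidata item for Spanish.

```python
import re

# Simplified sketch of a Z60 instance as a Python dict.
# Z60K1 (code) and Z60K2 (aliases) are described in this task;
# Z60K3 is the PROPOSED key and does not exist yet.
spanish = {
    "Z1K1": "Z60",
    "Z60K1": "es",     # language code
    "Z60K2": [],       # aliases of the code (none assumed here)
    "Z60K3": "Q1321",  # proposed: Wikidata item for the language
}

# Wikidata item identifiers are "Q" followed by a positive integer.
QID_PATTERN = re.compile(r"Q[1-9][0-9]*")


def has_valid_item(z60):
    """Hypothetical check: is Z60K3 present and shaped like an item ID?"""
    item = z60.get("Z60K3")
    return isinstance(item, str) and QID_PATTERN.fullmatch(item) is not None
```

A language object without the new key would simply fail this check, so existing Z60 instances could be migrated gradually.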

Use case(s) (list the steps that you performed to discover that problem, and describe the actual underlying problem which you want to solve. Do not describe only a solution):

Keys into multilingual Z12/texts and Z32/stringsets on Wikifunctions are represented by ZObject identifiers. Mapping those identifiers to Wikidata items indirectly, via language codes, is error-prone in many circumstances: different parts of the Wikimedia universe support different sets of language codes, and even the parts that share a code may use it differently. Making the mapping from language objects to Wikidata items explicit within those objects would therefore alleviate such errors.

In some specific scenarios, a set of language objects that share the same set of Wikidata lexemes would benefit from a Wikidata item key, because a reference to the same item on multiple language objects would help establish a link between them:

  • different script variants of a language, which may have divergent language codes (e.g. Z1657/pa and Z1083/pnb for Punjabi);
  • different romanization standards for a language (e.g. those represented by the Wikidata items Q559173 and Q56929 for Z1221/nan, in addition to Z1647/nan-hani); and
  • different regional variants for a language (e.g. Z1003/es, Z1127/es-150, Z1547/es-419, and Z1133/es-mx).
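The link described in these scenarios can be sketched as a grouping of language objects by item. The ZIDs and codes are the ones listed above; the item values are assumptions for illustration (Q1321 is the Wikidata item for Spanish, while `Q_PUNJABI` is a placeholder string, not a real identifier), and both function names are hypothetical.

```python
from collections import defaultdict

# (ZID, language code, assumed value of the proposed Z60K3 key)
LANGUAGE_OBJECTS = [
    ("Z1003", "es", "Q1321"),
    ("Z1127", "es-150", "Q1321"),
    ("Z1547", "es-419", "Q1321"),
    ("Z1133", "es-mx", "Q1321"),
    ("Z1657", "pa", "Q_PUNJABI"),   # placeholder item, not a real QID
    ("Z1083", "pnb", "Q_PUNJABI"),  # placeholder item, not a real QID
]


def group_by_item(objects):
    """Map each Wikidata item to the ZIDs of the language objects citing it."""
    groups = defaultdict(list)
    for zid, _code, item in objects:
        groups[item].append(zid)
    return dict(groups)


def related_languages(zid, objects):
    """Return the other language objects that share this object's item."""
    groups = group_by_item(objects)
    for members in groups.values():
        if zid in members:
            return [other for other in members if other != zid]
    return []
```

With the data above, `related_languages("Z1657", LANGUAGE_OBJECTS)` finds Z1083 even though the codes `pa` and `pnb` have nothing in common as strings.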

Benefits (why should this be implemented?):

Functions that operate with respect to particular languages can, for example, choose lexemes based on the Wikidata item field and choose representations on those lexemes' forms based on the code field.

(tfsl currently deals with languages as code-item pairs, and code in Ninai and Udiron uses the existence of these pairs extensively.)
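A minimal sketch of that two-step selection follows, assuming a language is a code-item pair. The classes and the `representations_for` function are hypothetical and are not tfsl's actual API; `Q_PUNJABI` and the bracketed spellings are placeholders. The sketch relies only on the fact that a Wikidata lexeme has a language item and that its forms carry representations keyed by language code.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Language:
    code: str  # corresponds to Z60K1
    item: str  # corresponds to the proposed Z60K3


@dataclass
class Form:
    representations: dict  # language code -> text


@dataclass
class Lexeme:
    language_item: str
    forms: list = field(default_factory=list)


def representations_for(lexemes, language):
    """Choose lexemes by item, then form representations by code."""
    results = []
    for lexeme in lexemes:
        if lexeme.language_item != language.item:
            continue  # lexeme belongs to a different language
        for form in lexeme.forms:
            text = form.representations.get(language.code)
            if text is not None:
                results.append(text)
    return results


GURMUKHI = Language("pa", "Q_PUNJABI")    # placeholder item
SHAHMUKHI = Language("pnb", "Q_PUNJABI")  # placeholder item

LEXEMES = [
    Lexeme("Q_PUNJABI", [Form({"pa": "<Gurmukhi spelling>",
                               "pnb": "<Shahmukhi spelling>"})]),
    Lexeme("Q1321", [Form({"es": "agua"})]),
]
```

Both Punjabi language objects select the same lexeme through the shared item, and the code then picks the script-specific representation.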

Event Timeline

Consider making this a Z2 key so that any persistent object can reference a corresponding Wikidata item.

This would effectively be a secondary identity key, which we'd previously hand-waved-for-now and said we weren't yet supporting. Not sure of the implications in our orchestrator code, if any.

> Consider making this a Z2 key so that any persistent object can reference a corresponding Wikidata item.

I think we'd rather implement this via identity keys with specific semantic intent and a one-to-one mapping, rather than a catch-all "roughly the same as". For instance, Q32043 ("addition") would be the target of many Functions: one that takes two Integers, one that takes five, one that takes two floats, one that takes a vector of hypercomplex numbers, etc.

@Jdforrester-WMF Hmmm… “specific semantic intent” is good, but a link at the Z2 level is a semantic copula. It specifically asserts identity, not “roughly the same as”. Accordingly, it is much more likely to be appropriate for types and identities than for functions, implementations and tests.

I’m not sure I recall the specific hand-waving around secondary identities. In my view, Wikidata is the natural repository for these, so I would expect support for them to grow out of Wikidata integration. However, I consider Wikidata identifiers to be a special case, not least because their status as secondary identifiers is debatable at the WMF level (although they are clearly secondary outside of Wikidata, in the physical sense).

The Z60 case is a special special (special?) case, of course… but one way or another, I continue to agree with @Mahir256.

As part of a backlog review, we’re looking at tasks in the Product Backlog that haven’t had activity in the past 12 months to help keep priorities clear and up to date.

We plan to move this task to “No Current Plans” after 15 May 2026. If you believe this task is relevant right now, please comment and provide context on why before then.

Many thanks!

In Abstract Wikipedia, text that is not in the target language should be accompanied by a reference to the alternative language, for example as a link, as text, or as a footnote. Without a link from the Natural language to its Wikidata item, none of these options is viable. Z29749 is one of the most widely used functions on Abstract Wikipedia (called from over 400 pages), but prefixing the Latin-script language tag (when required) produces unsatisfactory rendered content in any language. The Z11 display function, Z21583, adopts the same approach, leading to unsatisfactory results for any embedded function that returns Monolingual text when the result is not in the requested language.