
Change $wgLexemeLanguageCodePropertyId from Wikidata mapping P218 to P305
Open, Needs Triage, Public

Assigned To
None
Authored By
Tarrow
Wed, Jan 21, 11:51 AM

Description

Background

Wikibase has a special property $wgLexemeLanguageCodePropertyId that facilitates entry of Lemmas.
Normally when entering a Lemma, a user would have to choose a language Item and then enter a language code manually.

But if the language Item has a statement with $wgLexemeLanguageCodePropertyId, then the code is inferred from the value of that statement (one step fewer).

The language codes are validated against a list that is configured separately (see at your own risk).

If a language Item has a language code statement assigned with a value that is not on the list, the user will still have to enter a meaningful language code manually.
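
The behaviour described above can be sketched roughly as follows. This is an illustration only, not the actual WikibaseLexeme code: the function name, the data shapes, and the contents of VALID_LANGUAGE_CODES are all assumptions made for the example, and the task does not say whether the real implementation looks at more than one statement.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for the separately configured list of valid codes.
VALID_LANGUAGE_CODES = {"en", "de", "lv", "en-GB"}


def infer_language_code(
    item_statements: Dict[str, List[str]],
    code_property_id: Optional[str],
) -> Optional[str]:
    """Return a code to prefill, or None if the user must enter one manually.

    item_statements maps a local property ID to the string values of the
    language Item's statements for that property (assumed shape).
    code_property_id plays the role of $wgLexemeLanguageCodePropertyId.
    """
    if code_property_id is None:
        # The special property is not configured: always manual entry.
        return None
    for value in item_statements.get(code_property_id, []):
        if value in VALID_LANGUAGE_CODES:
            return value
    # No statement, or only values that are not on the list.
    return None
```

For example, an Item whose only code statement holds a value outside the list yields None, which corresponds to the case where the user still has to type a code.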

Even more background

On Wikidata, this special property $wgLexemeLanguageCodePropertyId used to be configured as P218, but was switched to P305 in October 2023 (T348923: Switch Property that we use for determining available language codes).
The change was intended to make sure people use IETF language tags instead of ISO 639-1 codes, ensuring larger coverage. This change is only meaningful from the point of view of data modelling. It doesn't seem to affect UX in any way, since the codes are eventually validated against a separately configured list anyway.

On Wikibase Cloud, users must still use P218 in their manifest to enable this functionality (this is also prescribed in our docs).
https://github.com/wbstack/mediawiki/blob/9070baab128ffdb7d08dd791b088d27286aa92c5/dist-persist/wbstack/src/Settings/LocalSettings.php#L675

This creates an inconsistency with Wikidata: users from Wikidata who are used to working with Lexemes using P305 are now expected to map their properties to P218.

Task

We see the following ways to proceed:

  • Do nothing
    • This would leave Cloud inconsistent with Wikidata
  • Update the expected property code from P218 to P305
    • The users who already mapped their property and created statements in the language Items based on it will have to remap the property and rewrite all those language statements.
    • Optionally, we can try to write a script that makes the update.
  • Allow both properties to be mapped to Wikidata
    • Creates a slight inconsistency with Wikidata, but we avoid affecting the existing users.
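
As a rough idea of what an update script for the second option could do on the manifest side, here is a minimal sketch. It assumes the stored wikibaseManifestEquivEntities value is a JSON object with a "properties" map from Wikidata property IDs to local property IDs, which is what the SQL query in this task suggests. The function name is hypothetical, and the sketch does not touch any statements on language Items.

```python
import json


def remap_equiv_property(raw_value: str, old: str = "P218", new: str = "P305") -> str:
    """Move the local property mapped to `old` so that it is mapped to `new`.

    raw_value is the JSON text of the manifest setting (assumed shape:
    {"properties": {"<Wikidata ID>": "<local ID>", ...}, ...}).
    The value is returned unchanged in meaning if `old` is not mapped,
    or if `new` is already mapped to something.
    """
    manifest = json.loads(raw_value)
    properties = manifest.get("properties", {})
    if old in properties and new not in properties:
        properties[new] = properties.pop(old)
    return json.dumps(manifest)
```

Leaving an existing P305 mapping untouched is a deliberate choice in this sketch: in that situation someone should probably look at the instance by hand rather than have a script pick a winner.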

Currently there are 9 non-deleted instances that configured this special property and mapped it to P218.

-- Non-deleted wikis whose equivalent-entities manifest mentions P218,
-- together with the local property that P218 is mapped to.
SELECT
  ws.wiki_id,
  w.domain,
  JSON_UNQUOTE(JSON_EXTRACT(ws.value, '$.properties.P218')) AS p218_mapped_to
FROM wiki_settings ws
JOIN wikis w ON w.id = ws.wiki_id
WHERE ws.name = 'wikibaseManifestEquivEntities'
  AND ws.value LIKE '%"P218"%'
  AND w.deleted_at IS NULL;

Only 4 of them used it in a statement, for a total of 15 statements:
https://furry.wikibase.cloud/ - 1 statement
https://riga-literata.wikibase.cloud/ - 6 statements
https://data.r74n.com/ - 6 statements
https://memory-prime.wikibase.cloud - 2 statements

Based on this, we believe that we should update the expected property from P218 to P305 and inform the owners of the 4 affected instances that they should update their mapping and statements.

Acceptance Criteria

  • The variable $wgLexemeLanguageCodePropertyId is assigned to P305 in LocalSettings.php of all Cloud instances.
  • The FAQ documentation is updated.
  • We have reached out to the owners of the affected instances and informed them about the change.

Event Timeline

Tarrow renamed this task from "Change $wgLexemeLanguageCodePropertyId from P218 to P305" to "Change $wgLexemeLanguageCodePropertyId from Wikidata mapping P218 to P305". Mon, Jan 26, 1:50 PM

There are 10 non-deleted instances that have a P218 mapping.

https://furry.wikibase.cloud/ - 1 statement
https://riga-literata.wikibase.cloud/ - 6 statements
https://data.r74n.com/ - 6 statements
https://enote-playground.wikibase.cloud - 0 statements
https://bodhitestwiki.wikibase.cloud/ - 0 statements
https://fuzzy-sl.wikibase.cloud/ - 0 statements
https://ontolagoon.wikibase.cloud - 0 statements
https://memory-prime.wikibase.cloud - 2 statements
https://recordsincontexts.wikibase.cloud - 0 statements

Example of the SPARQL query I used to find this info:

PREFIX wdt:  <https://recordsincontexts.wikibase.cloud/prop/direct/>
PREFIX p:    <https://recordsincontexts.wikibase.cloud/prop/>
PREFIX ps:   <https://recordsincontexts.wikibase.cloud/prop/statement/>

SELECT ?item ?value ?statement WHERE {
  ?item wdt:P275 ?value .
  ?item p:P275 ?statement .
  ?statement ps:P275 ?value .
}
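
Since the same query has to be repeated per instance with a different domain and property ID, a small helper along these lines could generate it. This is only a sketch: the function name is made up, and it assumes that every instance uses the same /prop/ URI layout as the example above and that P275 in that example is the instance's local property for the language code.

```python
def build_statement_query(domain: str, property_id: str) -> str:
    """Build the statement-listing SPARQL query for one instance.

    domain is the wiki's host name, property_id the local property ID
    (both supplied by the caller; nothing is looked up here).
    """
    base = f"https://{domain}/prop"
    lines = [
        f"PREFIX wdt: <{base}/direct/>",
        f"PREFIX p: <{base}/>",
        f"PREFIX ps: <{base}/statement/>",
        "",
        "SELECT ?item ?value ?statement WHERE {",
        f"  ?item wdt:{property_id} ?value .",
        f"  ?item p:{property_id} ?statement .",
        f"  ?statement ps:{property_id} ?value .",
        "}",
    ]
    return "\n".join(lines)
```

Calling build_statement_query("recordsincontexts.wikibase.cloud", "P275") should reproduce the query shown above, apart from whitespace in the PREFIX lines.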