[Community] Ideas to support a smooth change to `mul`
Closed, Resolved (Public)

Description

Situation:
We plan to implement the new language code mul on Wikidata, mainly to reduce the number of redundant Labels (main task: T285156, next milestones: T312097). Having fewer Labels in Items might have unintended consequences for some tools. Some consequences seem unavoidable (e.g. the number of Labels will become a less useful indicator for ranking Items), but we want to ensure that the change is as light as possible on users and maintainers of tools.

Community Input:
This task collects and discusses community input about

  • what this change could mean for tool developers, data reusers, and editors
  • ideas for what we can do to make the change as easy as possible

Outcomes:

  • We need to make sure that tool developers are aware of the change.
    • e.g. scripts shouldn't add the same label multiple times if an identical "mul" value is already there
  • Tools that use wbgetentities: Should just work using languagefallback=1 (see T312723#8214779).
  • Screen readers: We use <span lang="mul"> to indicate text in mul language code. This is common practice on the Wikimedia projects and seemed satisfactory in our tests. An alternative would have been to omit this completely. We will continuously monitor feedback on this.
  • Wikidata Query Service (WDQS): Should just work after some tweaks (see T304976)
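
To illustrate the point above about scripts not re-adding a label that an identical "mul" value already covers, here is a minimal sketch. The function name needs_language_label and the plain dict of language code to label text are hypothetical and only serve the illustration; a real tool would read the labels from whatever client library it already uses.

```python
def needs_language_label(labels: dict, lang: str, value: str) -> bool:
    """Return True if a script should still add `value` as the `lang` label.

    `labels` is a hypothetical plain mapping of language code -> label text
    for one Item; it is not a Wikibase API structure.
    """
    # The exact label already exists in the target language.
    if labels.get(lang) == value:
        return False
    # An identical "mul" label already covers this language via fallback.
    if lang != "mul" and labels.get("mul") == value:
        return False
    return True


if __name__ == "__main__":
    item_labels = {"mul": "John Doe II"}
    # Identical to the mul label, so adding it would be redundant.
    print(needs_language_label(item_labels, "en", "John Doe II"))
    # A genuinely different label in a specific language is still worth adding.
    print(needs_language_label(item_labels, "ja", "ジョン・ドウ2世"))
```

The check is deliberately strict (exact string equality); whether near-duplicates should also be skipped is a policy question this sketch does not try to answer.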

Original:
There were questions about this initiative at the Wikidata Data Quality Days 2022.

Event Timeline

Subscribing JeanFred for inteGraality

Hi @VIGNERON and @JeanFred, great to meet you at the Data Quality Days! \o/

I am glad that you brought mul up at the event as I am currently working on this: Starting next week or so we will fix a few issues with our current prototype on Test (version 0.2, see T312097). You mentioned that https://www.wikidata.org/wiki/Wikidata:Tools/inteGraality or any tool related to labels might suffer from a switch to mul. Could you explain a bit more? What would be the best way to discuss this?

Another comment: how will the Wikidata Query Service (WDQS) handle mul?

For instance, could it be integrated into the [AUTO_LANGUAGE] of the SERVICE wikibase:label? (Right now, this is only the interface language of WDQS, but it would be useful and interesting to also integrate mul, and it would make mul more discoverable for WDQS users.)

Hi @VIGNERON, yes! I already did some digging on WDQS, and fortunately, it seems that there is a good solution: T304976: MUL - Update WDQS documentation to include mul labels
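
As a rough sketch of what a query author could try in the meantime: the label service takes a comma-separated language list, so mul could be listed next to [AUTO_LANGUAGE]. That mul is accepted there like any other language code is an assumption, not something confirmed in this thread, and the helper name label_service_clause is hypothetical.

```python
def label_service_clause(languages=("[AUTO_LANGUAGE]", "mul", "en")) -> str:
    """Build a SERVICE wikibase:label clause for a SPARQL query.

    Assumption: "mul" can be listed in wikibase:language like any other
    language code; languages are tried in the order given.
    """
    lang_list = ",".join(languages)
    return (
        "SERVICE wikibase:label { "
        f'bd:serviceParam wikibase:language "{lang_list}". '
        "}"
    )


if __name__ == "__main__":
    query = (
        "SELECT ?item ?itemLabel WHERE {\n"
        "  ?item wdt:P31 wd:Q5 .\n"
        f"  {label_service_clause()}\n"
        "}\n"
        "LIMIT 10"
    )
    print(query)
```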

For tools / reusers that use wbgetentities with languages= and languagefallback=1, getting (mul) labels should “just work” – example:

https://test.wikidata.org/w/api.php?action=wbgetentities&format=json&ids=Q57591&props=labels&languages=en&languagefallback=1

{
    "entities": {
        "Q57591": {
            "type": "item",
            "id": "Q57591",
            "labels": {
                "en": {
                    "value": "John Doe II",
                    "language": "mul",
                    "for-language": "en"
                }
            }
        }
    },
    "success": 1
}

I.e., if an en label is removed in favor of a mul label, the label would still appear in the API response.
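
For tool maintainers, a minimal Python sketch of consuming such a response might look like the following. The helper names build_label_url and resolve_label are hypothetical; the parameters and the response shape are taken from the example above, and no network request is made here.

```python
from urllib.parse import urlencode

API_URL = "https://test.wikidata.org/w/api.php"


def build_label_url(entity_id: str, lang: str) -> str:
    """Build a wbgetentities URL that asks for labels with language fallback."""
    params = {
        "action": "wbgetentities",
        "format": "json",
        "ids": entity_id,
        "props": "labels",
        "languages": lang,
        "languagefallback": 1,
    }
    return f"{API_URL}?{urlencode(params)}"


def resolve_label(response: dict, entity_id: str, lang: str):
    """Return (label text, language the text actually came from), or None."""
    entry = (
        response.get("entities", {})
        .get(entity_id, {})
        .get("labels", {})
        .get(lang)
    )
    if entry is None:
        return None
    # "language" holds the source language, e.g. "mul" after fallback.
    return entry["value"], entry["language"]


# The response from the example above, reduced to the fields used here.
SAMPLE_RESPONSE = {
    "entities": {
        "Q57591": {
            "type": "item",
            "id": "Q57591",
            "labels": {
                "en": {
                    "value": "John Doe II",
                    "language": "mul",
                    "for-language": "en",
                }
            },
        }
    },
    "success": 1,
}

if __name__ == "__main__":
    print(build_label_url("Q57591", "en"))
    print(resolve_label(SAMPLE_RESPONSE, "Q57591", "en"))
```

A tool that only reads the "value" field needs no change at all; the "language" field matters only if the tool wants to show or record that the label came from mul.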

Manuel renamed this task from Investigate how to make the new mul labels as easy on existing tools as possible to Investigate how to make the change to the new mul labels easy on users, reusers and tools developers. Sep 12 2022, 7:22 AM
Manuel updated the task description.
Manuel renamed this task from Investigate how to make the change to the new mul labels easy on users, reusers and tools developers to [Community Input] How can we make the change to the new `mul` labels as easy as possible. Feb 14 2023, 12:43 PM
Manuel updated the task description.
Manuel renamed this task from [Community Input] How can we make the change to the new `mul` labels as easy as possible to [Community] Ideas to support a smooth change to `mul`. Feb 15 2023, 6:24 PM