
[GOAL] Add termbox language code mul to reduce redundancy in Wikidata Labels and Aliases
Closed, Resolved, Public, Goal

Description

Milestones:

User story:
As a Wikidata editor,
I want to avoid repeating identical labels in hundreds of languages
in order to reduce the amount of redundant content that needs to be maintained on Wikidata.

Problem:
We have many labels that are in principle identical across different languages (see examples section). This has some bad consequences:

  • editors have to create and maintain redundant content (copying the same thing to most/all languages creates massive amounts of edits and is a huge waste of resources)
  • redundant information has to be stored, which burdens our systems (e.g. the Query Service)

Solution:
Introduce a new language code that all languages fall back to. This will be particularly helpful for Unicode characters, scientific articles, and codes, as well as for names in Latin script (as we do not yet have an elaborate fallback system for that script). We will test whether this solution (only one new language code) is good enough, or whether we need more specific language codes after all to model a useful fallback chain.

This task

  • Add "mul" as a new monolingual language code.
  • Have other languages fall back to it (Translatewiki fallback chain > "mul" > "en")

Community takes over

  • Community creates guidelines and help pages on how to use the new code, e.g.
    • What if one Latin-script language prefers one form (e.g. "Philip L. Brown") and another Latin-script language prefers another form (e.g. "Philip Larry Brown" or "Philip Brown")?
    • In what cases should the Latin-script label be used for "mul" instead of the native label (while still making sure that re-users can identify the native label via a property)?
    • etc.
  • Community gives feedback after some months about how the new code and guidelines work
    • Based on the feedback we might iterate on the approach if necessary.

Ideas for the future

  • start to show a warning if someone wants to add the mul-label in a different language
  • include the experience in a possible future solution for multilingual descriptions (Abstract Descriptions)
  • re-evaluate if the final fallback to “en” is still appropriate

Mockup:

[Mockup image: image.png]

Examples:
This will be useful in many different places:

Names

Unicode characters

Codes

Scientific articles

Translatewiki fallback chain:

Examples:
ami > zh-tw, zh-hant, zh-hans
zh-tw > zh-hant, zh-hans
zh-hant > zh-hans
zh-hans > []

de-at > de
de > []

en-gb > en
en > []

Hard-coded fallback chain:

old

  • Translatewiki fallback chain > "en"

new

  • Translatewiki fallback chain > "mul" > "en"
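
The old and new chains can be illustrated with a minimal Python sketch. It is not the Wikibase implementation: the function names `fallback_chain` and `resolve_label`, the `use_mul` flag, and the assumption that the requested language itself is tried first are all hypothetical. The fallback data is copied from the Translatewiki examples in this task.

```python
# Explicit Translatewiki-style fallbacks, copied from the examples in this task.
TRANSLATEWIKI_FALLBACKS = {
    "ami": ["zh-tw", "zh-hant", "zh-hans"],
    "zh-tw": ["zh-hant", "zh-hans"],
    "zh-hant": ["zh-hans"],
    "zh-hans": [],
    "de-at": ["de"],
    "de": [],
    "en-gb": ["en"],
    "en": [],
}


def fallback_chain(lang, use_mul=True):
    """Return the language codes to try, in order (hypothetical helper).

    Assumption: the requested language is tried first, then its
    Translatewiki fallbacks, then the hard-coded tail
    ("mul" > "en" for the new chain, only "en" for the old one).
    """
    chain = [lang] + TRANSLATEWIKI_FALLBACKS.get(lang, [])
    tail = ["mul", "en"] if use_mul else ["en"]
    for code in tail:
        if code not in chain:  # avoid listing a code twice
            chain.append(code)
    return chain


def resolve_label(labels, lang, use_mul=True):
    """Return (code, label) for the first code in the chain that has a label."""
    for code in fallback_chain(lang, use_mul):
        if code in labels:
            return code, labels[code]
    return None


if __name__ == "__main__":
    labels = {"mul": "Philip Brown"}  # a single label instead of hundreds of copies
    print(fallback_chain("de-at"))                        # new chain
    print(fallback_chain("de-at", use_mul=False))         # old chain
    print(resolve_label(labels, "de-at"))                 # found via "mul"
    print(resolve_label(labels, "de-at", use_mul=False))  # not found without "mul"
```

Under these assumptions, an item that carries only a "mul" label is displayable in every language with the new chain, while the old chain finds nothing unless the label is also copied to "en" or to a language in the Translatewiki chain.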

Community communication:

  • The interested Community needs to be aware of the new code and of the necessity to create guidelines and help pages on how to use it.
  • We need to be available for the Community when they create guidelines and to collect feedback.

Original:
This task is to add support for a "mul" language code for labels and aliases. For any benefits of this code to be properly reaped, all language codes should ultimately fall back to "mul"—which I believe would be achieved by adding it as a fallback for the "en" code.

(If it is more desirable, codes for "mul-latn", "mul-cyrl", etc. could be created, in which case e.g. only those codes using the Latin script would fall back to "mul-latn".)

Possibly related tasks: T258242 T256003 T43807

Event Timeline

Can you explain the remainder of your deletions? Above what you deleted from the description.

@Esc3300 For clarification: Lucas and I spent a lot of time yesterday on getting everything to a point where we believe it is sensible and the remaining questions are clarified. It'd be good to concentrate the discussion on those remaining points now because otherwise we cannot move this forward. As there is a strong desire from several editors to get this done, I want to push this to the point where we can actually pick it up.

please refrain from editing the task description while the discussion is ongoing. It is inappropriate to masquerade your personal opinion as the hard-won consensus that we are trying to achieve here.

This message (especially the latter part) is enough of a reason to undo the changes made to the ticket since Nikki's comments were added, irrespective of what opinions I may hold of any of it (which should not be assumed as was done in the diff that stains this task). The sea lion I am removing from this ticket is also free to impugn Lucas's or Lydia's credibility or emotional strength as well.

@Esc3300 For clarification: Lucas and I spent a lot of time yesterday on getting everything to a point where we believe it is sensible and the remaining questions are clarified. It'd be good to concentrate the discussion on those remaining points now

Ok. What's the proposal for the various points in how it may backfire? And finally which script do you want to start with?

please refrain from editing the task description while the discussion is ongoing. It is inappropriate to masquerade your personal opinion as the hard-won consensus that we are trying to achieve here.

This message (especially the latter part) is enough of a reason

@Mahir Can you explain which parts the latter part covers? If not, please refrain from making such comments in Phabricator or elsewhere.

For those who would like a clarification,

please refrain from editing the task description while the discussion is ongoing.

this is the former part of Lucas's message

It is inappropriate to masquerade your personal opinion as the hard-won consensus that we are trying to achieve here.

and this is the latter part.

(More on the "sea lion" term.)

Apparently there is a disagreement between Lucas and his manager about description editing.

Can you at least explain which parts you consider my personal opinion and which ones are not supported by a consensus (ideally with a link to the relevant discussion)?

There is no disagreement.
We are spending a lot of time discussing things that currently don't move this forward and do not help get to a meaningful consensus. So one final try. We need input on the final remaining discussion points as I laid out in T285156#7384455. Let's please concentrate on those now so that we can then update the task description once we heard everyone.

If this is the only open point, can you summarize how the open points mentioned in the task description have been addressed?

Sure.

  • Could this solution somehow backfire? -> several answers in this thread that we will weigh and see if they warrant any action
  • What are all the mul-<script> codes that we should start with? -> none, we are just going with mul for now as I said in my comment
  • How exactly should the fallback chain for these mul codes work? -> no fallback within the mul codes because we only have one. Fallback to and from other languages is in my remaining questions

Can you propose something?

Step #3 mentions constraints. What will they be?

I understand that you are keen to get this done, but compared to other new language codes, we are still moving quite fast. I think none of us wants this to end up in a dead end.

Thank you all for your input on this! We will put this in development right after the no deploy weeks. Special thanks go to @Nikki and @Mahir256, for driving and enlightening this issue, and to @Amire80 and @Epidosis, for your valuable input!

@Esc3300: You also gave helpful input and we appreciate the effort! At the same time, your style of engagement and your continued disagreement with the direction that we took in the deliberation seem to have ultimately led to some demotivating arguments and loops in the discussion. I am sad to see that all of this resulted in a bad climate and a frustrating experience for some of the discussion's participants. It is essential for Lydia and me that - especially for hard decisions like these - we still maintain an open and welcoming climate for all people involved, as well as a worthwhile and productive discussion. This is why we would like to ask you for your help in fostering more open and welcoming discussions that respect our process in the future.

Manuel changed the subtype of this task from "Task" to "Goal". Jul 14 2022, 12:47 PM
Manuel renamed this task from Add termbox language code mul to Add termbox language code mul to reduce redundancy in Wikidata Labels and Aliases. Jul 14 2022, 12:49 PM
Manuel updated the task description.
Manuel renamed this task from Add termbox language code mul to reduce redundancy in Wikidata Labels and Aliases to [GOAL] Add termbox language code mul to reduce redundancy in Wikidata Labels and Aliases. Feb 22 2023, 11:53 AM
Arian_Bozorg claimed this task.