[Story] Language fallback for Lua
Closed, Resolved. Public

Description

When a label is not available in the language of the wiki that uses a Lua function, a fallback should be applied: the label in an appropriate other language should be shown.

Event Timeline

Lydia_Pintscher updated the task description.
Lydia_Pintscher raised the priority of this task from to Normal.
Lydia_Pintscher changed Security from none to None.
Lydia_Pintscher removed a subscriber: Unknown Object (MLST).
aude added a subscriber: aude. Dec 9 2014, 4:11 AM

This is already done with mw.wikibase.getEntity, but at the moment only for the content language, and only if it is a variant language.

https://gerrit.wikimedia.org/r/#/c/178422/ fixes this for the mw.wikibase.label convenience method for variant languages (this worked before).

As with the parser function, I don't know what kind of fallback we could further apply (user-specific stuff is out of the question, due to caching). For mw.wikibase.getEntityObject (or whatever else provides full access to the entire entity data), every language code => label (or description) entry could have fallback applied by default for the value. We might / probably want to provide additional information in the serialization structure that says what the source language is. Also, a given target language (e.g. zh-tw) might have multiple possible fallbacks (one being preferred, but others also valid parts of the chain).

For mw.wikibase.getEntityObject (or whatever else provides full access to the entire entity data), maybe we also want to provide an option or method where fallback is not applied and the user gets the explicit labels / descriptions that are actually set for the languages.
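The idea in the two paragraphs above can be sketched roughly as follows. This is a minimal Python model, not Wikibase code: `ResolvedTerm`, `resolve_term`, the `use_fallback` switch and the contents of `FALLBACK_CHAINS` are all hypothetical names and data chosen for illustration. It shows a lookup that reports the source language of the value it returns, plus an opt-out that returns only the label explicitly set for the requested language.

```python
from typing import Dict, List, NamedTuple, Optional


class ResolvedTerm(NamedTuple):
    value: str            # the label or description text
    language: str         # the language that was requested
    source_language: str  # the language the value actually came from


# Hypothetical chains for illustration only; the real chains come from
# MediaWiki's language data and may differ.
FALLBACK_CHAINS: Dict[str, List[str]] = {
    "zh-tw": ["zh-tw", "zh-hant", "zh", "zh-hans"],
    "ku-arab": ["ku-arab", "ku"],
}


def resolve_term(
    terms: Dict[str, str], target: str, use_fallback: bool = True
) -> Optional[ResolvedTerm]:
    """Return the first term found along the chain, or None if nothing matches."""
    chain = FALLBACK_CHAINS.get(target, [target]) if use_fallback else [target]
    for code in chain:
        if code in terms:
            return ResolvedTerm(terms[code], target, code)
    return None
```

With labels set only for `zh` and `en`, `resolve_term(labels, "zh-tw")` would return the `zh` value with `source_language` set to `"zh"`, while `resolve_term(labels, "zh-tw", use_fallback=False)` would return `None` because no `zh-tw` label is explicitly set.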

aude added a comment. Feb 10 2015, 1:48 AM

I looked further into how fallback works in Lua and found that the content language is always preferred (if a label is present) over any variants, regardless of whether the variants also have labels and a variant is requested.

E.g. when requesting a client page with variant=ku-arab (with ku-arab and ku labels on the connected item), mw.wikibase.label gives me the ku label, transliterated.

If I use variant=ku, have no ku label but have ku-arab, then I get ku-arab label + transliteration when using mw.wikibase.label.

Given the way the parser cache works and the comments in Scribunto_LuaWikibaseLibrary, the current behavior might be a necessity and what we want, unless we want to further split the parser cache by variant language.

// For the language we need $wgContLang, not parser target language or anything else.
// See Scribunto_LuaLanguageLibrary::getContLangCode().

aude added a comment. Feb 10 2015, 1:52 AM

To clarify, I have to request variant=ku-latn to get the Arabic -> Latin transliteration. ku shows mixed scripts (no conversion).

aude added a comment. Feb 10 2015, 2:49 AM

Looks like Lua and the parser function handle the preferred fallback / variant language differently when there are labels in all or multiple variants.

With the following, if the referenced item has labels in both ku and ku-arab and I use variant=ku-arab:

{{#property:P2}}

{{#invoke:Wikidata|formatStatements|property=P2}}

I get:

  • ku-arab label for parser function
  • transliterated ku label for lua.

https://gist.github.com/filbertkm/896ebff99c03f1744375 (my Wikidata module, sourced from ruwiki lua module)

also looks like the parser cache is maintaining the variant label differences in the client page for ku-arab vs. ku-latn vs. ku, as there is some split.
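The difference reported in this comment can be modelled as two lookup orders. The sketch below is a simplified Python model and only an assumption about the selection step: the function names, the `VARIANTS` table and the placeholder label strings are hypothetical, and the transliteration that Lua output goes through afterwards is not modelled.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical variant table for illustration only.
VARIANTS: Dict[str, List[str]] = {"ku": ["ku-latn", "ku-arab"]}


def pick_first(labels: Dict[str, str], order: List[str]) -> Optional[Tuple[str, str]]:
    """Return (language code, label) for the first code in order that has a label."""
    for code in order:
        if code in labels:
            return code, labels[code]
    return None


def parser_function_pick(labels, content_lang, variant):
    # Observed for {{#property:...}}: the requested variant wins if it has a label.
    return pick_first(labels, [variant, content_lang] + VARIANTS.get(content_lang, []))


def lua_pick(labels, content_lang):
    # Observed for Lua: the content language wins whenever it has a label;
    # variants are consulted only if it has none. The requested variant
    # affects later script conversion, not which label is selected.
    return pick_first(labels, [content_lang] + VARIANTS.get(content_lang, []))
```

With placeholder labels for both `ku` and `ku-arab`, `parser_function_pick(labels, "ku", "ku-arab")` selects the `ku-arab` entry while `lua_pick(labels, "ku")` selects the `ku` entry, which matches the two bullet points above.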

Jonas renamed this task from language fallback for Lua to [Story] Language fallback for Lua. Sep 10 2015, 3:46 PM
hoo added a subscriber: hoo. Oct 2 2015, 5:38 PM

Confirmed that this is still an issue.

Lucie added a subscriber: Lucie. May 17 2016, 10:13 AM

This is apparently still an issue

Lucie assigned this task to hoo. May 27 2016, 12:04 PM
hoo added a comment. May 27 2016, 2:56 PM

I just investigated this a little:

We use LanguageFallbackChainFactory::FALLBACK_SELF | LanguageFallbackChainFactory::FALLBACK_VARIANTS everywhere except in mw.wikibase.getEntity():formatPropertyValues( … ), which uses the default (fallback to all fallback languages) because the field is omitted from the FormatterOptions passed to it in Scribunto_LuaWikibaseEntityLibrary::newImplementation. I presume this happened by accident.

Next steps:

  • Use LanguageFallbackChainFactory::FALLBACK_ALL for everything (in Lua / the parser function), but create a setting to disable that.
  • Switch Scribunto_LuaWikibaseEntityLibrary::newImplementation to not use the default, but explicitly supply the fallback chain to use.
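To illustrate what switching from FALLBACK_SELF | FALLBACK_VARIANTS to FALLBACK_ALL changes, here is a small Python model of flag-driven chain building. Only the names FALLBACK_SELF, FALLBACK_VARIANTS and FALLBACK_ALL come from this task; the `OTHERS` flag, the numeric values, `build_chain` and both lookup tables are assumptions made for the sketch and are not taken from LanguageFallbackChainFactory.

```python
from enum import IntFlag
from typing import Dict, List


class Fallback(IntFlag):
    SELF = 1      # the requested language itself
    VARIANTS = 2  # variants related to the requested language
    OTHERS = 4    # assumed name: remaining, non-variant fallback languages
    ALL = SELF | VARIANTS | OTHERS


# Hypothetical data for illustration only.
VARIANT_FALLBACKS: Dict[str, List[str]] = {"ku-arab": ["ku"]}
OTHER_FALLBACKS: Dict[str, List[str]] = {"ku-arab": ["en"]}


def build_chain(lang: str, mode: Fallback) -> List[str]:
    """Assemble the ordered list of languages to try for the given mode."""
    chain: List[str] = []
    if mode & Fallback.SELF:
        chain.append(lang)
    if mode & Fallback.VARIANTS:
        chain.extend(VARIANT_FALLBACKS.get(lang, []))
    if mode & Fallback.OTHERS:
        chain.extend(OTHER_FALLBACKS.get(lang, []))
    return chain
```

Under these assumptions, `build_chain("ku-arab", Fallback.SELF | Fallback.VARIANTS)` stops at the variants, while `build_chain("ku-arab", Fallback.ALL)` also appends the non-variant fallbacks, so a label can still be found when neither the language nor its variants have one.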

Change 291502 had a related patch set uploaded (by Hoo man):
Use FALLBACK_ALL for all data access functionality

https://gerrit.wikimedia.org/r/291502

hoo raised the priority of this task from Normal to High. May 28 2016, 1:38 PM

Change 291502 merged by jenkins-bot:
Use FALLBACK_ALL for all data access functionality

https://gerrit.wikimedia.org/r/291502

hoo closed this task as Resolved. Jun 6 2016, 3:01 PM
hoo removed a project: Patch-For-Review.