Investigate where and how \Wikibase\Lib\LanguageFallbackChain is used
Closed, Resolved, Public

Description

Background: While researching T250930: Wikibase receiving ⧼Lang⧽ from uselang parameter and using it everywhere (related: T247057), it became clear that our handling of languages is suboptimal. One far-reaching way to tackle that would be to introduce a TermLanguage class (and TermLanguageFactory) whose instances are guaranteed to represent a valid Term language. However, that would be a very fundamental change, of a magnitude that would require an ADR. Further, most of our problems in that regard seem to stem from "invalid" languages that are provided by our own \Wikibase\Lib\LanguageFallbackChain, created by our \Wikibase\Lib\LanguageFallbackChainFactory.

So it might be possible to improve this situation by taking the following actions:

  • \Wikibase\Lib\LanguageFallbackChain and \Wikibase\Lib\LanguageFallbackChainFactory should be renamed to TermLanguageFallbackChain and TermLanguageFallbackChainFactory to reduce confusion with MediaWiki's LanguageFallbackChain for interface messages.
  • The factory should contain validation to ensure that the languages are only valid Term languages.
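
A minimal sketch of the second action, written in Python as a language-agnostic illustration rather than in Wikibase's PHP. The class name is borrowed from the proposed rename, but the constructor, the new_chain method, and the set of valid codes are all hypothetical and do not reflect the real factory's API. It only shows the idea of filtering a requested chain against a known set of valid Term languages.

```python
from typing import Iterable, List


class TermLanguageFallbackChainFactory:
    """Hypothetical factory for illustration; not the real Wikibase class."""

    def __init__(self, valid_term_languages: Iterable[str]) -> None:
        # Assumption: the set of valid Term languages is known up front.
        self._valid = frozenset(valid_term_languages)

    def new_chain(self, requested: Iterable[str]) -> List[str]:
        """Return the requested codes in order, dropping invalid and repeated ones."""
        chain: List[str] = []
        for code in requested:
            if code in self._valid and code not in chain:
                chain.append(code)
        return chain


# Assumed set of valid Term languages, for illustration only.
factory = TermLanguageFallbackChainFactory({"en", "de", "de-at"})
chain = factory.new_chain(["⧼Lang⧽", "de-at", "de", "en"])
print(chain)
```

Whether invalid codes should be dropped silently, logged, or rejected with an error is a design decision that this sketch does not settle.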

However, that is based on the following assumption:

  • \Wikibase\Lib\LanguageFallbackChain is only used for Terms

It is necessary to investigate whether that assumption is correct and whether the actions above can be taken.

One place where these fallback chains seem to be used a lot, at least intermittently, is \ValueFormatters\FormatterOptions. There is a separate investigation into how that class is used and whether it can be improved upon: T256407: Investigate where and with which keys FormatterOptions are used

(Looking into JS Term languages is not part of this investigation)

Event Timeline

The first approach to investigating this was to look at what happens to tests when validation is introduced. The following problem was observed:

  • Some tests in Lexeme seem to use qqx to verify that the correct terms are added to messages as parameters. For that, they set mock labels/descriptions for qqx even though qqx is not a valid term language

This seems quite solvable.

Otherwise, this seems fine as far as testing goes. All the Selenium tests pass, but that is probably because we just aren't testing this failure mode.

Thus, some manual checking is going to be necessary. That will take time, as my IDE currently lists 100 usages in production code in the extensions that I have checked out.

Checked the following extensions for their usage of LanguageFallbackChain and they look fine:

  • MediaInfo
  • WikibaseCirrusSearch
  • WikibaseLexeme
  • Wikibase view Termbox Renderer
  • Wikibase view in general
  • Wikibase repo, aside from the exception mentioned below
  • Wikibase lib
  • Wikibase client looks good as well, as far as usage is concerned

Non-compliant usages that I've found:

  • \Wikibase\Repo\Specials\SpecialMyLanguageFallbackChain

Care to elaborate on this non-compliant usage a little more?

"Non-compliant" is maybe not the best wording here. The point is that this fallback chain usually starts with the current language. If validation removed that language because it is not a valid term language, the result might be unexpected.

E.g. the following would change: https://www.wikidata.org/wiki/Special:MyLanguageFallbackChain?uselang=%E2%A7%BCLang%E2%A7%BD

OTOH: There already seems to be a class of invalid languages where that is the case anyway: https://www.wikidata.org/wiki/Special:MyLanguageFallbackChain?uselang=Lang:
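
The concern about the chain losing its first entry can be sketched as follows. This is a language-agnostic Python illustration, not Wikibase code: build_chain, its parameters, the set of valid codes, and the assumption that "en" is the fallback for an unknown language are all hypothetical.

```python
from typing import List, Set

# Assumed data, for illustration only.
VALID_TERM_LANGUAGES: Set[str] = {"en", "de"}


def build_chain(user_language: str, fallbacks: List[str], validate: bool) -> List[str]:
    """Put the user's language first, then its fallbacks; optionally drop invalid codes."""
    chain = [user_language] + [code for code in fallbacks if code != user_language]
    if validate:
        chain = [code for code in chain if code in VALID_TERM_LANGUAGES]
    return chain


# Without validation the chain starts with whatever uselang supplied.
unvalidated = build_chain("⧼Lang⧽", ["en"], validate=False)
# With validation the invalid first entry disappears and the chain starts elsewhere.
validated = build_chain("⧼Lang⧽", ["en"], validate=True)
print(unvalidated)
print(validated)
```

With validation enabled, the special page would no longer list the requested language first, which is the visible change discussed above.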