
Move `getLanguageFallbackChainFactory` from `GenericServices`
Closed, Invalid · Public

Description

As already noted in T259783, this service is not generic, as it should rely on the wiki's content language. It should therefore be moved to SingleEntitySourceServices and MultipleEntitySourceServices, or to a more appropriate common place.

  • Service added to SingleEntitySourceServices
  • Service added to MultipleEntitySourceServices
  • The GenericServices LanguageFallbackChainFactory should no longer be used and should be removed.

This may end up fixing some or all of T259783: LanguageFallbackChain does not end in 'en' for language codes that are not a valid format (currently in the backlog).

Event Timeline

Addshore renamed this task from Move `getLanguageFallbackChain` from `GenericServices` to Move `getLanguageFallbackChainFactory` from `GenericServices`. Aug 17 2020, 9:34 AM
Addshore updated the task description.

Change 620689 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/Wikibase@master] Inject MediaWiki language services into LanguageFallbackChainFactory

https://gerrit.wikimedia.org/r/620689

Change 620689 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Inject MediaWiki language services into LanguageFallbackChainFactory

https://gerrit.wikimedia.org/r/620689

I think the current status of this task is that it’s completely unclear how it should be resolved. I’ll try to summarize.

There seem to be two potential sources of non-genericness in this service:

  1. The list of allowed / known language codes can vary by entity source, or indeed even by entity type. For instance, both Items and Lexemes may come from the same entity source (Wikidata), and yet Lexemes allow for some additional language codes that are not supported in Items.
  2. The final fallback language, hard-coded to 'en' in T259783, should maybe be the content language of the entity source wiki.

Neither a LanguageFallbackChainFactory nor a TermLanguageFallbackChain currently comes into much contact with an entity source, entity type, or entity ID. This makes them unlike many other services in MultipleEntitySourceServices, which often receive an entity ID as input and dispatch to the correct single entity source service based on it. At the same time, it seems that even an individual TermLanguageFallbackChain is currently often used with entities of different types.

Neither of the two sources of non-genericness mentioned above has much to do with the LanguageFallbackChainFactory itself, which is mainly concerned with looking up language codes and dealing with variants. It seems possible to leave LanguageFallbackChainFactory completely agnostic of entity sources and instead have it create TermLanguageFallbackChain objects that directly wrap a chain of languages, with no filtering at construction time. Both the allowed / known language codes and the final fallback language would then be provided whenever a TermLanguageFallbackChain is used, e.g. $chain->getFetchLanguageCodes( $contentLanguages ), possibly as part of the same object, by adding the final fallback language to the ContentLanguages interface. But that wouldn’t really fit into MultipleEntitySourceServices at all.
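
To make the idea concrete, here is a minimal standalone sketch of what filtering at use time could look like. Everything in it is an assumption for illustration: the `...Sketch` classes are hypothetical stand-ins rather than the real Wikibase classes, `getFinalFallbackLanguage()` is a made-up name for the proposed addition to ContentLanguages, and the language lists are invented examples.

```php
<?php
declare( strict_types = 1 );

// Hypothetical stand-in for ContentLanguages, extended with the final
// fallback language as proposed above. Not the real Wikibase interface.
interface ContentLanguagesSketch {
	public function hasLanguage( string $languageCode ): bool;

	public function getFinalFallbackLanguage(): string;
}

// Simple list-backed implementation, for illustration only.
final class StaticContentLanguagesSketch implements ContentLanguagesSketch {
	/** @var string[] */
	private array $languageCodes;
	private string $finalFallback;

	public function __construct( array $languageCodes, string $finalFallback ) {
		$this->languageCodes = $languageCodes;
		$this->finalFallback = $finalFallback;
	}

	public function hasLanguage( string $languageCode ): bool {
		return in_array( $languageCode, $this->languageCodes, true );
	}

	public function getFinalFallbackLanguage(): string {
		return $this->finalFallback;
	}
}

// Hypothetical chain that wraps the raw language codes unchanged:
// nothing is filtered when the chain is constructed.
final class TermLanguageFallbackChainSketch {
	/** @var string[] */
	private array $languageCodes;

	public function __construct( array $languageCodes ) {
		$this->languageCodes = $languageCodes;
	}

	/** @return string[] */
	public function getFetchLanguageCodes( ContentLanguagesSketch $contentLanguages ): array {
		// Keep only the codes that the caller-supplied languages allow.
		$codes = array_filter(
			$this->languageCodes,
			[ $contentLanguages, 'hasLanguage' ]
		);
		// The final fallback also comes from the caller, not from the chain.
		$codes[] = $contentLanguages->getFinalFallbackLanguage();
		// Drop duplicates and reindex so the result is a plain list.
		return array_values( array_unique( $codes ) );
	}
}

// Invented example data: one chain, two sets of allowed languages.
$chain = new TermLanguageFallbackChainSketch( [ 'de-at', 'de', 'mis' ] );
$itemLanguages = new StaticContentLanguagesSketch( [ 'de-at', 'de', 'en' ], 'en' );
$lexemeLanguages = new StaticContentLanguagesSketch( [ 'de-at', 'de', 'mis', 'en' ], 'en' );
```

With this shape, the same chain object can serve entities of different types: `$chain->getFetchLanguageCodes( $itemLanguages )` yields `[ 'de-at', 'de', 'en' ]`, while `$chain->getFetchLanguageCodes( $lexemeLanguages )` yields `[ 'de-at', 'de', 'mis', 'en' ]`. The factory never needs to know about entity sources or types in this sketch.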

Finally, I wonder if the list of allowed / known language codes should be related to the entity source at all. Additional term languages used to be defined in site config, but for T260118 we moved them into the Wikibase code. Defining additional term languages via $wgExtraLanguageNames still works, but we could say that it’s no longer really supported by Wikibase, and that the supported language codes for a certain entity type are always the same. (See also T220798.)