[ES-M2] Investigate ways to implement language fallback
Closed, Resolved | Public | 1 Estimated Story Points

Description

As per our technical decision to continue and extend the Entity Schema extension as an implementation separate from the Wikibase extension's codebase, we should decide how to approach the implementation of language fallback for Entity Schema page titles and other displayed denotations, such as search results.

Of the ways we can implement language fallback for Entity Schemata, three different options stand out:

  • Write an implementation similar to Wikibase's language fallback mechanism within the Entity Schema extension (ensuring that we also improve on the existing approach, rather than simply copying and pasting code).
  • Find a way to hook into specific points in the Wikibase language fallback mechanism itself, and register hook handlers in Entity Schema to enable such an implementation.
  • Extract the language fallback logic itself into a PHP package, for reuse within Entity Schema (and other potential future implementations).

This list is of course not exhaustive, and any additional approaches should be discussed as well. Most importantly, we would like to examine these approaches against the following points:

  1. What impact does each approach have on the general state of coupling between classes and code, both within and between the Entity Schema and Wikibase extensions?
  2. How does each approach affect our ability to make future modifications to the code per use case, when those are made necessary?
  3. How does each approach affect our capability to make autonomous changes to the Entity Schema (and any new resulting codebase) as a team?
  4. How does each approach affect our ability to onboard new engineers to the codebase?

Any additional points of comparison between approaches are welcome as additions, so long as we can apply the comparison to all examined approaches.

Acceptance Criteria

  • An investigation report is made available to the team, detailing the above information.

Event Timeline

ItamarWMDE renamed this task from [ES-M2] Investigate ways to implement wikibase language fallback to [ES-M2] Investigate ways to implement language fallback. Mar 14 2023, 1:27 PM
ItamarWMDE updated the task description.

Prio Notes:

  • Impact areas: Reusability, Modifiability, Analyzability
  • Does not affect production
  • Does affect development efforts
  • Does not directly affect onboarding efforts
  • Does affect additional stakeholders (Namely other LOD teams)

For my (still ongoing) work on the language fallback chain (around language code mul, see T312097) I found these links helpful:

Maybe they are helpful for you as well!

Sprint 6 Planning - Notes
Timebox = 1 week for investigation + 2-3 days for report creation
Suggestion: 2-3 people take on the investigation and then consolidate their findings into the report, in which case the 10 days would mean less time spent on investigation and more on report creation

@ItamarWMDE , was this an investigation you wanted to be directly working on?

When this ticket is picked up, please contact the Wikibase Product Platform team, to collaborate on the investigation.

Change 903663 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/EntitySchema@master] POC: Use LanguageFallbackChainFactory in EntitySchemaSlotViewRenderer

https://gerrit.wikimedia.org/r/903663

@ItamarWMDE , was this an investigation you wanted to be directly working on?

I think it would be great to look at it with @Michael; @noarave can join us after.

Alright. I had a quick look at the code and wrote down my rough thoughts for each of the questions in P46676. Let me know when you want me to share them.


EDIT 2023-04-25: My notes from the paste:

Whatever approach we take, it makes sense to encapsulate it behind a service in EntitySchema with a simple interface. That way, basically all the ES logic will be entirely unaffected.
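
To illustrate the encapsulation idea, here is a minimal, language-neutral sketch in Python (the extension itself is PHP). All names (`LabelLookup`, `ChainLabelLookup`) and the fallback chain values are hypothetical and exist only for illustration; they are not part of EntitySchema or Wikibase. The point is that calling code depends on one small interface, whichever approach ends up supplying the fallback chain.

```python
from typing import Mapping, Optional, Protocol, Sequence


class LabelLookup(Protocol):
    """Hypothetical ES-side interface: the only thing other ES code would see."""

    def get_label(self, labels: Mapping[str, str], language: str) -> Optional[str]:
        ...


class ChainLabelLookup:
    """Resolves a label by trying the requested language, then its fallbacks."""

    def __init__(self, chains: Mapping[str, Sequence[str]]) -> None:
        # Illustrative chains only; a real backend would supply these.
        self._chains = chains

    def get_label(self, labels: Mapping[str, str], language: str) -> Optional[str]:
        for code in [language, *self._chains.get(language, ())]:
            if code in labels:
                return labels[code]
        return None


# Assumed example data: "de-at" falls back to "de", then "en".
lookup: LabelLookup = ChainLabelLookup({"de-at": ["de", "en"]})
labels = {"de": "Mensch", "en": "human"}
result = lookup.get_label(labels, "de-at")
```

In this sketch the caller only ever handles `LabelLookup`, so the question of where the chain comes from stays behind the service boundary.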

| | Copy'n'Paste | Use existing directly | Extract to Library |
| --- | --- | --- | --- |
| Coupling | Coupling to Wikibase by using the WikibaseRepo::getTermsLanguages() service as opposed to WikibaseRepo::getLanguageFallbackChainFactory() service. | Coupling to Wikibase by using the WikibaseRepo::getLanguageFallbackChainFactory() service | Coupling to Wikibase by using the WikibaseRepo::getTermsLanguages() service as opposed to WikibaseRepo::getLanguageFallbackChainFactory() service. But other uses for that library could, in principle, use other languages as the base. |
| future modifications of fallback code | We will have to keep our "copy" of the code in sync with changes made to Wikibase Fallback that are also desired in other instances that are supposed to follow fallback behavior of Wikibase Terms. Though those changes are done by another team, and we have to 1) notice, 2) understand and 3) redo them. => multiple sources of truth for the identical behavior logic. | Will likely be done by another team, our behavior stays in sync automatically. | Will be done by another team. Depending on the way it is implemented, we either have to make sure to update our dependency on the library (git submodule), or have it done automatically (wikimedia vendor) |
| future modifications of ES | unaffected as fallback is encapsulated behind a ES service | unaffected as fallback is encapsulated behind a ES service | unaffected as fallback is encapsulated behind a ES service |
| onboarding | They will have to learn two (or more) possibly subtly diverging but very similar implementations. | If needed, they'll learn one implementation in the context where it was developed. | If needed, they'll learn one implementation in the encapsulated context of a library |

Looking over those options, it makes sense to me to start by using the Wikibase service directly to get the fallback chain. Given that we want "the fallback behavior of Item Terms", not just behavior that happens to be the same, this seems to be the best way forward. In the future, if there is ever a use case that needs the fallback logic but with a different set of languages, then extracting that code into a library makes sense. Though even then one has to consider whether we want to keep using the Wikibase service, given that we want the Wikibase fallback behavior, not merely behavior that is identical to it.

Looking at the above (great summary btw), I wonder if we should, under the "coupling" section, specify the possibility of not coupling to Wikibase to extract term languages. Maybe we could imagine finding a way to extract the Wikibase term language list to a location where it could be independently maintained and available for Wikibase as well as other extensions to use.

I understand this may be unreasonable as a trade-off, but I think accepting that we are bound to Wikibase for extracting the term languages in all approaches makes the question redundant (since, if we are already coupled with Wikibase to begin with, then using the existing mechanism is already the easiest).

We have this great comparison table thanks to @Michael (shown in the notes above). In addition to this:

  • It's possible to eliminate coupling with Wikibase by extracting term languages to a new package. (Thanks for mentioning this, @noarave.)
  • If we encapsulate the fallback behind a service interface, we can use the existing Wikibase mechanism directly right away. Once the language fallback is extracted to a new package, we can update our service without any side effects.
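
The "no side effects" claim in the last point can be sketched as follows, again in Python as a language-neutral illustration. Every name here (`SchemaLabelService`, `wikibase_backed_chain`, `package_backed_chain`, `render_title`) is hypothetical, and the chain values are invented. The two provider functions merely stand in for "ask the Wikibase service" and "ask a future extracted package"; neither calls any real API.

```python
from typing import Callable, Mapping, Optional, Sequence

# A chain provider maps a language code to the ordered list of codes to try.
ChainProvider = Callable[[str], Sequence[str]]


def wikibase_backed_chain(language: str) -> Sequence[str]:
    """Stand-in for obtaining a chain from the Wikibase service (values invented)."""
    return {"de-at": ["de-at", "de", "en"]}.get(language, [language, "en"])


def package_backed_chain(language: str) -> Sequence[str]:
    """Stand-in for a future extracted package that honours the same contract."""
    return {"de-at": ["de-at", "de", "en"]}.get(language, [language, "en"])


class SchemaLabelService:
    """Hypothetical ES service; the chain source is injected at construction."""

    def __init__(self, chain_provider: ChainProvider) -> None:
        self._chain_provider = chain_provider

    def label_for(self, labels: Mapping[str, str], language: str) -> Optional[str]:
        for code in self._chain_provider(language):
            if code in labels:
                return labels[code]
        return None


def render_title(service: SchemaLabelService, labels: Mapping[str, str], language: str) -> str:
    """Caller code: unaware of where the fallback chain comes from."""
    return service.label_for(labels, language) or "(no label)"


labels = {"en": "human"}
before = render_title(SchemaLabelService(wikibase_backed_chain), labels, "de-at")
after = render_title(SchemaLabelService(package_backed_chain), labels, "de-at")
```

Only the constructor argument changes between `before` and `after`; `render_title` and its callers stay untouched, which is the property the comment relies on.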

Results

The team decided on the 'Use existing directly' approach.
