Page MenuHomePhabricator

[RFC] Create place to put common Wikibase services
Closed, ResolvedPublic1 Story Points



There is existing and new code we currently have no good place for. Examples:

  • EntityFactory. Like several other classes in Wikibase Lib it does not belong in either Repo or Client, and is not part of a cohesive whole. We want to get rid of Lib, though have no good place to either move this class to or to put a replacement.
  • ReferencedEntitiesFinder (lib)
  • EntitySearchTextGenerator (repo)
  • Often when refactoring away from problematic code, we need new services.

We have such code in Lib, Client and Repo. This creates problems for the development of Wikibase itself, and for reuse in tools.

For lack of a better place, some was also put in DataModel. A few examples of such services in DataModel:

  • PropertyDataTypeLookup (introduced because it was needed in QueryEngine, with an implementation that had to be in Wikibase.git)
  • ItemLookup (introduced by Tpt)
  • EntityDiffer

These are not part of the "data model", and are generally also not used at all by the data model implementation.

Suggested solution

We can create a new component to hold these kind of services. A good name for this has not yet been agreed upon. Suggestions so far: Wikibase Services, Wikibase DataModel Services, Wikibase DataModel Toolkit.

How would this be different from Wikibase Lib?

We want to get rid of Lib, so what would make this new component any better?

Wikibase Libs suffers from problems in (at least) two different categories:

  • Technical problems such as defining global state, lack of proper boundaries and depending on things in Client or Repo
  • Lack of a clear responsibilities and contracts. When Lib was created, it's implicit contract was something like "everything that is used by repo and client, might be used by both of them at some point, or might be usable by something else".

The later point is one of the main causes of Lib becoming an uncohesive mess, which after years of agreement that it should die, is still there. This new component would clearly define what should go in there and what not, and would additionally also make this more explicit for Wikibase DataModel.

Classes and interfaces need to satisfy these rules to be in the component:

(We can of course make pragmatic exceptions when we want.)

  • They need to be services that do something with the DataModel. If an interface does not have DataModel types in it, then it should go elsewhere. "infrastructure code" does thus not quality. Example of "infrastructure code": the "Reporting" classes and interfaces in Wikibase Lib
  • They do not belong to a more specific component. For instance, serializers should go into the serialization component. This also applies to cohesive groups of behaviour for which there is no component yet, such as entity storage.
  • They do not have dependencies on services (ie database) or additional components. This is to not pull in tons of dependencies for everyone using the new component. One exception to this rule is already clear: the entity diffing and patching code. This depends on Diff. A dedicated component could be created for it, though that seems to not justify itself. (And if this code is moved out of DataModel, we decrease binding by no longer having DataModel depend on Diff.)
What does creating such a component entail?

Once the empty foundation for the component is created, we can add it as a dependency for Wikibase.git. After this, we can move out those classes from Lib, Repo and Client that satisfy the conditions. This can be done incrementally, or in big chunks, depending on what we prefer. At this stage we already are able to put new such services in a nice location.

Moving out the things that are already in DataModel is more tricky. This will be a big breaking change that will affect multiple users of the component. We already have a big compatibility break in DataModel 3.x, so it might be confusing to bundle these. Hence waiting with such a change till DataModel 4.x is probably better.

Event Timeline

JeroenDeDauw raised the priority of this task from to Needs Triage.
JeroenDeDauw updated the task description. (Show Details)
JeroenDeDauw added a project: Wikidata.
JeroenDeDauw added a subscriber: JeroenDeDauw.
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptMar 24 2015, 12:31 PM

I think the following interfaces might be candidates for moving there

  • Wikibase\Lib\EntityIdFormatter
  • Wikibase\Lib\SnakFormatter
  • Wikibase\Lib\Store\EntityLookup
  • Wikibase\Lib\Store\LabelLookup
Lydia_Pintscher triaged this task as Normal priority.Mar 30 2015, 10:43 AM
Lydia_Pintscher added a subscriber: Lydia_Pintscher.
Tobi_WMDE_SW renamed this task from Create place to put common Wikibase services to [RFC] Create place to put common Wikibase services.May 5 2015, 5:49 PM
Tobi_WMDE_SW edited a custom field.May 25 2015, 11:00 AM

@JeroenDeDauw sorry, missed you when writing the email about this. Ignore the 1 point for now, it was for getting Phragile working with our current sprint.

We just discussed in Wikidata-Sprint-2015-06-02 that people should make explicit what they think. So here we go:

I'm sorry to say that, but the problem description does not enlighten me much. What I read here is that having a generic Lib directory is not good. But the question why exactly is not answered. And won't this problem be the exact same with the proposed "Toolkit" component?

From what I read here the two kinds of problems described later don't get better with the proposed component:

  • Global state can be avoided. Stuff that breaks boundaries can be fixed. Stuff depending on things in Client or Repo can be fixed.
  • It's still not obvious to me how replacing a Lib directory with a "Toolkit" component makes responsibilities more obvious.

The set of rules sounds good. There is enlightenment. Thanks for that. However, the resulting component still feels like a "Lib2" folder to me, just with some more strict boundaries (basically: you are not allowed to pull in dependencies).

This really is not much more but a generic "Wikibase DataModel Services" component.

My personal conclusion: Let's do it. I think it's making the situation better. We may need to move classes again later, but this is not much worse than what we currently do.

I agree on the "wait for DataModel 4.x" suggestion.

We decided on the name Wikibase DataModel Services.
We decided to not block this on other discussions on how to split stuff between components.
Next steps:

  • Create component
  • Start migrating things there and use the component

I suggested to create 3 tickets that can be tracked in sprints:

  1. Create the component.
  2. Move a bunch of classes from Lib the new component.
  3. Move a bunch of classes from DataModel to the new component.
daniel added a subscriber: daniel.Jun 10 2015, 3:53 PM

Does "create component" mean a separate git repo right away, or just a separate directory for now?

How is "Wikibase DataModel Services" different from lib?

It means a separate git repo right away and something one can require with composer.

How is "Wikibase DataModel Services" different from lib?

We discussed this in detail (unfortunately without you) and it also is in the description above. Please poke me and I will try to explain in person.

A separate directory is not possible, as this is not just about things in Wikibase.git. I find it quite annoying to keep repeating myself, and would much appreciate people reading the description before asking questions it already answers.

JeroenDeDauw closed this task as Resolved.Jun 29 2015, 11:29 AM
JeroenDeDauw claimed this task.