Use dispatching services in Client
Open, High · Public

Description

This is a general ticket for using services that allow access to entities from multiple repositories in Wikibase client. Tickets for more specific implementation-related tasks should be created as subtasks.

General structure/notes:

  • The ClientStore interface is not changed.
  • The existing ClientStore implementation, i.e. DirectSqlStore, should use dispatching versions of services instead of the single-repo ones it uses now.
  • A new factory, ServiceMapFactory, should be created for providing maps of service instances for each service name. DirectSqlStore will for now use these service maps to create the dispatching service instance for each service name. [TBD: ServiceMapFactory could have the knowledge/wiring for constructing the dispatching services from the maps!]
  • This factory would have a public interface partly resembling that of ServiceContainer, i.e. it would have a generic method getServiceMap (name is subject to change) returning a map of the form [ repoName => service ] containing the service configured for each repository.
    • It could also have convenience getters for particular services [probably not needed, since this class is used only from inside DirectSqlStore].
  • This factory has access to the configuration of each foreign repository (which repos are configured, what their DB names and URLs are, etc.).
  • ServiceMapFactory will know one ServiceContainer for each repo. When asked for the service map for service "foo", it will construct the map by asking each service container for the service "foo".
  • The ServiceContainer will use a wiring file for instantiating the per-repo services. The instantiator functions in the wiring file need access to the configuration settings describing the target repository.
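
The structure described in the list above can be sketched roughly as follows. This is only an illustration in Python, not Wikibase code: the class and method names mirror the ones proposed in this ticket, while the wiring table, the settings keys ("db") and the repo names ("local", "foo") are assumptions made up for the example.

```python
from typing import Any, Callable, Dict

# An instantiator receives the settings of the target repository and
# returns a service instance (hypothetical shape of a wiring-file entry).
Instantiator = Callable[[Dict[str, Any]], Any]


class ServiceContainer:
    """Per-repo container: builds services lazily from a wiring table."""

    def __init__(self, repo_settings: Dict[str, Any], wiring: Dict[str, Instantiator]):
        self._settings = repo_settings
        self._wiring = wiring
        self._instances: Dict[str, Any] = {}

    def get_service(self, name: str) -> Any:
        # Instantiate on first use, then reuse the same instance.
        if name not in self._instances:
            self._instances[name] = self._wiring[name](self._settings)
        return self._instances[name]


class ServiceMapFactory:
    """Knows one ServiceContainer per configured repository."""

    def __init__(self, repo_configs: Dict[str, Dict[str, Any]], wiring: Dict[str, Instantiator]):
        self._containers = {
            repo: ServiceContainer(settings, wiring)
            for repo, settings in repo_configs.items()
        }

    def get_service_map(self, name: str) -> Dict[str, Any]:
        # Ask every per-repo container for the same service name.
        return {repo: c.get_service(name) for repo, c in self._containers.items()}


# Toy wiring: the "service" is just a string naming the DB it would read from.
wiring = {"EntityRevisionLookup": lambda settings: "lookup@" + settings["db"]}
factory = ServiceMapFactory(
    {"local": {"db": "localwiki"}, "foo": {"db": "foowiki"}},
    wiring,
)
service_map = factory.get_service_map("EntityRevisionLookup")
```

In this sketch, DirectSqlStore would pass `service_map` to the constructor of the corresponding dispatching service.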

Implementation notes:

  • Services that return entities coming from a foreign repo, in particular implementations of EntityRevisionLookup, must apply the correct ID mapping for that repo during deserialization.
  • For managing a set of service instances for a given repo, consider using a ServiceContainer. We should consider changing WikibaseClient to use core's new ServiceContainer mechanism altogether.
  • The ability to load e.g. labels for entities on another repo will be needed by repo code, not just client code. We will have to reconsider the relationship between repo and client once more.
  • In the long run, WikibaseClient should also become a ServiceContainer, but this is out of scope for this task.

Restricted Application added a subscriber: Aklapper. Oct 14 2016, 12:05 PM
WMDE-leszek renamed this task from Used dispatching/multiplexing services in Client to Use dispatching/multiplexing services in Client. Oct 18 2016, 12:50 PM
daniel updated the task description.
daniel added a subscriber: daniel.

What's the status of this?

This is still at a very early stage. Basically, I created this ticket as a ticket for https://gerrit.wikimedia.org/r/#/c/315904/
But that patch is still very much a work in progress. Apart from the general approach being far from clear, only three dispatching services have been prepared so far. So I am thinking we should possibly create a separate ticket for the patch I linked, and turn this ticket into some kind of tracking ticket?

Change 315904 had a related patch set (by WMDE-leszek) published:
[WIP] Use dispatching services in client

https://gerrit.wikimedia.org/r/315904

daniel added a comment. Edited Nov 1 2016, 5:33 PM

So we somehow need to apply the prefetching capabilities of a TermBuffer to a multi-repo scenario. In particular, prefetchTerms() needs to partition the list of EntityIds by repo, and then forward each list to an underlying, per-repo TermBuffer.

Implementation:

Make a dispatching TermBuffer. It can be based on EntityTermLookupBase, just like BufferingTermLookup. When forwarding prefetchTerms(), the list of EntityIds is partitioned by repo. The dispatching TermBuffer does not need to implement caching.

The dispatching TermBuffer would get a map of TermBuffers keyed by repo, and it would extend EntityTermLookupBase; getTermsOfType() would need to be implemented by calling prefetchTerms() and then reading the desired terms from the buffer.
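
A rough sketch of this idea, in Python rather than PHP and not taken from the actual patches: the method names follow prefetchTerms() and getTermsOfType() from above, while `get_prefetched_term`, `InMemoryTermBuffer` and `repo_of` are hypothetical helpers. The sketch also assumes, purely for illustration, that foreign entity IDs carry a `repo:` prefix and that unprefixed IDs belong to the local repo (keyed by an empty string).

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def repo_of(entity_id: str) -> str:
    """Assumption: 'foo:Q2' belongs to repo 'foo'; 'Q1' belongs to the local repo ''."""
    repo, sep, _ = entity_id.partition(":")
    return repo if sep else ""


class InMemoryTermBuffer:
    """Toy per-repo buffer standing in for a real, caching TermBuffer."""

    def __init__(self, store: Dict[Tuple[str, str, str], str]):
        self._store = store  # (entity_id, term_type, language) -> text
        self._buffer: Dict[Tuple[str, str, str], str] = {}
        self.prefetch_calls: List[List[str]] = []  # recorded for demonstration

    def prefetch_terms(self, entity_ids, term_types, languages) -> None:
        self.prefetch_calls.append(list(entity_ids))
        for (eid, ttype, lang), text in self._store.items():
            if eid in entity_ids and ttype in term_types and lang in languages:
                self._buffer[(eid, ttype, lang)] = text

    def get_prefetched_term(self, entity_id, term_type, language) -> Optional[str]:
        return self._buffer.get((entity_id, term_type, language))


class DispatchingTermBuffer:
    """Forwards calls to per-repo buffers; does no caching of its own."""

    def __init__(self, buffers_by_repo: Dict[str, InMemoryTermBuffer]):
        self._buffers = buffers_by_repo

    def prefetch_terms(self, entity_ids, term_types, languages) -> None:
        # Partition the IDs by repo, then forward each partition.
        by_repo = defaultdict(list)
        for eid in entity_ids:
            by_repo[repo_of(eid)].append(eid)
        for repo, ids in by_repo.items():
            buffer = self._buffers.get(repo)
            if buffer is not None:  # IDs of unknown repos are skipped
                buffer.prefetch_terms(ids, term_types, languages)

    def get_prefetched_term(self, entity_id, term_type, language) -> Optional[str]:
        buffer = self._buffers.get(repo_of(entity_id))
        if buffer is None:
            return None
        return buffer.get_prefetched_term(entity_id, term_type, language)

    def get_terms_of_type(self, entity_id, term_type, languages) -> Dict[str, str]:
        # Implemented on top of prefetch_terms(), then read from the buffer.
        self.prefetch_terms([entity_id], [term_type], languages)
        terms = {}
        for lang in languages:
            text = self.get_prefetched_term(entity_id, term_type, lang)
            if text is not None:
                terms[lang] = text
        return terms


local = InMemoryTermBuffer({("Q1", "label", "en"): "one"})
foo = InMemoryTermBuffer({("foo:Q2", "label", "en"): "two"})
dispatcher = DispatchingTermBuffer({"": local, "foo": foo})
dispatcher.prefetch_terms(["Q1", "foo:Q2", "Q3"], ["label"], ["en"])
```

After the last call, each per-repo buffer has seen only the IDs belonging to its repo.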

Caching will continue to be done by per-repo instances of BufferingTermLookup. This seems sensible since cache location and duration may depend on the target repo (e.g. more aggressive caching for repos accessed via API).

There is no need to create a dispatching TermLookup for this to work. I can't think of a use case for a plain DispatchingTermLookup, though we might end up needing it somewhere.

Ladsgroup moved this task from Proposed to Doing on the Wikidata-Sprint board. Nov 18 2016, 10:28 AM

Change 323404 had a related patch set uploaded (by Jakob):
Integrate ForeignEntityValidator with ValidatorBuilders.

https://gerrit.wikimedia.org/r/323404

WMDE-leszek renamed this task from Use dispatching/multiplexing services in Client to Use dispatching services in Client. Dec 6 2016, 10:58 AM
WMDE-leszek raised the priority of this task from Normal to High.
WMDE-leszek updated the task description.
WMDE-leszek added a subscriber: Jakob_WMDE.

Updated the task description to include the results of a real-life discussion with @daniel and @Jakob_WMDE. This ticket will be more of a tracking ticket now.
Also raised the priority to High.

daniel updated the task description. Dec 6 2016, 11:44 AM
daniel updated the task description. Dec 6 2016, 11:53 AM

Change 325967 had a related patch set uploaded (by WMDE-leszek):
[WIP] Add RepositoryServiceContainer and the wiring file

https://gerrit.wikimedia.org/r/325967

Change 325968 had a related patch set uploaded (by WMDE-leszek):
[WIP] Add DispatchingServiceFactory

https://gerrit.wikimedia.org/r/325968

Change 325969 had a related patch set uploaded (by WMDE-leszek):
Use dispatching services in DirectSqlStore

https://gerrit.wikimedia.org/r/325969

Change 326475 had a related patch set uploaded (by WMDE-leszek):
Use dispatching TermBuffer in WikibaseClient

https://gerrit.wikimedia.org/r/326475

Change 315904 abandoned by WMDE-leszek:
[WIP] Use dispatching services in client

Reason:
Abandoning this in favour of I8009728707a0ca741404895593d067c8fb0f3df4 and Iebd9994b57b9248b1f9a7869f51eb28e6c2eeace

https://gerrit.wikimedia.org/r/315904

Change 325967 merged by jenkins-bot:
Add RepositoryServiceContainer and the wiring file

https://gerrit.wikimedia.org/r/325967

Change 325968 merged by jenkins-bot:
Add DispatchingServiceFactory and the wiring file

https://gerrit.wikimedia.org/r/325968

Change 325969 merged by jenkins-bot:
Use DispatchingEntityRevisionLookup in DirectSqlStore

https://gerrit.wikimedia.org/r/325969

Change 326475 merged by jenkins-bot:
Use dispatching TermBuffer in WikibaseClient

https://gerrit.wikimedia.org/r/326475

WMDE-leszek moved this task from Doing to Monitoring on the Wikidata-Sprint board. Feb 13 2017, 3:42 PM

The general structure has been introduced. There are still some open questions (e.g. T153437) and not-very-urgent things that should be improved (e.g. in the long run, RepositoryServiceContainers should not rely on a WikibaseClient instance but should instantiate all needed services themselves). Moving this task to the "Monitoring" column so we don't lose track of it while there are still open tasks related to the general idea.
Any further refactoring, configuration, etc. tasks related to DispatchingServiceFactory and RepositoryServiceContainer should link to this ticket so we can track progress here.