Create a dispatching version of TermSearchInteractor
Closed, Resolved · Public

Description

Scenario:

  • a Wikibase repository wants to use Properties from Wikidata, but locally defined Items

Solution:

  • the TermSearchInteractor underlying the wbsearchentities API module should know where to search for which kind of entity.

Implementation:

  • Implement a TermSearchInteractor that knows a target TermSearchInteractor for each entity type. Each such underlying TermSearchInteractor is configured to access a specific wiki.
  • Care must be taken that the correct ID mappings are applied when constructing the TermSearchResults.
  • The UI code remains completely oblivious to federation
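The dispatching idea above could be sketched roughly as follows. This is a language-neutral illustration in Python, not the actual Wikibase (PHP) code. The class and method names (`DispatchingInteractor`, `InMemoryInteractor`, `search_for_entities`) and the `repository:ID` prefix convention are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol


@dataclass(frozen=True)
class TermSearchResult:
    entity_id: str     # e.g. "wikidata:P31" or "Q1" (prefix convention is an assumption)
    matched_term: str


class TermSearchInteractor(Protocol):
    def search_for_entities(
        self, text: str, language_code: str, entity_type: str
    ) -> List[TermSearchResult]: ...


class InMemoryInteractor:
    """Hypothetical stand-in for an interactor configured for one specific wiki."""

    def __init__(self, repository: str, labels: Dict[str, str]) -> None:
        self._repository = repository  # "" means the local repository
        self._labels = labels          # local entity ID -> label

    def search_for_entities(self, text, language_code, entity_type):
        # Apply the ID mapping: foreign IDs get the repository prefix.
        prefix = f"{self._repository}:" if self._repository else ""
        return [
            TermSearchResult(prefix + local_id, label)
            for local_id, label in self._labels.items()
            if label.lower().startswith(text.lower())
        ]


class DispatchingInteractor:
    """Knows exactly one target interactor per entity type."""

    def __init__(self, interactors: Dict[str, TermSearchInteractor]) -> None:
        self._interactors = interactors

    def search_for_entities(self, text, language_code, entity_type):
        interactor = self._interactors.get(entity_type)
        if interactor is None:
            return []  # no repository is configured for this entity type
        return interactor.search_for_entities(text, language_code, entity_type)


# Properties come from a foreign repository, Items are defined locally.
dispatcher = DispatchingInteractor({
    "property": InMemoryInteractor("wikidata", {"P31": "instance of"}),
    "item": InMemoryInteractor("", {"Q1": "Instrument"}),
})
```

Callers only ever talk to `dispatcher`, so the UI code would not need to know which wiki answered a given query.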

Caveats:

  • we do not support searching for the same kind of entity on multiple repos, but this will likely be needed in the future.
  • merging search results from different repos is tricky, because relevance scores are not comparable between instances
  • Eventually, TermSearchResults should get a new field identifying the repository on which a match was found.
daniel created this task. Nov 16 2016, 12:39 PM
Restricted Application added a subscriber: Aklapper. Nov 16 2016, 12:39 PM

Change 322118 had a related patch set uploaded (by WMDE-leszek):
Add DispatchingTermSearchInteractor

https://gerrit.wikimedia.org/r/322118

WMDE-leszek moved this task from proposed to doing on the WMDE-TLA-Team board.
Ladsgroup moved this task from Proposed to Doing on the Wikidata-Sprint board. Nov 18 2016, 10:28 AM

Regarding this part:

Eventually, TermSearchResults should get a new field identifying the repository on which a match was found.

If I understand correctly, the repository name can already be derived from the matched entity ID. Exposing it makes sense and would be easy, e.g. by adding TermSearchResult::getRepositoryName. Do we still want a dedicated field for this in TermSearchResult, alongside the public method returning the repository name?
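A minimal sketch of deriving the repository name from the matched entity ID rather than storing it separately. It assumes foreign entity IDs are serialized as `repository:localId` and that an unprefixed ID belongs to the local repository. Both the convention and the Python names are assumptions for illustration, not the Wikibase implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TermSearchResult:
    entity_id: str     # e.g. "wikidata:P31" (foreign) or "Q1" (local); assumed format
    matched_term: str

    def get_repository_name(self) -> str:
        # Everything before the first colon is treated as the repository name;
        # an ID without a colon is assumed to belong to the local repository ("").
        repository, separator, _ = self.entity_id.partition(":")
        return repository if separator else ""
```

With this approach no extra field is needed on the result; the method is only a convenience over the ID.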

Change 322254 had a related patch set uploaded (by WMDE-leszek):
Expose the name of the repository on which TermSearchResult was found

https://gerrit.wikimedia.org/r/322254

Most of this is done now. I'm wondering whether we want to explicitly expose the repo on which a match was found via the wbsearchentities API module. It's implicit in the entity IDs, but the client should not need to decode the IDs in order to indicate where a match was found.

Note that this will not be necessary for the baseline, where entity types are bound to repositories. But when we support using e.g. properties from different repos, we may want to indicate which property comes from where, to allow the user an informed choice of which vocabulary to use.
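To illustrate the point about clients not having to decode IDs, here is a rough sketch of a serialized match that carries the repository explicitly. The output key `repository`, the function name `serialize_match`, and the `repository:localId` ID format are hypothetical. The actual wbsearchentities output format is not specified in this task.

```python
from typing import Dict


def serialize_match(entity_id: str, label: str) -> Dict[str, str]:
    """Build a search-result entry that names the repository explicitly."""
    # Assumed ID format: "repository:localId"; no prefix means the local repository.
    repository, separator, _ = entity_id.partition(":")
    return {
        "id": entity_id,
        "label": label,
        "repository": repository if separator else "",  # hypothetical field name
    }
```

A client could then group or label matches by the `repository` value without parsing the `id`.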

Change 322118 merged by jenkins-bot:
Add DispatchingTermSearchInteractor

https://gerrit.wikimedia.org/r/322118

Change 322254 merged by jenkins-bot:
Expose the name of the repository on which TermSearchResult was found

https://gerrit.wikimedia.org/r/322254

@daniel: I agree, it makes a lot of sense to have it explicitly there. I've filed T152083 to track this (although, as you said, it is not a top priority for now).

daniel closed this task as Resolved. Dec 5 2016, 5:50 PM
Ladsgroup moved this task from Doing to Done on the Wikidata-Sprint board. Dec 6 2016, 1:05 AM