Page MenuHomePhabricator

Clean up MatchingTermsLookup::getTopMatchingTerms()
Closed, ResolvedPublic

Description

There are two related issues around the MatchingTermsLookup::getTopMatchingTerms() method:

  • There only remaining (non-mock) implementation, DatabaseMatchingTermsLookup, doesn’t really implement the method – it just returns the result of getMatchingTerms() unchanged, since the new term store doesn’t support a notion of “top” matching terms. This suggests that getTopMatchingTerms() could be removed from the interface.
  • However, in doing so, DatabaseMatchingTermsLookup actually violates the contract of the method. The interface documentation clearly states that the method should only return one TermIndexEntry per Entity, and even the mock implementation (MockMatchingTermsLookup) takes care to ensure this – but DatabaseMatchingTermsLookup will happily return multiple search results for the same entity (e. g. if there were multiple matching aliases).

Do any callers of getTopMatchingTerms() actually rely on this “distinct entities” behavior? So far, while I’ve seen a TermSearchInteractor return multiple results for the same entity in a PHP shell, I haven’t seen multiple results for the same entity in an API search response, so maybe a filter is already being applied at a higher level.

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptOct 14 2020, 3:36 PM

Okay, it’s not actually too complicated. getTopMatchingTerms() is only called by MatchingTermsLookupSearchInteractor, to implement TermSearchInteractor::searchForEntities(); that method is in turn only called by EntityTermSearchHelper, to implement EntitySearchHelper::getRankedSearchResults(); and EntityTermSearchHelper combines exact and prefix matches by sending search results into its mergeSearchResults method, which takes care to only add search results if no better result for the same entity ID is known.

In other words, no, nobody relies on the “distinct entities” behavior (and in fact there’s no such guarantee on TermSearchInteractor::searchForEntities() anyways). I believe we can remove getTopMatchingTerms() and replace calls to it with calls to getMatchingTerms().

Change 634203 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/Wikibase@master] Remove MatchingTermsLookup::getTopMatchingTerms()

https://gerrit.wikimedia.org/r/634203

Change 634204 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/Wikibase@master] Update expected data in MatchingTermsLookupSearchInteractorTest

https://gerrit.wikimedia.org/r/634204

Change 634203 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Remove MatchingTermsLookup::getTopMatchingTerms()

https://gerrit.wikimedia.org/r/634203

Change 634204 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Update expected data in MatchingTermsLookupSearchInteractorTest

https://gerrit.wikimedia.org/r/634204

WMDE-leszek closed this task as Resolved.Oct 16 2020, 9:15 AM
WMDE-leszek moved this task from Verification to Done on the Wikibase wb_terms leftovers 2020 board.