
[L] Implement superclass "concept chips" in the MediaSearch interface
Closed, Resolved, Public

Description

User story: As a user searching for media on Commons, I want a way to see a group of related and specific queries, so that I can discover additional related media.

We have this:
Concept chips exist in a prototype state, and only for a few hardcoded search terms.

We want this:

  • when a user does a search
  • and the search term corresponds to a Wikidata item
  • and that Wikidata item has a "subclass of" property
  • then display a concept chip for each of the Wikidata items the search term is a subclass of; clicking on a concept chip brings the user to a MediaSearch for that item's label in the user's language
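
The conditions above can be sketched roughly as follows. This is a minimal Python illustration, not the extension's code: the in-memory `ITEMS` table, its item IDs, and the `concept_chips` helper are all invented for the example. Only P279, Wikidata's property ID for "subclass of", is real.

```python
from typing import Dict, List

SUBCLASS_OF = "P279"  # Wikidata property ID for "subclass of"

# Hypothetical stand-in for Wikidata. The item IDs and contents are invented.
ITEMS: Dict[str, dict] = {
    "Q-kitten": {
        "labels": {"en": "kitten", "fr": "chaton"},
        "claims": {SUBCLASS_OF: ["Q-cat"]},
    },
    "Q-cat": {"labels": {"en": "cat", "fr": "chat"}, "claims": {}},
}


def concept_chips(term: str, lang: str, items: Dict[str, dict] = ITEMS) -> List[str]:
    """Return superclass labels, in the user's language, for a search term."""
    wanted = term.strip().lower()
    chips: List[str] = []
    for item in items.values():
        # Step 1: does the search term correspond to this item's label?
        own_label = item["labels"].get(lang)
        if own_label is None or own_label.lower() != wanted:
            continue
        # Step 2: does the item have "subclass of" statements?
        for parent_id in item["claims"].get(SUBCLASS_OF, []):
            parent = items.get(parent_id, {})
            label = parent.get("labels", {}).get(lang)
            # Step 3: one chip per superclass that has a label in this language.
            if label and label not in chips:
                chips.append(label)
    return chips


print(concept_chips("kitten", "en"))  # ['cat']
print(concept_chips("chaton", "fr"))  # ['chat']
print(concept_chips("cat", "en"))     # [] (no "subclass of" statement in the toy data)
```

Note that the function never looks at how many media results the original term returns, so chips would still show for a zero-result search.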

Other acceptance criteria:

  • If the search term corresponds to a Wikidata item with a "subclass of" property, the concept chip(s) should show even if the original search term would provide zero results
  • A new ticket is created to account for the rest of the heuristics in the spreadsheet - either one ticket for all of them, or a separate ticket for each

Suggested implementation:

  • load the page as normal, then from the client side hit a new api endpoint to get the concept chip link, and then show it with client side code
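
As a rough sketch of that flow (written in Python for illustration, although the real client code would be JavaScript): the page asks an endpoint for related terms, then turns each label into a link back to MediaSearch. The `fetch_related` callable stands in for the new API endpoint, whose name and response shape are not specified in this ticket, and the `search` query parameter is an assumption about the Special:MediaSearch URL.

```python
from typing import Callable, Dict, List
from urllib.parse import urlencode

# Assumed base URL of the search page.
MEDIASEARCH_URL = "https://commons.wikimedia.org/wiki/Special:MediaSearch"


def chip_links(
    term: str,
    lang: str,
    fetch_related: Callable[[str, str], List[str]],
) -> List[Dict[str, str]]:
    """Build one link per related term returned by the (hypothetical) endpoint."""
    links = []
    for label in fetch_related(term, lang):
        # "search" is an assumed parameter name; "uselang" sets the UI language.
        query = urlencode({"search": label, "uselang": lang})
        links.append({"label": label, "href": f"{MEDIASEARCH_URL}?{query}"})
    return links


# Stub in place of the real endpoint, so the sketch runs without network access.
def fake_endpoint(term: str, lang: str) -> List[str]:
    return ["domestic cat"] if term == "kitten" else []


for chip in chip_links("kitten", "en", fake_endpoint):
    print(chip["label"], "->", chip["href"])
```

Because the chips are fetched after the page has loaded, a slow or failing lookup would only leave the chip row empty rather than delay the search results.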

(Once this first concept chip is done, look into implementing the others in this list: https://docs.google.com/spreadsheets/d/1KFuJhx-vQ4fyDuok_0Se1BcSEO88TGoiMzux1E4SY6c/edit?ts=5efcfa8d#gid=0. They can be implemented in parallel once the backbone is set up for this ticket.)

Design
Although having small images per chip would be ideal to help illustrate the concept, if we are limited on time and resources and cannot implement that at this time, we can start with a chip that doesn't have images. (Out of scope for this ticket)

With image

concept_chip_image.png (1×2 px, 2 MB)

No image (Specs are in pink and shown in pixels)

concept_chip_no_image.png (1×2 px, 1 MB)

During development, please test the following:

  • Test this feature while logged in AND logged out
  • Test this feature on at least one mobile browser

Event Timeline

Cparle renamed this task from Fully implement "concept chips" in the MediaSearch interface to Implement superclass "concept chips" in the MediaSearch interface. Aug 7 2020, 4:03 PM
Cparle updated the task description.
Cparle added subscribers: Cparle, CBogen.
CBogen renamed this task from Implement superclass "concept chips" in the MediaSearch interface to [L] Implement superclass "concept chips" in the MediaSearch interface. Aug 26 2020, 4:25 PM
CBogen updated the task description.

Change 628319 had a related patch set uploaded (by Anne Tomasevich; owner: Anne Tomasevich):
[mediawiki/extensions/WikibaseMediaInfo@master] Add concept chips for superclasses

https://gerrit.wikimedia.org/r/628319

Change 629382 had a related patch set uploaded (by Matthias Mullie; owner: Matthias Mullie):
[mediawiki/extensions/WikibaseMediaInfo@master] Add API endpoint for related terms (for concept chips)

https://gerrit.wikimedia.org/r/629382

Most heuristics in the spreadsheet have been implemented (they are struck through there).

Those I haven't implemented are:

  • alias suggestions, because aliases are likely to be internalized in the search algorithm already, so wouldn't make a difference
  • suggestions with additional terms ('logo', 'icon', ...), because search results are likely not going to get much better with multiple terms

Additional heuristics ideas? :)

EDIT: looks like 'logo' results aren't too bad - worth exploring further

Another thing we should be mindful of here is that concept chips are search-term-specific, not tab-specific. In the current implementation I've included concept chips on all of the media tabs (everything except Pages and Categories), so if you switch tabs you'll see the same concept chips on the new tab, and new chips will be fetched for searches on any media tab. This might add some confusion if, say, logo-related suggestions are showing up on the audio tab. I'm not sure if that'll actually be an issue, just another thing to look out for as we're testing this.
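
To make the "search-term-specific, not tab-specific" point concrete, here is a small Python sketch. It assumes chips are stored under the search term and language only, so every media tab reads the same entry; the tab identifiers, the `ChipStore` class, and the caching itself are illustrative assumptions, not the actual implementation.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tab identifiers; the comment above only says chips appear on
# every tab except Pages and Categories.
NON_MEDIA_TABS = {"page", "category"}


class ChipStore:
    """Keeps chips per (term, language); the active tab is not part of the key."""

    def __init__(self, fetch: Callable[[str, str], List[str]]) -> None:
        self._fetch = fetch
        self._cache: Dict[Tuple[str, str], List[str]] = {}

    def chips_for(self, term: str, lang: str, tab: str) -> List[str]:
        if tab in NON_MEDIA_TABS:
            return []
        key = (term, lang)  # deliberately no tab in the key
        if key not in self._cache:
            self._cache[key] = self._fetch(term, lang)
        return self._cache[key]


calls: List[str] = []


def fake_fetch(term: str, lang: str) -> List[str]:
    # Stub data for the sketch, not real suggestions.
    calls.append(term)
    return ["logo"]


store = ChipStore(fake_fetch)
print(store.chips_for("wikipedia", "en", "image"))  # ['logo']
print(store.chips_for("wikipedia", "en", "audio"))  # ['logo'], same chips on the audio tab
print(len(calls))                                   # 1
```

Adding the tab to the key would be the natural place to filter suggestions per media type if the logo-on-audio case turns out to confuse users.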

And one last thing to remember: we chose this heuristics implementation because it was the only viable one in a short amount of time.
It is flawed to some extent, both in results (the Wikidata ontology can be weird and inconsistent) and in performance (we should be careful with these SPARQL queries).
This implementation is fine for limited usage on Special:MediaSearch and will allow us to evaluate the feature, but if we ever want to scale it up (to the point of it being served by default for most casual searches), we will have to consider alternative implementations.
In other words: let's instrument this, and if it's found to be used extensively, we should start working on more sustainable alternatives.
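
For readers unfamiliar with the kind of query involved, here is a minimal Python sketch that builds a superclass lookup for the Wikidata Query Service. It is an assumption about what such a query could look like, not the query the extension actually runs. The helper names and the `MAX_LIMIT` cap are invented; `wdt:P279` ("subclass of") and `rdfs:label` are standard. The explicit `LIMIT` is one simple way of being careful with query cost.

```python
import re

MAX_LIMIT = 50  # arbitrary cap chosen for this sketch


def sparql_string(value: str) -> str:
    """Quote a Python string as a SPARQL string literal."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def build_superclass_query(term: str, lang: str = "en", limit: int = 10) -> str:
    """Build a query for the superclasses of items labelled exactly `term`."""
    if not re.fullmatch(r"[A-Za-z]+(-[A-Za-z0-9]+)*", lang):
        raise ValueError(f"unexpected language code: {lang!r}")
    lang = lang.lower()
    limit = max(1, min(limit, MAX_LIMIT))  # keep the result set bounded
    literal = sparql_string(term)
    return (
        "SELECT DISTINCT ?superclass ?superclassLabel WHERE {\n"
        f"  ?item rdfs:label {literal}@{lang} ;\n"
        "        wdt:P279 ?superclass .\n"
        "  ?superclass rdfs:label ?superclassLabel .\n"
        f'  FILTER(LANG(?superclassLabel) = "{lang}")\n'
        "}\n"
        f"LIMIT {limit}"
    )


print(build_superclass_query("kitten", "en"))
```

Exact label matching like this is also where the ontology quirks mentioned above would surface, since every item sharing the label contributes its superclasses.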

Change 630896 had a related patch set uploaded (by Matthias Mullie; owner: Matthias Mullie):
[operations/mediawiki-config@master] [WikibaseMediaInfo] Add config for related terms API

https://gerrit.wikimedia.org/r/630896

Change 629382 merged by jenkins-bot:
[mediawiki/extensions/WikibaseMediaInfo@master] Add API endpoint for related terms (for concept chips)

https://gerrit.wikimedia.org/r/629382

Change 628319 merged by jenkins-bot:
[mediawiki/extensions/WikibaseMediaInfo@master] Add concept chips UI

https://gerrit.wikimedia.org/r/628319

Change 630896 merged by jenkins-bot:
[operations/mediawiki-config@master] [WikibaseMediaInfo] Add config for related terms API

https://gerrit.wikimedia.org/r/630896

Change 635567 had a related patch set uploaded (by Matthias Mullie; owner: Matthias Mullie):
[operations/mediawiki-config@master] [WikibaseMediaInfo] Fix concept chips array nesting structure

https://gerrit.wikimedia.org/r/635567

Change 635567 merged by jenkins-bot:
[operations/mediawiki-config@master] [WikibaseMediaInfo] Fix concept chips array nesting structure

https://gerrit.wikimedia.org/r/635567

Mentioned in SAL (#wikimedia-operations) [2020-10-21T18:12:34Z] <urbanecm@deploy1001> Synchronized wmf-config/InitialiseSettings.php: 45312d359442d274e83deb7be80f86e12fb9e864: [WikibaseMediaInfo] Fix concept chips array nesting structure (T256431) (duration: 01m 05s)

Checked in commons wmf.14.

The screenshots are for illustration - no issues.

  • few concept chips: Screen Shot 2020-10-21 at 2.56.44 PM.png (521×1 px, 419 KB)
  • more concept chips: Screen Shot 2020-10-21 at 3.00.33 PM.png (450×1 px, 481 KB)
  • multiple rows of concept chips: Screen Shot 2020-10-21 at 3.09.56 PM.png (492×1 px, 264 KB)
  • with Quickview: Screen Shot 2020-10-21 at 5.38.51 PM.png (703×1 px, 888 KB)

Some cases where the search relevance is not optimal (will document them for future testing).

(1) A last name is returned for a general term ("snow").

Screen Shot 2020-10-21 at 3.03.32 PM.png (345×1 px, 51 KB)

(2) A search for a not-so-general term ("oak branch") results in a specific concept chip.
Screen Shot 2020-10-21 at 3.08.22 PM.png (448×1 px, 430 KB)

(3) A search for a term with some punctuation ("kitten;") returns a correct search result set, but the concept chip does not look very relevant.
Screen Shot 2020-10-21 at 3.02.24 PM.png (412×817 px, 313 KB)

Mobile

  • one concept chip: Screen Shot 2020-10-23 at 1.11.09 PM.png (618×384 px, 161 KB)
  • several concept chips: Screen Shot 2020-10-23 at 1.10.31 PM.png (618×387 px, 57 KB)
  • many concept chips: Screen Shot 2020-10-23 at 12.17.05 PM.png (623×376 px, 56 KB)

Checked in commons wmf.16.
The following case

If the search term corresponds to a Wikidata item with a "subclass of" property, the concept chip(s) should show even if the original search term would provide zero results

is illustrated with this screenshot:

Screen Shot 2020-11-16 at 1.20.51 PM.png (547×911 px, 61 KB)