
[Spike] Figure out how to compute facets across multiple fields
Closed, Resolved · Public · Spike

Description

We have an idea about allowing community editing of a number of annotation fields that would mirror fields added to the toolinfo.json schema in versions > 1.0 (Hay's Directory's schema). To be valuable as a way to backfill information for legacy toolinfo records, we would like values added via this method to contribute to any relevant facets in search results. A good example field for this is tool_type.

Basic web research indicates that the most reasonable way to make this work would be to use copy_to instructions when indexing to copy both the base record's value and the annotation layer's value to a third field that is only used for the facet computation and related result filtering.
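As a rough illustration of the copy_to idea, the sketch below builds an Elasticsearch-style mapping and a terms aggregation request as plain Python dicts. Only tool_type comes from this task; the annotations.tool_type path and the merged_tool_type combined field are hypothetical names chosen for the example, and the actual Toolhub mapping may look different.

```python
import json

# Hypothetical name for the third, facet-only field.
COMBINED = "merged_tool_type"

# Both the base field and the (assumed) annotation field copy their
# values into the combined keyword field at index time.
mapping = {
    "properties": {
        "tool_type": {"type": "keyword", "copy_to": COMBINED},
        "annotations": {
            "properties": {
                "tool_type": {"type": "keyword", "copy_to": COMBINED},
            }
        },
        COMBINED: {"type": "keyword"},
    }
}

# The facet is computed as a terms aggregation over the combined field only.
facet_query = {
    "size": 0,
    "aggs": {"tool_type": {"terms": {"field": COMBINED}}},
}

print(json.dumps(mapping, indent=2))
print(json.dumps(facet_query, indent=2))
```

Values copied this way are indexed for search and aggregation but do not appear in the stored _source, which fits a field that exists only for facet computation and filtering.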

Event Timeline

Restricted Application changed the subtype of this task from "Task" to "Spike". · Mar 11 2022, 11:57 PM
Restricted Application added a project: User-bd808.
bd808 changed the task status from Open to In Progress. · Mar 11 2022, 11:58 PM
bd808 triaged this task as Medium priority.
bd808 moved this task from Backlog to In Progress on the Toolhub board.

I am not in love with the code I wrote, but I have working changes in my local tree for this. The needed pieces are:

  • Add new combined field(s) to the Document with the proper typing (quite likely a KeywordField). This is needed to generate the correct mapping data for the index.
  • Add a copy_to attribute to each affected original field naming the combined field to populate. As each document is indexed this will cause the values to be copied into the combined field.
  • Change the 'field' of each affected faceted_search_field to the new combined field. This computes the facet response from the combined information.
  • Change the 'field' of each affected filter_fields to the new combined field. This restricts results to those matching the combined field's values when narrowing via the faceted search UI.
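The effect of the four pieces above can be sketched with a small pure-Python simulation. This is not the Toolhub code and does not talk to Elasticsearch; it only mimics how copy_to populates a combined field and how facet counts and filters then read from that field. The names merged_tool_type and annotations.tool_type, the helper functions, and the sample documents are all assumptions made up for the example.

```python
from collections import Counter

# Hypothetical copy_to rules: source field path -> combined field name.
COPY_TO = {
    ("tool_type",): "merged_tool_type",
    ("annotations", "tool_type"): "merged_tool_type",
}


def get_path(doc, path):
    """Return the value at a nested path, or None if any key is missing."""
    for key in path:
        if not isinstance(doc, dict) or key not in doc:
            return None
        doc = doc[key]
    return doc


def index_document(source):
    """Mimic indexing: copy each source value into its combined field."""
    indexed = {"_source": source, "fields": {}}
    for path, target in COPY_TO.items():
        value = get_path(source, path)
        if value is None:
            continue
        values = value if isinstance(value, list) else [value]
        # A set keeps one entry per distinct value per document,
        # like a terms aggregation counting documents, not occurrences.
        indexed["fields"].setdefault(target, set()).update(values)
    return indexed


def facet_counts(index, field):
    """Count documents per value of the combined field."""
    counts = Counter()
    for doc in index:
        counts.update(doc["fields"].get(field, ()))
    return counts


def filter_by(index, field, value):
    """Return names of documents whose combined field contains value."""
    return [d["_source"]["name"] for d in index if value in d["fields"].get(field, ())]


docs = [
    # Legacy record: the type is known only from the annotation layer.
    {"name": "legacy-tool", "tool_type": None, "annotations": {"tool_type": "web app"}},
    {"name": "modern-tool", "tool_type": "web app", "annotations": {}},
    {"name": "bot-tool", "tool_type": "bot", "annotations": {"tool_type": "bot"}},
]
index = [index_document(d) for d in docs]
counts = facet_counts(index, "merged_tool_type")
web_apps = filter_by(index, "merged_tool_type", "web app")
```

In this sketch the legacy record contributes to the "web app" facet even though its core tool_type is empty, which is the backfill behaviour the task description asks for.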

Change 770636 had a related patch set uploaded (by BryanDavis; author: Bryan Davis):

[wikimedia/toolhub@main] annotations: Add 14 fields matching core toolinfo fields

https://gerrit.wikimedia.org/r/770636

Change 770637 had a related patch set uploaded (by BryanDavis; author: Bryan Davis):

[wikimedia/toolhub@main] search: Support facets spanning core and annotation fields

https://gerrit.wikimedia.org/r/770637

Change 770636 merged by jenkins-bot:

[wikimedia/toolhub@main] annotations: Add 14 fields matching core toolinfo fields

https://gerrit.wikimedia.org/r/770636

Change 770637 merged by jenkins-bot:

[wikimedia/toolhub@main] search: Support facets spanning core and annotation fields

https://gerrit.wikimedia.org/r/770637

Change 786342 had a related patch set uploaded (by BryanDavis; author: Bryan Davis):

[operations/deployment-charts@master] toolhub: Bump container version to 2022-04-21-215651-production

https://gerrit.wikimedia.org/r/786342

Change 786342 merged by jenkins-bot:

[operations/deployment-charts@master] toolhub: Bump container version to 2022-04-21-215651-production

https://gerrit.wikimedia.org/r/786342