
Ship query completion indices from analytics to prod clusters
Open, Medium | Public | 8 Estimated Story Points

Description

As a developer, I want query completion indices in the production search clusters so that query completions can be provided to end users.

Essential parts:

  • Package up query completions into esbulk format. This may be part of the candidate generation script or something more general and independent (hql_to_esbulk?).
  • Airflow DAG to schedule generating completions and to use the Swift integration to make them available to production. For now this can be hardcoded for a single wiki, commonswiki.
  • mjolnir bulk daemon integration for maintaining indices in the production clusters. This may only require updating the configuration for a new Swift container, reusing the ImportAndPromote action written for glent, but this should be verified. I suspect that limiting the use case to commonswiki will make this straightforward, but more effort will be required if we want to move forward with more wikis.
  • A common query-clicks dataset is used as the base for this pipeline.
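
A rough sketch of the packaging step, assuming "esbulk format" here means the newline-delimited JSON accepted by the Elasticsearch bulk API (an action line followed by a source line for each document). The row fields `query` and `score`, the function names, and the index name `commonswiki_completion` are hypothetical placeholders, not taken from the existing pipeline:

```python
import json
from typing import Iterable, Iterator


def to_esbulk_lines(rows: Iterable[dict], index_name: str) -> Iterator[str]:
    """Yield bulk-format lines: one action line, then one source line per row."""
    for row in rows:
        # Hypothetical choice: use the query string itself as the document id.
        action = {"index": {"_index": index_name, "_id": row["query"]}}
        # Hypothetical document shape; the real completion fields may differ.
        source = {"query": row["query"], "score": row["score"]}
        yield json.dumps(action, sort_keys=True)
        yield json.dumps(source, sort_keys=True)


def to_esbulk(rows: Iterable[dict], index_name: str) -> str:
    """Join the lines into one payload; the bulk API expects a trailing newline."""
    return "".join(line + "\n" for line in to_esbulk_lines(rows, index_name))


if __name__ == "__main__":
    sample = [{"query": "cat", "score": 0.9}, {"query": "dog", "score": 0.5}]
    print(to_esbulk(sample, "commonswiki_completion"), end="")
```

In practice the rows would come from the query-clicks dataset (for example via an HQL query) rather than an in-memory list, and the output would be written to a file for upload to Swift.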

AC:

  • Query completion indices are available on the production Elasticsearch clusters.

Event Timeline

CBogen set the point value for this task to 8. (Aug 24 2020, 5:21 PM)
Gehel triaged this task as High priority. (Oct 28 2020, 1:28 PM)
Gehel lowered the priority of this task from High to Medium. (Mar 23 2022, 8:45 PM)