
Implement the enrichment function
Closed, ResolvedPublic

Description

The indexable documents have to be fetched from the CirrusSearch doc API using the Flink AsyncIO operator.

AC:

  • The CirrusSearch doc API is called
  • Failures are retried 3 to 4 times
  • Missing revisions are identified and retried as long as the event is younger than 10 seconds (eventual consistency heuristic).
  • Error reasons are captured in a side output
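
The retry rules in the AC above can be sketched as plain decision logic. This is a minimal synchronous Python illustration, not the Flink AsyncIO implementation from the patch: `enrich`, `Result`, `RevisionMissing`, the `fetch` callable, the `timestamp` field, and the 0.5 s retry delay are all hypothetical names and values chosen for the example. The retry count is fixed at 3 here, although the task allows 3 to 4.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional

MAX_FAILURE_RETRIES = 3          # the task says "3 to 4"; 3 is an arbitrary pick
CONSISTENCY_WINDOW_SECONDS = 10  # eventual consistency heuristic from the AC
RETRY_DELAY_SECONDS = 0.5        # assumption; the task does not specify a delay


class RevisionMissing(Exception):
    """Hypothetical signal that the doc API does not know the revision yet."""


@dataclass
class Result:
    document: Optional[dict] = None
    error_reason: Optional[str] = None  # set when the event goes to the side output
    attempts: int = 0


def enrich(
    event: dict,
    fetch: Callable[[dict], dict],
    now: Callable[[], float] = time.time,
    sleep: Callable[[float], None] = time.sleep,
) -> Result:
    """Fetch the document for an event, applying the two retry rules."""
    failures = 0
    attempts = 0
    while True:
        attempts += 1
        try:
            return Result(document=fetch(event), attempts=attempts)
        except RevisionMissing:
            # Retry only while the event is younger than the consistency window.
            if now() - event["timestamp"] >= CONSISTENCY_WINDOW_SECONDS:
                return Result(error_reason="revision missing", attempts=attempts)
        except Exception as exc:
            # Any other failure: give up once the retry budget is used.
            failures += 1
            if failures > MAX_FAILURE_RETRIES:
                return Result(error_reason=f"fetch failed: {exc}", attempts=attempts)
        sleep(RETRY_DELAY_SECONDS)
```

A caller would emit `result.document` downstream when it is set and route `result.error_reason` to the side output otherwise. Keeping the two retry rules in separate `except` branches means a missing revision does not consume the failure budget, which is one possible reading of the AC.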

Event Timeline

pfischer changed the task status from Open to In Progress. Sep 14 2022, 7:12 AM
pfischer claimed this task.
pfischer triaged this task as Medium priority.

Change 832379 had a related patch set uploaded (by Peter Fischer; author: Peter Fischer):

[search/cirrus-streaming-updater@master] T317611 implement enrichment function

https://gerrit.wikimedia.org/r/832379

Change 842855 had a related patch set uploaded (by DCausse; author: Peter Fischer):

[search/cirrus-streaming-updater@master] Pass enriched revision-create to updater

https://gerrit.wikimedia.org/r/842855

pfischer reassigned this task from pfischer to Gehel.

Follow-up work (moving the schema out of the producer into the schema registry) will be done as part of T317202.

Change 842855 merged by jenkins-bot:

[search/cirrus-streaming-updater@master] Pass enriched revision-create to updater

https://gerrit.wikimedia.org/r/842855