
[EPIC] Avoid using the Elasticsearch scroll API
Open, Medium, Public

Description

As a maintainer of the search infrastructure, I want long-running maintenance tasks to be resilient to node restarts, so that they do not fail regularly.

The scroll API relies on non-persisted state held on the Elasticsearch nodes. If a node restarts, that state is lost and the maintenance task that depends on it fails.
This problem currently affects:

One solution is to move the state to the client performing the long-running task, using search_after with a sort on a stable field (the page id).
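
A minimal sketch of what client-held state could look like, not the actual maintenance code. The `search` callable, the `iter_pages` helper and the `page_id` field name are hypothetical stand-ins: `search` is assumed to send a request body to the index and return the response as a dict in the usual `hits.hits` shape, with each hit carrying its `sort` values.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

# Hypothetical transport: takes a search request body, returns the response dict.
SearchFn = Callable[[Dict[str, Any]], Dict[str, Any]]


def iter_pages(
    search: SearchFn,
    batch_size: int = 100,
    resume_after: Optional[List[Any]] = None,
) -> Iterator[Dict[str, Any]]:
    """Yield every hit in page-id order, keeping the cursor on the client."""
    cursor = resume_after  # the only state; nothing is kept on the nodes
    while True:
        body: Dict[str, Any] = {
            "size": batch_size,
            "query": {"match_all": {}},
            # Assumed field name; it must be stable and unique per document.
            "sort": [{"page_id": "asc"}],
        }
        if cursor is not None:
            body["search_after"] = cursor
        hits = search(body)["hits"]["hits"]
        if not hits:
            return
        for hit in hits:
            yield hit
        # Sort values of the last hit become the cursor for the next request.
        cursor = hits[-1]["sort"]
```

Because the cursor is just the sort values of the last processed hit, a task could persist it and, after a failed request or a node restart, call `iter_pages(search, resume_after=saved_cursor)` to continue from where it stopped instead of starting over.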

AC:

  • the scroll API is no longer used by long running tasks
  • a node crash does not cause a long running task to fail