
[Search Update Pipeline] Fetch: Handle Timeout of AsyncWaitOperator
Closed, Resolved · Public

Description

Currently, we have timeouts at multiple levels: HTTP requests (socket write + read) and the AsyncWaitOperator wrapping those HTTP requests:

  • TransformOperator
    • AsyncWaitOperator
      • RetryingAsyncFunction
        • BypassingCirrusDocFetcher […] HttpClient
        • LagAwareRetryPredicate

Since the LagAwareRetryPredicate only caps the number of retries for late events, events that are not late are retried indefinitely until RetryingAsyncFunction times out. The timeout results in a TimeoutException which is not handled and therefore crashes the application.

AC

  • fetch_error schema: remove restriction on error_type (it only destroys information)
  • RetryingAsyncFunction class: override/implement org.wikimedia.discovery.cirrus.updater.common.graph.RetryingAsyncFunction#timeout so it completes with a FetchResult.fromError
  • ConsumerApplicationIT verifies output routed to fetch_error stream/topic
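
A minimal sketch of the second AC item, under stated assumptions. The ResultFuture and AsyncFunction interfaces below are stdlib-only stand-ins that mirror the shape of Flink's interfaces of the same names, so the example compiles without Flink on the classpath. FetchResult, its fromError(input, error) signature, RetryingFetchFunction and runTimeout are hypothetical names for illustration; the real classes in the updater may look different.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.TimeoutException;

public class TimeoutSketch {

    /** Stand-in mirroring the two completion methods of Flink's ResultFuture. */
    public interface ResultFuture<T> {
        void complete(Collection<T> result);
        void completeExceptionally(Throwable error);
    }

    /** Stand-in mirroring Flink's AsyncFunction, including its default timeout behaviour. */
    public interface AsyncFunction<IN, OUT> {
        void asyncInvoke(IN input, ResultFuture<OUT> resultFuture) throws Exception;

        // Default: fail the future, which fails the job unless the error is handled.
        default void timeout(IN input, ResultFuture<OUT> resultFuture) throws Exception {
            resultFuture.completeExceptionally(
                    new TimeoutException("Async function call has timed out."));
        }
    }

    /** Hypothetical result type; the real FetchResult.fromError signature is an assumption. */
    public static final class FetchResult {
        public final String input;
        public final Throwable error; // null on success

        private FetchResult(String input, Throwable error) {
            this.input = input;
            this.error = error;
        }

        public static FetchResult fromError(String input, Throwable error) {
            return new FetchResult(input, error);
        }

        public boolean isError() {
            return error != null;
        }
    }

    /** Hypothetical function that turns a timeout into a regular error result. */
    public static final class RetryingFetchFunction implements AsyncFunction<String, FetchResult> {
        @Override
        public void asyncInvoke(String input, ResultFuture<FetchResult> resultFuture) {
            // Fetch and retry logic omitted; in this sketch the call never completes by itself.
        }

        @Override
        public void timeout(String input, ResultFuture<FetchResult> resultFuture) {
            // Complete normally with an error result instead of completing exceptionally,
            // so downstream code can route it to the fetch_error output.
            TimeoutException cause = new TimeoutException("fetch timed out for " + input);
            resultFuture.complete(
                    Collections.singletonList(FetchResult.fromError(input, cause)));
        }
    }

    /** Simulates the operator calling timeout() and returns whatever was emitted. */
    public static List<FetchResult> runTimeout(String input) {
        List<FetchResult> emitted = new ArrayList<>();
        ResultFuture<FetchResult> future = new ResultFuture<FetchResult>() {
            @Override
            public void complete(Collection<FetchResult> result) {
                emitted.addAll(result);
            }

            @Override
            public void completeExceptionally(Throwable error) {
                throw new IllegalStateException("job would fail here", error);
            }
        };
        new RetryingFetchFunction().timeout(input, future);
        return emitted;
    }

    public static void main(String[] args) {
        List<FetchResult> emitted = runTimeout("page-1");
        System.out.println(emitted.size() + " result(s), error=" + emitted.get(0).isError());
    }
}
```

The point of the override is that the timed-out element leaves the operator as data rather than as an exception, which is what would let an integration test such as ConsumerApplicationIT observe it on the fetch_error stream.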

Event Timeline