Looking at the request durations, fetching responses for wikidata tends to be slow (median above 5s). To avoid unnecessary retries, we could increase the timeout to 7s to allow for slow responses, especially since wikidata is responsible for 60% of the non-rerender fetches.
Retry logic is spread across multiple levels: envoy, the http-client and flink's async operator. It has been disabled explicitly for the http-client, but envoy still retries on 5xx upstream responses. Since envoy's retries are not transparent to the application, we might run into timeouts and lose the actual cause (the 5xx error). Passing the header x-envoy-max-retries: 0 (see docs) stops envoy from retrying automatically.
AC:
- client: fetch timeout is 7s
- client: pass x-envoy-max-retries: 0 header
- envoy retries / retry attempts rate is zero
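
The two client-side criteria can be illustrated with a minimal sketch. It is written in Python with the standard library only and is not the actual http-client used by the job; the names FETCH_TIMEOUT_SECONDS, ENVOY_NO_RETRY_HEADERS, build_fetch_request and fetch are hypothetical and exist only for this example.

```python
import urllib.request

# Proposed fetch timeout from this task (seconds).
FETCH_TIMEOUT_SECONDS = 7.0

# Header asking envoy not to retry on the client's behalf.
ENVOY_NO_RETRY_HEADERS = {"x-envoy-max-retries": "0"}


def build_fetch_request(url: str) -> urllib.request.Request:
    """Build a GET request that carries the no-retry header."""
    return urllib.request.Request(url, headers=ENVOY_NO_RETRY_HEADERS, method="GET")


def fetch(url: str) -> bytes:
    """Fetch the URL with the proposed timeout and return the raw body.

    HTTP error statuses (including 5xx) raise urllib.error.HTTPError, so the
    caller sees the real cause instead of a timeout.
    """
    request = build_fetch_request(url)
    with urllib.request.urlopen(request, timeout=FETCH_TIMEOUT_SECONDS) as response:
        return response.read()
```

Note that in this sketch the timeout applies to blocking socket operations rather than to the total request duration; the real client may need a dedicated request-level timeout setting to match the 7s criterion exactly.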