
API proxy.go: efficient s3 request management to handle rate limiting issues
Closed, Invalid · Public · 3 Estimated Story Points

Description

The main API proxy.go NewGetLargeEntities s3 proxy function for Articles handles multiple projects concurrently by spawning mdl.Limit (capped to 10) worker goroutines, each of which fires an s3 GetObjectWithContext request.
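The capped worker-pool pattern described above can be sketched as follows. This is an illustrative, self-contained version, not the actual NewGetLargeEntities code: `fetchAll` and the project names are hypothetical, and the s3 call is replaced by a stub.

```go
package main

import (
	"fmt"
	"sync"
)

// fetchAll is a hypothetical sketch of the pattern: process all projects
// concurrently, but keep at most `limit` goroutines in flight at once
// (the task describes mdl.Limit, capped to 10).
func fetchAll(projects []string, limit int) []string {
	sem := make(chan struct{}, limit) // buffered channel used as a semaphore
	var mu sync.Mutex
	var wg sync.WaitGroup
	results := make([]string, 0, len(projects))

	for _, p := range projects {
		wg.Add(1)
		sem <- struct{}{} // acquire a worker slot
		go func(project string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			// The real worker fires s3 GetObjectWithContext here;
			// this sketch just records the project name.
			mu.Lock()
			results = append(results, project)
			mu.Unlock()
		}(p)
	}
	wg.Wait()
	return results
}

func main() {
	got := fetchAll([]string{"enwiki", "dewiki", "frwiki"}, 10)
	fmt.Println(len(got))
}
```

Because each worker retries its own request independently inside this loop, there is no shared view of throttling, which is the amplification problem described below.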
In case of SlowDown errors from s3, the v1 AWS SDK (now deprecated) uses the DefaultRetryer, which:
  • retries up to 3 attempts (configurable)
  • applies exponential backoff between retries: a base delay plus jitter (a randomized delay)

After 3 retries, it returns the error to the worker, which logs it and continues to the next project. All workers therefore retry independently and unintentionally amplify the throttling rather than mitigating it.
This is a major drawback for the on-demand API and inhibits its ability to operate reliably at scale.

Dynamically manage the s3 request rate using golang.org/x/time/rate: before making an s3 request, each worker acquires a permit from a shared limiter, which preserves parallelism while preventing burst spikes when s3 starts rate limiting.

TO DO

  • disable retries for this specific case
  • recreate (replay) the scenario with the info from the case
  • Acceptance criteria
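The first TO DO item (disabling the SDK's own retries) could look roughly like the following configuration sketch for the v1 AWS SDK. The region is a hypothetical placeholder; setting MaxRetries to 0 turns off the DefaultRetryer's automatic retries so throttling is handled only by the external rate limiter.

```go
package main

import (
	"fmt"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3"
)

func main() {
	// Sketch only: MaxRetries: 0 disables the v1 SDK's built-in retries
	// for this client, so a SlowDown error surfaces immediately to the
	// worker instead of being retried 3 times per worker.
	sess := session.Must(session.NewSession(&aws.Config{
		Region:     aws.String("us-east-1"), // hypothetical region
		MaxRetries: aws.Int(0),
	}))
	svc := s3.New(sess)
	fmt.Println(svc.ClientInfo.ServiceName)
}
```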