
Generic strategy to deal with high volume / expensive traffic from cloud providers
Open, Needs Triage, Public

Description

We regularly see large traffic increases from AWS (and, more rarely, from other cloud providers). This is particularly problematic for services that are somewhat expensive, for example full text search or SPARQL queries (but other services might be impacted in similar ways).

As an example, we saw full text search traffic double, almost overnight, at the end of December (see T326757). Most of this increase can be attributed to traffic coming from AWS.

Cloud providers make it very easy to overwhelm our services, which impacts our ability to serve other requests. While our services are meant to be freely available for all purposes, we need to protect their stability and ensure equitable access for all.

So far, for both Search and WDQS, the Search Platform team has been dealing with those traffic surges inside our applications (with dedicated pool counters for Search, or with temporary bans of traffic for WDQS). It seems that a more generic technical solution might be needed, along with a more generic policy on how we want to deal with those traffic surges.
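
To illustrate the application-level approach described above, here is a minimal sketch of capping concurrency per traffic class, in the spirit of a dedicated pool counter. It is not how the actual pool counters for Search are implemented. The names `CLOUD_RANGES`, `Pool`, `classify`, and `handle_search`, the pool sizes, and the 429 response are all assumptions made for illustration, and the CIDR range is an RFC 5737 documentation range rather than a real cloud provider range.

```python
import ipaddress
import threading
from contextlib import contextmanager

# Hypothetical stand-in for a cloud provider's published IP ranges.
# 203.0.113.0/24 is an RFC 5737 documentation range, not a real AWS range.
CLOUD_RANGES = [ipaddress.ip_network("203.0.113.0/24")]


class PoolRejected(Exception):
    """Raised when a pool has no free slot."""


class Pool:
    """Caps the number of requests that may run concurrently in this pool."""

    def __init__(self, max_concurrent):
        self._slots = threading.BoundedSemaphore(max_concurrent)

    @contextmanager
    def slot(self):
        # Non-blocking acquire: reject immediately instead of queueing.
        if not self._slots.acquire(blocking=False):
            raise PoolRejected()
        try:
            yield
        finally:
            self._slots.release()


def classify(client_ip):
    """Return the traffic class ("cloud" or "default") for a client IP."""
    addr = ipaddress.ip_address(client_ip)
    return "cloud" if any(addr in net for net in CLOUD_RANGES) else "default"


# Illustrative limits only: cloud traffic gets a much smaller pool.
POOLS = {"cloud": Pool(2), "default": Pool(50)}


def handle_search(client_ip, run_query):
    """Run the query if the caller's pool has a free slot, else return 429."""
    try:
        with POOLS[classify(client_ip)].slot():
            return 200, run_query()
    except PoolRejected:
        return 429, None
```

With this shape, a surge from the "cloud" class can only occupy its own small pool, so requests in the "default" class keep being served. The same classification step could equally live at the edge rather than in each application, which is the choice this task asks to decide.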

Acceptance criteria:

  • general guidance on how to deal with traffic surges from cloud providers
  • decision on whether we want a generic solution to manage traffic surges from cloud providers, or whether we want to deal with those at the application level
