To avoid retries amplifying overload situations, we should adhere to the following rules in a client-server pair:
1) Server request timeouts are set (slightly) **shorter** than client timeouts. Server requests truly abort when the timeout is reached.
2) Servers respond with a `503` status on timeout. If a retry is permissible, the retry delay is specified in a header like `Retry-After: 120`.
3) Clients follow HTTP semantics when receiving a response with status `503`: Only retry if `Retry-After` is specified, respecting the delay.
With multiple layered servers, this works out to a staggering of timeouts, with the lowest level using the shortest possible timeouts. By waiting for the server response, clients can check the status for a 503 response, and avoid retrying altogether.