Page MenuHomePhabricator

Send Retry-After header for readonly API errors
Open, Needs TriagePublicFeature

Description

Feature summary (what you would like to be able to do and where):
When an Action API request fails with a readonly error, and an estimate for the remaining read-only time is available (this should often be the case for Wikimedia production readonly periods), include a Retry-After header in the response headers. This would be analogous to the existing header for maxlag errors.

Use case(s) (list the steps that you performed to discover that problem, and describe the actual underlying problem which you want to solve. Do not describe only a solution):
API client libraries that automatically retry requests. (This includes @Legoktm’s mwapi for Rust and my m3api for JavaScript.)

Benefits (why should this be implemented?):
The server-side generally knows better than the client when the read-only period can be expected to be over. For Wikimedia production, I guesstimate 5 or 10 seconds would be a good value for maxlag-induced read-only periods, 30 seconds for a regular maintenance read-only (primary switchover), and 1 minute for a datacenter switch; but maybe this could also be set by the DBAs as part of configuring the specific read-only period, based on previous experience. (We don’t have to guarantee that the read-only period will be over by the time of the retry; the last datacenter switch wasn’t sub-1 minute yet, but I still think that would be an acceptable retry delay in that case, even if some clients will end up doing a second retry.)

Without this information, client libraries are left to guess what a good retry delay would be; m3api guesses 30 seconds (configurable by the library user), while mwapi has no default delay if I’m reading the code correctly.