"As a Systems Administrator, I want to limit the rate of API calls per client developer, so I can allocate network and computing resources."
The Systems Administrator should be able to configure the following options:
- Categories of client ID (OAuth), on a per-installation basis, such as "beginner", "trusted", "own", "partner"
- API rate limits per category. A rate limit consists of a number of API calls (e.g. 100000) and a time duration in seconds (e.g. 86400 for one day). A rate limit of null means there is no limit.
- A global rate limit (number of calls, duration in seconds) for all unauthenticated API calls.
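As a rough illustration of the options above, the configuration could be modelled as follows. This is a sketch only: the names `RateLimit`, `CATEGORY_LIMITS`, `ANONYMOUS_LIMIT` and `limit_for` are hypothetical, and every number other than the 100000-per-86400-seconds example is a made-up placeholder.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class RateLimit:
    """A number of API calls allowed within a duration given in seconds."""
    calls: int
    duration_seconds: int


# Hypothetical per-installation settings. None stands for the "null"
# rate limit, i.e. no limit for that category.
CATEGORY_LIMITS: Dict[str, Optional[RateLimit]] = {
    "beginner": RateLimit(1000, 3600),      # placeholder values
    "trusted": RateLimit(100000, 86400),    # the example from the text
    "partner": RateLimit(1000000, 86400),   # placeholder values
    "own": None,                            # unlimited
}

# Hypothetical global limit for all unauthenticated API calls.
ANONYMOUS_LIMIT = RateLimit(500, 3600)


def limit_for(category: Optional[str]) -> Optional[RateLimit]:
    """Return the limit for a client category, or the global limit
    when the request carries no client ID (category is None)."""
    if category is None:
        return ANONYMOUS_LIMIT
    # Raises KeyError for a category that is not configured.
    return CATEGORY_LIMITS[category]
```

Here `limit_for("own")` returns `None` (no limit) and `limit_for(None)` returns the global unauthenticated limit.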
The API infrastructure should:
- Identify the client ID, if any, and client category of an API request
- Use that information to determine the rate limit
- Get the count of API calls made so far in the current time period (e.g., how many API calls have been made this hour)
- If the count of API calls is less than the limit, execute the handler
- Otherwise, return HTTP status code 429 with rate limit headers such as those described in https://tools.ietf.org/html/draft-polli-ratelimit-headers-00
- When necessary to prevent pool exhaustion, the API infrastructure may introduce a delay in handling requests to manage the rate. For example, if there are only 60 requests left in a one-minute time period, the router might delay each request by 1 second. (This is an imperfect tool for public web requests; there are many ways to exhaust a pool even when requests are delayed.)
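The steps above could be sketched roughly as below. This is one possible reading, not a proposed implementation: it assumes a fixed-window counter held in process memory (a real deployment would presumably need a shared store), the names `FixedWindowLimiter`, `check` and `pacing_delay` are hypothetical, and the header names are taken from the linked draft and should be checked against whichever draft version is targeted.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class FixedWindowLimiter:
    """Counts calls per key in fixed windows of `duration` seconds."""

    def __init__(self, clock: Callable[[], float] = time.time):
        self._clock = clock
        # (key, window index) -> calls counted so far.
        # Old windows are never purged in this sketch.
        self._counts: Dict[Tuple[str, int], int] = {}

    def check(self, key: str, calls: Optional[int],
              duration: int) -> Tuple[int, Dict[str, str]]:
        """Return (HTTP status, rate limit headers) for one request.

        `key` is the client ID, or a shared key for unauthenticated calls.
        `calls` of None means no limit applies.
        """
        if calls is None:
            return 200, {}
        now = self._clock()
        window = int(now // duration)
        seconds_left = int((window + 1) * duration - now)
        count = self._counts.get((key, window), 0)
        if count < calls:
            # Under the limit: count this call and run the handler.
            self._counts[(key, window)] = count + 1
            status, remaining = 200, calls - count - 1
        else:
            # Limit reached: reject without counting the call.
            status, remaining = 429, 0
        headers = {
            "RateLimit-Limit": str(calls),
            "RateLimit-Remaining": str(remaining),
            "RateLimit-Reset": str(seconds_left),
        }
        return status, headers


def pacing_delay(remaining: int, seconds_left: float) -> float:
    """Spread the remaining calls evenly over the rest of the window.

    One interpretation of the delay example: 60 calls left with
    60 seconds left gives a 1 second delay per request.
    """
    if remaining <= 0:
        return seconds_left
    return seconds_left / remaining
```

A router would call `check` with the client ID and the limit for its category, run the handler on status 200, and return the 429 response with the headers otherwise. Whether and when to apply `pacing_delay` is left open here, since the text only says a delay may be introduced when necessary.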
Some API services count different API calls differently for API rate limiting; for example, an edit endpoint might count 2x or 5x more than a read endpoint. For the initial version, we'll count all API endpoints the same; 1 HTTP request = 1 API call.
Some API services use different API rate limits based on whether the request uses client-only authentication (like for a bot) or end-user authentication. For this version, we'll count all API calls for a client ID together, even if there are different user accounts associated.