System Administrator limits rate of MediaWiki REST API calls per client
Open, Medium, Public

Description

"As a Systems Administrator, I want to limit the rate of API calls per client developer, so I can allocate network and computing resources."

The System Administrator should be able to configure the following options:

  • Categories of client ID (OAuth), on a per-installation basis, like "beginner", "trusted", "own", "partner"
  • API call rate limits per category. A rate limit consists of a number of API calls (e.g. 100000) and a time duration in seconds (e.g. 86400 for one day). A rate limit of null means there is no limit.
  • A global rate limit (number of calls, duration in seconds) for all unauthenticated API calls.
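
As a rough illustration of the options above, the configuration could be modelled along these lines. This is only a sketch: the names `RateLimit` and `RateLimitConfig` and all of the numbers are hypothetical and are not existing MediaWiki settings.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class RateLimit:
    """At most `calls` API calls per `seconds` seconds."""
    calls: int
    seconds: int


@dataclass
class RateLimitConfig:
    # Category name -> limit; None (null) means no limit for that category.
    categories: Dict[str, Optional[RateLimit]] = field(default_factory=dict)
    # One global limit shared by all unauthenticated API calls.
    unauthenticated: Optional[RateLimit] = None


# Hypothetical per-installation values, reusing the example category names.
config = RateLimitConfig(
    categories={
        "beginner": RateLimit(calls=1000, seconds=3600),
        "trusted": RateLimit(calls=100000, seconds=86400),
        "partner": RateLimit(calls=1000000, seconds=86400),
        "own": None,  # no limit
    },
    unauthenticated=RateLimit(calls=500, seconds=3600),
)
```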

The API infrastructure should:

  • Identify the client ID, if any, and client category of an API request
  • Use that information to determine the rate limit
  • Get the count of API calls so far based on the time period (e.g., how many API calls have been made this hour)
  • If the count of API calls is less than the limit, execute the handler
  • Otherwise, return HTTP status code 429 (Too Many Requests) with rate limit headers like those described in https://tools.ietf.org/html/draft-polli-ratelimit-headers-00
  • When necessary to prevent pool exhaustion, the API infrastructure may introduce a delay in handling requests to manage the rate. For example, if there are only 60 requests left in a one-minute time period, the router might delay each request by 1 second. (This is an imperfect tool for public web requests, and there are lots of ways to exhaust a pool even if requests are delayed.)
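
The steps in this list could be sketched roughly as follows. Everything here is an assumption made for illustration: the fixed-window counting strategy, the in-memory dictionary, the `LIMITS` table, and the function names `handle` and `pacing_delay` are hypothetical. The header names follow the linked draft. A real deployment would need a shared counter store rather than a process-local dict.

```python
import time
from typing import Callable, Dict, Optional, Tuple

# Hypothetical limits: category -> (max calls, window seconds); None = no limit.
LIMITS: Dict[str, Optional[Tuple[int, int]]] = {
    "trusted": (100000, 86400),
    "own": None,
    "unauthenticated": (60, 60),
}

# (bucket, window index) -> calls counted so far. In-memory for illustration only.
_counters: Dict[Tuple[str, int], int] = {}


def handle(client_id: Optional[str], category: Optional[str],
           handler: Callable[[], str], now: Optional[float] = None):
    """Return (status, headers, body) for one request."""
    now = time.time() if now is None else now
    if client_id is None:
        # All unauthenticated calls share one global bucket and limit.
        bucket, category = "anon", "unauthenticated"
    else:
        bucket = "client:" + client_id
    limit = LIMITS[category]  # unknown categories raise KeyError in this sketch
    if limit is None:
        return 200, {}, handler()

    max_calls, window = limit
    index = int(now // window)  # which fixed window `now` falls into
    count = _counters.get((bucket, index), 0)
    headers = {
        "RateLimit-Limit": str(max_calls),
        "RateLimit-Remaining": str(max(max_calls - count - 1, 0)),
        # Whole seconds until the current window ends.
        "RateLimit-Reset": str(int((index + 1) * window - now)),
    }
    if count < max_calls:
        _counters[(bucket, index)] = count + 1
        return 200, headers, handler()
    return 429, headers, None


def pacing_delay(remaining_calls: int, seconds_left: float) -> float:
    """Spread the remaining calls evenly over the time left in the window."""
    if remaining_calls <= 0:
        return seconds_left
    return seconds_left / remaining_calls
```

With 60 calls and 60 seconds left, `pacing_delay(60, 60)` gives 1 second per request, which is one way to read the example in the last bullet.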

Some API services count different API calls differently for API rate limiting; for example, an edit endpoint might count 2x or 5x more than a read endpoint. For the initial version, we'll count all API endpoints the same; 1 HTTP request = 1 API call.
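
To make the difference concrete, here is a minimal sketch of the two counting schemes. The cost table and the `request_cost` function are hypothetical; only the unweighted path reflects what is proposed for the initial version.

```python
# Hypothetical cost table for a possible later version: edits weigh more than reads.
WEIGHTED_COSTS = {"read": 1, "edit": 5}


def request_cost(endpoint_kind: str, weighted: bool = False) -> int:
    """Number of API calls that a single HTTP request counts as."""
    if not weighted:
        return 1  # initial version: 1 HTTP request = 1 API call
    # Endpoint kinds missing from the table fall back to a cost of 1.
    return WEIGHTED_COSTS.get(endpoint_kind, 1)
```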

Some API services use different API rate limits based on whether the request uses client-only authentication (like for a bot) or end-user authentication. For this version, we'll count all API calls for a client ID together, even if there are different user accounts associated.
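
In other words, the counter would be keyed on the client ID alone. A small sketch, with a hypothetical `record_call` helper and a made-up client ID:

```python
from collections import Counter
from typing import Optional

# Calls counted so far, keyed by client ID only.
calls: Counter = Counter()


def record_call(client_id: str, user: Optional[str] = None) -> int:
    """Count one call against the client ID; `user` is ignored in this version."""
    calls[client_id] += 1
    return calls[client_id]


record_call("example-app")                # client-only authentication (e.g. a bot)
record_call("example-app", user="Alice")  # end-user authentication
record_call("example-app", user="Bob")    # different user, same client bucket
```

All three calls land in the same bucket, so a per-user breakdown would need a different key.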

Event Timeline

EvanProdromou renamed this task from Rate limits to REST API Rate Limiting. Apr 17 2019, 12:02 AM
EvanProdromou created this task.
Anomie added a subscriber: Anomie. Apr 17 2019, 1:46 AM

This sounds like you're thinking of it from a very corporate mindset, versus MediaWiki's traditional rate limiting that generally works on a feature basis (e.g. edits per time) per user account rather than trying to limit a particular "client".

Tgr added a subscriber: Tgr. Apr 19 2019, 5:04 PM

MediaWiki rate limiting is based on user authentication, and often happens deep in the application (for example there are different rate limits for rendering a thumbnail in a standard size and a nonstandard one; the API does not know what the standard sizes are) and is only communicated to the controller in the form of error messages.

In theory there are use cases where user identity and client identity are not the same (more or less the things we use OAuth for). In practice I'm not sure those use cases have any overlap with the use cases where there's a need for rate limiting.

Tgr added a comment. Edited Apr 24 2019, 8:46 PM

MediaWiki does lots of tricky rate / volume / timing management things: Throttler/User::pingLimiter, PoolCounter, ChronologyProtector, and probably others. At some point we should probably consider if/how the API framework needs to interact with any of those.

eprodromou renamed this task from REST API Rate Limiting to System Administrator limits rate of API calls per client. Nov 6 2019, 7:22 PM
eprodromou triaged this task as Medium priority.
eprodromou updated the task description.
Anomie added a comment. Nov 6 2019, 9:16 PM

You should really look into the "prior art" before getting too deep in trying to define this from a technical perspective. Much of what you've written here seems similar to but not entirely compatible with the existing $wgRateLimits/User::pingLimiter(), with perhaps a dash of PoolCounter. If you were to give this task to a contractor, they'd likely wind up reinventing too much just as happened with the OAuth 2 project.

Also you're wanting to break the abstraction that OAuth currently sits behind, to make MediaWiki core depend unnecessarily on a specific extension, and by implication to shove the (probably vast majority of) requests not using OAuth 2 into the "unauthenticated" bucket. As I said in our meeting yesterday, conflating OAuth clients and "API keys" is probably not the best way to go about things.

Tgr renamed this task from System Administrator limits rate of API calls per client to System Administrator limits rate of MediaWiki REST API calls per client. Nov 7 2019, 12:32 AM