
Proposal: add a per-service rate limit setting to API Gateway
Closed, Resolved (Public)

Description

Hi everybody,
I'm not sure whether a task already exists for this. I checked quickly and didn't find one; if there is one, please close this as a duplicate :)

If my understanding is correct, the current API-Gateway rate limits are around 500 requests/hour for anonymous users and 5000 for logged-in users, applied globally across all services. I am wondering whether we could make the rate limit configurable per service, so backend owners can pick the best values for their services without having to negotiate a compromise with other teams.

Another very nice feature would be a way to apply rate limits to a specific combination of client metadata, such as UA and IP. The use case I have in mind: a bot or a specific user generates too much traffic, and a backend service owner wants to act on it without affecting regular users, whose request flows are not hurting the service.
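To make the second idea concrete, here is a minimal sketch of a limiter keyed on a combination of client metadata. It is only an illustration: `ClientKey`, `FixedWindowLimiter`, the service name and the limit are hypothetical, and nothing here reflects how API-Gateway actually works.

```python
from collections import defaultdict
from typing import NamedTuple


class ClientKey(NamedTuple):
    """Hypothetical bucket key: one counter per (service, user agent, IP)."""
    service: str
    user_agent: str
    ip: str


class FixedWindowLimiter:
    """Counts requests per key within one window; reset() starts a new window."""

    def __init__(self, limit):
        self.limit = limit
        self.counts = defaultdict(int)

    def allow(self, key):
        # Reject once this exact combination has used up its budget.
        if self.counts[key] >= self.limit:
            return False
        self.counts[key] += 1
        return True

    def reset(self):
        self.counts.clear()


limiter = FixedWindowLimiter(limit=2)
bot = ClientKey("inference", "noisy-bot/1.0", "192.0.2.10")
human = ClientKey("inference", "Mozilla/5.0", "192.0.2.10")
```

Because the key includes the UA as well as the IP, exhausting the `bot` bucket would not affect `human`, even though both share an address.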

The ML team is more than happy to help in the development of these features if you feel that they are sound and consistent with the current API-Gateway's plans.

Event Timeline

No problem with this (if there's a use case here), but note that individual clients with an acceptable need for a higher rate limit may be promoted to higher rate limit tiers, which is a simpler measure that could be used before this gets implemented :-). According to my current understanding, there are three rate limit tiers at the moment:

  • Default (5000 requests/hour)
  • Preferred (25,000 requests/hour)
  • Internal (100,000 requests/hour)

That being said, I'm not sure whether the Machine-Learning-Team use case covers only a few specific clients (in which case promoting them to higher tiers can make sense), or whether almost all consumers of your API need limits higher than 5k reqs/hour.

Hi @Urbanecm! Thanks for the link, very interesting, I didn't know about it.

My understanding of the API-Gateway is still very high level, so I may have the wrong picture in my head. IIUC any limit/tier applies to a client making requests to any of the services behind API-Gateway; it doesn't really distinguish between use cases. For example, if we add the inference service to API-Gateway, a user can make requests to linkrecommendation, mw-api and inference regardless of the size and capabilities of their backends. I agree that we should have a high-level global limit for the API-Gateway service itself (taking priority over the rest), but it would be nice to allow backend service owners to decide the user limits for their backends.

The tiers are great, but IIUC they are about specific clients; there is nothing at the moment that would identify something like "ML traffic".
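The layering described above (a global per-client limit that takes priority, plus optional per-service limits set by backend owners) could be sketched roughly like this. The class name, the per-service value and the counting scheme are assumptions for illustration only; the global value is the default tier figure quoted earlier in this thread.

```python
from collections import Counter

GLOBAL_LIMIT = 5000  # requests/hour, default tier value quoted in this thread

# Hypothetical per-service limits chosen by backend owners; the number is made up.
SERVICE_LIMITS = {"inference": 1000}


class LayeredLimiter:
    """Per-client counters for a single window, checked global-first."""

    def __init__(self, global_limit, service_limits):
        self.global_limit = global_limit
        self.service_limits = service_limits
        self.global_counts = Counter()   # client -> requests in this window
        self.service_counts = Counter()  # (client, service) -> requests in this window

    def allow(self, client, service):
        # The global limit is checked first, so it always takes priority.
        if self.global_counts[client] >= self.global_limit:
            return False
        # Services without an entry are only bound by the global limit.
        limit = self.service_limits.get(service)
        if limit is not None and self.service_counts[(client, service)] >= limit:
            return False
        self.global_counts[client] += 1
        self.service_counts[(client, service)] += 1
        return True


limiter = LayeredLimiter(GLOBAL_LIMIT, SERVICE_LIMITS)
```

In this sketch a rejected request does not count against either budget; whether it should is a design choice the thread does not settle.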

Change 741937 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] api-gateway: allow discovery services to set custom rate limits

https://gerrit.wikimedia.org/r/741937

Change 741937 merged by jenkins-bot:

[operations/deployment-charts@master] api-gateway: allow discovery services to set custom rate limits

https://gerrit.wikimedia.org/r/741937

Change 764409 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] api-gateway: bump chart

https://gerrit.wikimedia.org/r/764409

Change 764409 merged by jenkins-bot:

[operations/deployment-charts@master] api-gateway: bump chart

https://gerrit.wikimedia.org/r/764409

Change 767070 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] api-gateway: move route_name metadata to route level

https://gerrit.wikimedia.org/r/767070

One major issue with our previous approach to this problem was the use of the metadata_key field in config, which assumes that the metadata pointed to by the provided label will be in the Envoy ratelimit format ("ratelimit": {"requests_per_unit": 5000, "unit": "HOUR"}). This is fine for the JWT data, as we quite literally encode Envoy ratelimit headers in the data of the token. However, in the case of Envoy config we just want to configure a descriptor based on an attribute in metadata. I am not fully certain whether this is currently possible in Envoy, as the docs are fairly scant. I'm going to see if there's another approach we can take.
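To illustrate the mismatch, here is a small sketch of the shape check implied above. It assumes the field names `requests_per_unit` and `unit` from Envoy's rate limit override documentation; `parse_override`, `VALID_UNITS` and the `route_name` example value are hypothetical and not taken from the actual chart.

```python
# Units this sketch accepts (assumed to mirror Envoy's RateLimitUnit values).
VALID_UNITS = {"SECOND", "MINUTE", "HOUR", "DAY"}


def parse_override(metadata, key="ratelimit"):
    """Return (requests_per_unit, unit), or None if the metadata under `key`
    is not a struct in the ratelimit override shape."""
    value = metadata.get(key)
    if not isinstance(value, dict):
        return None
    requests = value.get("requests_per_unit")
    unit = value.get("unit")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(requests, bool) or not isinstance(requests, int) or requests < 0:
        return None
    if not isinstance(unit, str) or unit not in VALID_UNITS:
        return None
    return requests, unit


# JWT-style metadata already carries the full override struct.
jwt_metadata = {"ratelimit": {"requests_per_unit": 5000, "unit": "HOUR"}}

# Route-level metadata only names the route, so there is nothing to parse as a limit.
route_metadata = {"route_name": "inference"}
```

Under these assumptions, `parse_override(jwt_metadata)` yields a usable limit while `parse_override(route_metadata)` yields `None`, which is the gap between "override struct" and "plain attribute to use as a descriptor".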

Change 809198 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] api-gateway: allow discovery services to set custom rate limits

https://gerrit.wikimedia.org/r/809198

Change 809198 merged by jenkins-bot:

[operations/deployment-charts@master] api-gateway: allow discovery services to set custom rate limits

https://gerrit.wikimedia.org/r/809198

This has been implemented and deployed in production. We currently have no services requiring their own rate limit buckets, but this can easily be configured for discovery services.

Change 767070 abandoned by Hnowlan:

[operations/deployment-charts@master] api-gateway: move route_name metadata to route level

Reason:

Incorrect fix, outdated

https://gerrit.wikimedia.org/r/767070

hnowlan claimed this task.