
I am hitting a rate limit on a REST API endpoint
Open, Stalled, Needs Triage · Public

Description

I have reported this previously here: https://www.mediawiki.org/wiki/Topic:Wuw1yi6lcck2fujy

I am trying to get rendered HTML for a bunch of pages. I am pretty sure I am limiting myself to 200 requests per second to the page/html API endpoint, and I have a User-Agent set, but after a few minutes I start getting 429 responses. Have I misunderstood something about the rate limit for that API endpoint? I am making those 200 requests in parallel, though.

I tested with 100 requests per second as well. I managed to make 45852 requests in 8 minutes and after that they started failing with 429. This is a bit less than 100 requests per second.
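For illustration, my client-side throttling looks roughly like the sketch below (a minimal sketch rather than my actual code; it assumes aiohttp, and the contact address and page titles are placeholders):

```python
import asyncio
import aiohttp

USER_AGENT = "html-fetcher/0.1 (contact: example@example.com)"  # placeholder contact
RATE = 200          # target requests per second
CONCURRENCY = 200   # requests in flight at once

async def fetch_html(session, semaphore, title):
    # REST API endpoint that returns rendered HTML for a page.
    url = f"https://en.wikipedia.org/api/rest_v1/page/html/{title}"
    async with semaphore:
        async with session.get(url) as response:
            if response.status == 429:
                return None  # this is what starts happening after a few minutes
            return await response.text()

async def main(titles):
    semaphore = asyncio.Semaphore(CONCURRENCY)
    async with aiohttp.ClientSession(headers={"User-Agent": USER_AGENT}) as session:
        tasks = []
        for i, title in enumerate(titles):
            # Naive pacing: start at most RATE new requests per second.
            if i and i % RATE == 0:
                await asyncio.sleep(1)
            tasks.append(asyncio.create_task(fetch_html(session, semaphore, title)))
        return await asyncio.gather(*tasks)

# asyncio.run(main(["Earth", "Moon", "Sun"]))  # hypothetical list of page titles
```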

Example response (when using a 200 requests per second limit on my side):

If you report this error to the Wikimedia System Administrators, please include the details below.

Request from IP via cp3056 cp3056, Varnish XID 528119086

Upstream caches: cp3056 int

Error: 429, Too Many Requests at Wed, 04 May 2022 16:48:01 GMT

I am sad that the expected rate limit from hyperswitch does not get reported: https://github.com/wikimedia/hyperswitch/blob/2d4480149ae4d1969eec83063ddec05c9021574d/lib/filters/ratelimit_route.js#L32-L37

Event Timeline

The limit you're hitting is an intentional one, from this block in our edge code:

https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/refs/heads/production/modules/varnish/templates/text-frontend.inc.vcl.erb#431

We generally try to limit excessive impact on our infrastructure from heavy use-cases like these, in order to preserve its functionality for the many human readers. There may be other ways to accomplish whatever it is you're trying to accomplish, including bulk downloads linked from https://dumps.wikimedia.org/ and/or the commercial API at https://enterprise.wikimedia.com/ . Or you can stick with what you're doing, but tune your request rate to stay within the bounds and avoid the 429s.

Hm, but the documentation for the REST API says I can use 200 requests per second? https://en.wikipedia.org/api/rest_v1/

> Limit your clients to no more than 200 requests/s to this API. Each API endpoint's documentation may detail more specific usage limits.

So why am I hitting the rate limit if I am doing what is documented?

Sadly, bulk downloads do not have HTML dumps, and Enterprise dumps do not offer them for template/module documentation (only for articles, categories, and files). Also, there are no Enterprise dumps for Wikimedia Commons.

> Hm, but the documentation for the REST API says I can use 200 requests per second? https://en.wikipedia.org/api/rest_v1/

>> Limit your clients to no more than 200 requests/s to this API. Each API endpoint's documentation may detail more specific usage limits.

> So why am I hitting the rate limit if I am doing what is documented?

Because our edge traffic code enforces a stricter limit of ~100/s (for responses that aren't frontend cache hits due to popularity), before the requests ever get to the Restbase service.

> Sadly, bulk downloads do not have HTML dumps, and Enterprise dumps do not offer them for template/module documentation (only for articles, categories, and files). Also, there are no Enterprise dumps for Wikimedia Commons.

Maybe if you describe what the overall scope/aim/intent of your project is, someone in another team or the community can help guide you towards a solution that will work better?

This comment was removed by Mitar.

> Because our edge traffic code enforces a stricter limit of ~100/s (for responses that aren't frontend cache hits due to popularity), before the requests ever get to the Restbase service.

I understand now. But that documentation (or the Varnish configuration) should be updated. As a user of the REST API, I would just like to know what I should configure. In both places where this is publicly documented, it says 200 requests per second.

Do you agree then that it should be updated? Either the documentation or the Varnish configuration for the REST API should be updated. My understanding was that the whole point of the REST API was that it is a simpler API which can, because of that, have a higher rate limit. That is what the documentation says:

> Focused on high-volume use cases, it tightly integrates with Wikimedia's globally distributed caching infrastructure. As a result, API users benefit from reduced latencies and support for high request volumes.

So I am a bit surprised that the REST API has the same rate limit as w/api.php?action=query, which in my understanding has a "one request should finish before the next request starts" rate limit policy. But looking at the Varnish configuration, it has the same rate limit as the REST API (and not a 20/s rate limit, as I would have assumed). Does this mean I can also call w/api.php?action=query with 100 requests per second?

> Maybe if you describe what the overall scope/aim/intent of your project is, someone in another team or the community can help guide you towards a solution that will work better?

No worries. The available rate limits are enough for me and my use case. I am primarily here to report an issue with the documentation or the Varnish configuration.

>> Because our edge traffic code enforces a stricter limit of ~100/s (for responses that aren't frontend cache hits due to popularity), before the requests ever get to the Restbase service.

> I understand now. But that documentation (or the Varnish configuration) should be updated. As a user of the REST API, I would just like to know what I should configure. In both places where this is publicly documented, it says 200 requests per second.

> Do you agree then that it should be updated? Either the documentation or the Varnish configuration for the REST API should be updated. My understanding was that the whole point of the REST API was that it is a simpler API which can, because of that, have a higher rate limit. That is what the documentation says:

>> Focused on high-volume use cases, it tightly integrates with Wikimedia's globally distributed caching infrastructure. As a result, API users benefit from reduced latencies and support for high request volumes.

My take is that the 200 req/s is aspirational and comes with a big asterisk that the real rate limit is constantly fluid and not really in the hands of the operators of the REST API. Rate limits can be applied to misbehaving user-agents, IP ranges, cloud providers, etc. depending on current situations, attacks or resources. Most of that is controlled by the SRE team at a level in front of the REST API, since the frontend caching layer is a shared resource across everything.

Also, in my experience some of the specific REST API endpoints, like the /transform/ ones, seem to have lower rate limits that aren't clearly documented, so really any concurrent client needs to be able to handle 429s and back off appropriately.

I'm a bit hesitant to boldly add this onto the wiki page just because sections like "Terms and conditions" seem very legalish. But I do agree that this could be better documented.

> So I am a bit surprised that the REST API has the same rate limit as w/api.php?action=query, which in my understanding has a "one request should finish before the next request starts" rate limit policy. But looking at the Varnish configuration, it has the same rate limit as the REST API (and not a 20/s rate limit, as I would have assumed). Does this mean I can also call w/api.php?action=query with 100 requests per second?

The problem (also a feature) with the Action API (aka api.php) is that depending on the specified query parameters your request could be incredibly cheap or very expensive. E.g. action=query&titles=Foobar is cheap, while action=query&titles=Foobar&prop=revisions&rvprop=content is expensive. The recommendation to make requests in series rather than in parallel will all but guarantee that you're not going too fast and causing an overload. Because the Action API supports batching, there's not much advantage to running requests in parallel. But e.g. a web tool that makes individual requests based on user input/action might end up making multiple different requests at the same time, and that's generally permitted.
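To illustrate the batching point, fetching the latest revision content for many pages can be done in series with batched titles instead of one request per title (a minimal sketch, assuming the requests library; the titles and contact address are placeholders):

```python
import requests

API = "https://en.wikipedia.org/w/api.php"
HEADERS = {"User-Agent": "my-tool/0.1 (contact: example@example.com)"}  # placeholder
titles = ["Foo", "Bar", "Baz", "Quux"]  # placeholder titles

# One batched request for up to 50 titles replaces 50 individual requests,
# and the batches are made in series rather than in parallel.
for start in range(0, len(titles), 50):
    batch = titles[start:start + 50]
    params = {
        "action": "query",
        "format": "json",
        "titles": "|".join(batch),
        "prop": "revisions",
        "rvprop": "content",
        "rvslots": "main",
    }
    response = requests.get(API, params=params, headers=HEADERS)
    response.raise_for_status()
    data = response.json()
    for page in data["query"]["pages"].values():
        print(page["title"])
```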

> Most of that is controlled by the SRE team at a level in front of the REST API, since the frontend caching layer is a shared resource across everything.

I think it would be great if there were then no mention of rate limits on the REST API documentation page or elsewhere, but instead just one wiki page, maintained by the SRE team, which lists all current rate limits for the different endpoints. You could even explain all of the above there and link to the configuration file directly. I think all API users should be able to read that configuration file even if they are not familiar with Varnish (especially because it has nice comments).

I do think that in this particular case limiting below 200 requests/s goes against the hopes of what the REST API would enable (especially as the legalish "Terms and conditions" section also mentions that). So I do wonder if this lower rate limit for the REST API endpoint is really necessary. It seems quite some work was done to be able to handle 200 requests/s.

> Also, in my experience some of the specific REST API endpoints, like the /transform/ ones, seem to have lower rate limits that aren't clearly documented

It is clearly documented. At the top of the REST API documentation it says:

> Each API endpoint's documentation may detail more specific usage limits.

And then the /transform/ API endpoint documentation says:

> Rate limit: 25 req/s

I find this clear. :-)

> so really any concurrent client needs to be able to handle 429s and back off appropriately.

Talking about handling 429s: the problem is that 429 responses do not seem to provide any headers with information about the rate limit in effect, nor a header like Retry-After. So it is hard to handle 429s automatically.
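So the best a client can do right now is guess, for example by retrying with exponential backoff (a minimal sketch, assuming the requests library; the URL in the commented-out usage example is just a placeholder):

```python
import time
import requests

def get_with_backoff(url, headers, max_retries=5):
    """GET a URL, retrying on 429 with exponential backoff.

    Since the 429 responses carry no Retry-After header, the wait time
    here is a guess that doubles on every retry.
    """
    delay = 1.0
    for attempt in range(max_retries):
        response = requests.get(url, headers=headers)
        if response.status_code != 429:
            return response
        time.sleep(delay)
        delay = min(delay * 2, 60.0)  # cap the wait at one minute
    raise RuntimeError(f"still rate limited after {max_retries} attempts: {url}")

# Example (hypothetical page title and contact address):
# r = get_with_backoff(
#     "https://en.wikipedia.org/api/rest_v1/page/html/Earth",
#     {"User-Agent": "my-tool/0.1 (contact: example@example.com)"},
# )
```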

> But e.g. a web tool that makes individual requests based on user input/action might end up making multiple different requests at the same time, and that's generally permitted.

Sure, I just find it surprising that you are limiting a potentially very expensive API endpoint at the same rate as the designed-to-be-cheap-and-cacheable REST API endpoints. Does this mean that the REST API aspirations didn't turn out to be true?

Just to clarify, I am personally fine with the current rate limits (of course more is better - I can process data 2x faster with 200 requests/s instead of 100 requests/s). But I find the current rate limits surprising given what I know about the API endpoints (based on what I have read in various places). Given that I believe I have read probably everything there is about rate limits across the documentation, I really thought I understood what I can and cannot do, so it looks to me like there is some disconnect between the various stakeholders here (SRE vs. the RESTbase team vs. general API exposure of data on Wikipedia).

Marostegui triaged this task as Medium priority. May 17 2022, 8:19 AM

Hm, I am pretty sure that I am doing the rate limiting correctly on my side, but I am hitting 429s after a brief time when trying to use a 1000/10s rate limit against the REST API endpoint. If I lower it to 500/10s then I do not hit 429s. No idea why, but I am making many requests in parallel.

BCornwall changed the task status from Open to Stalled. Mar 30 2023, 9:04 PM
BCornwall raised the priority of this task from Medium to Needs Triage.
BCornwall removed a project: SRE.
BCornwall moved this task from Scheduled incidental work to Backlog on the Traffic board.

Hi, @Mitar! This ticket is pretty old at this point... sorry! Hopefully you can understand that there are many tickets for the few of us to handle at once and this one seems to have fallen through the cracks.

Would you be kind enough to let us know whether this is still an issue/still relevant today?

Thank you!

To my knowledge it is. https://www.mediawiki.org/wiki/Wikimedia_REST_API#Terms_and_conditions still says that 200 requests/second per REST API endpoint is fine (unless documented to have less, for example the transform API endpoint), while the configuration says differently: https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/refs/heads/production/modules/varnish/templates/text-frontend.inc.vcl.erb#431.

So I think there is still a disconnect between the documentation and the internal configuration. Also, that regular API endpoints have the same limits as REST API endpoints is still an issue, I think (given that the whole idea of the REST API endpoints was that they would be less expensive).

Based on @BBlack's previous comment, the 100/s is intentional. I would guess that running parallel requests during the 1000/10s would hit the limit since each second might vary (second 1 may be 95 requests but second 2 might be 105).
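As a toy illustration of that (the numbers are made up and this is not the actual edge configuration): a client can stay within 1000 requests per 10 seconds overall while some individual seconds see far more than 100 requests, if the requests arrive in parallel bursts:

```python
from collections import Counter

# Toy model: 1000 requests in a 10 second window, launched as five parallel
# bursts of 200 (one burst every two seconds) instead of evenly at 100/s.
timestamps = [burst * 2.0 + i * 0.001 for burst in range(5) for i in range(200)]

per_second = Counter(int(t) for t in timestamps)
print("total requests:", len(timestamps))                       # 1000 over 10 seconds
print("requests in busiest second:", max(per_second.values()))  # 200, well above ~100/s
```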

I'm assuming that there isn't an issue with the actual rate-limiting. That would mean that this ticket can boil down to two things:

  1. Updating the documentation to be more clear about the limits imposed
  2. A request to raise the limit for the REST API endpoint since it produces less load than the regular API endpoint

Does that accurately reflect the current state of this ticket?

Yes, great summary. Thanks.