Test api rate limiting on staging cluster
Closed, Resolved · Public · 8 Estimated Story Points

Description

Deploy rate limiting for the rest gateway on the staging cluster for manual testing.

Steps:

  • Create a list of URLs to test against (using bash/curl, Siege, or Locust)
  • Enforce limits on all routes, but only if the client specifies the x-wmf-user-class header (or equivalent). Test that the rate limit is enforced.
  • Set global shadow mode; test that the rate limit is no longer enforced but that logs/stats provide useful information.
  • Enable rate limiting (in shadow mode) on routes that opt into rate limiting, regardless of whether x-wmf-user-class is given (and ensure such headers are ignored if they come from the client).
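
The manual checks in the steps above could be scripted roughly as follows. This is a minimal sketch, not the test harness actually used: the function names `probe` and `http_send`, the header value passed in, and the expectation that a limited request returns HTTP 429 are all assumptions made for illustration. Only the x-wmf-user-class header name comes from this task.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, Optional
import urllib.error
import urllib.request

# A "send" callable takes (url, headers) and returns the HTTP status code.
Send = Callable[[str, Dict[str, str]], int]


def http_send(url: str, headers: Dict[str, str]) -> int:
    """Issue one GET request and return the status code (hypothetical helper)."""
    req = urllib.request.Request(url, headers=headers)
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses (e.g. an assumed 429 when limited) land here.
        return err.code


def probe(
    urls: Iterable[str],
    repeats: int,
    user_class: Optional[str] = None,
    send: Send = http_send,
) -> Dict[str, Counter]:
    """Hit each URL `repeats` times and tally the status codes per URL.

    If `user_class` is given, it is sent as the x-wmf-user-class header.
    """
    headers = {"x-wmf-user-class": user_class} if user_class else {}
    results: Dict[str, Counter] = {}
    for url in urls:
        results[url] = Counter(send(url, headers) for _ in range(repeats))
    return results
```

Running `probe(url_list, 5, user_class="anon")` once with enforcement on and once in shadow mode, then comparing the tallies, would show whether requests are actually rejected. The `send` parameter is injectable so the tallying logic can be exercised without touching the staging cluster.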

Stretch:

  • Test selective shadow mode [needs Envoy 1.33]

For the purpose of this test, the rate limits should be configured very low, e.g. 3 per minute for anon and 10 per minute for users.
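
To make the expected behavior concrete, here is a toy model of those limits with and without shadow mode. It is a sketch under stated assumptions, not how Envoy or the gateway implements rate limiting: the fixed one-minute window, the class keys "anon" and "user", and the `ToyLimiter` class are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# Test limits from the task description: 3/min for anon, 10/min for users.
# The keys "anon" and "user" are illustrative names.
LIMITS_PER_MINUTE = {"anon": 3, "user": 10}


@dataclass
class ToyLimiter:
    """Fixed-window counter; in shadow mode it records but never blocks."""

    shadow: bool = False
    window_seconds: int = 60
    counts: Dict[Tuple[str, int], int] = field(default_factory=dict)
    would_block: int = 0  # stand-in for the logs/stats seen in shadow mode

    def allow(self, user_class: str, now: float) -> bool:
        # Bucket the request into its window and count it.
        window = int(now // self.window_seconds)
        key = (user_class, window)
        self.counts[key] = self.counts.get(key, 0) + 1
        over = self.counts[key] > LIMITS_PER_MINUTE[user_class]
        if over:
            self.would_block += 1
        # Shadow mode lets everything through; otherwise block when over.
        return self.shadow or not over
```

With enforcement on, the fourth anon request inside one window is refused. With `shadow=True` every request passes, but `would_block` still counts the requests that would have been refused, which is the signal the shadow-mode step is meant to verify.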

Details

Due Date
Oct 10 2025, 10:00 PM
Other Assignee
Clement_Goubert

Event Timeline

daniel set the point value for this task to 8.
daniel set Due Date to Oct 10 2025, 10:00 PM.

Change #1194174 had a related patch set uploaded (by Clément Goubert; author: Clément Goubert):

[operations/deployment-charts@master] rest-gateway: Deploy rate limiting in staging

https://gerrit.wikimedia.org/r/1194174

The rate-limiting core patch is now merged. Nothing is turned on anywhere, but there are no issues with the changes made so far (Lua, headers, etc.).
We're in pretty good shape to enable it in staging and start testing.

Change #1194174 merged by jenkins-bot:

[operations/deployment-charts@master] rest-gateway: Deploy rate limiting in staging

https://gerrit.wikimedia.org/r/1194174

Action items:

  • Get a better set of routes in staging; the ones there now are not conducive to testing.
  • Fix no-csp disabling the lua completely

Change #1198310 had a related patch set uploaded (by Clément Goubert; author: Clément Goubert):

[operations/deployment-charts@master] api-gateway: Use metadata to flip csp header handling

https://gerrit.wikimedia.org/r/1198310

Change #1198953 had a related patch set uploaded (by Clément Goubert; author: Clément Goubert):

[operations/deployment-charts@master] api-gateway: csp header handling

https://gerrit.wikimedia.org/r/1198953

Change #1198310 merged by jenkins-bot:

[operations/deployment-charts@master] api-gateway: Use metadata to flip csp header handling

https://gerrit.wikimedia.org/r/1198310

Change #1198953 merged by jenkins-bot:

[operations/deployment-charts@master] api-gateway: csp header handling

https://gerrit.wikimedia.org/r/1198953

Change #1198987 had a related patch set uploaded (by Clément Goubert; author: Clément Goubert):

[operations/deployment-charts@master] rest-gateway: Fix csp_enabled configuration

https://gerrit.wikimedia.org/r/1198987

Change #1198987 merged by jenkins-bot:

[operations/deployment-charts@master] rest-gateway: Fix csp_enabled configuration

https://gerrit.wikimedia.org/r/1198987

Change #1199331 had a related patch set uploaded (by Clément Goubert; author: Clément Goubert):

[operations/deployment-charts@master] api-gateway: Release patch for ratelimit test

https://gerrit.wikimedia.org/r/1199331

daniel closed this task as Resolved. (Edited Oct 30 2025, 5:36 PM)

Completed successfully with one caveat. Test plan and protocol at https://docs.google.com/document/d/1lWAPTQs5WwLdzxA621qkMyUggiNfgCpKi3TObEU6XA0/edit?pli=1&tab=t.3tvjcm43fthr#heading=h.ayqizepthr36

Caveat: Per-route rate limit configuration (needed e.g. for selective shadow mode) needs Envoy 1.33; it is broken in earlier versions. But we can still turn rate limiting off and on per route, and we have global shadow mode. That's sufficient for the initial rollout.

daniel updated the task description.