Deploy rate limiting in shadow mode (dry run) on the REST gateway to collect statistics.
- Phase 0: prepare infrastructure
- enabled: true: add ratelimit service and dependencies
- shadow_mode: false: (default)
- default_policy: experiment-2025 (default)
- user_id_cookie: "" (default)
- fallback_class: no-limit: avoid accidental enforcement of rate limits
- require_opt_in: true: disable rate limiting by default
- confirm that the ratelimit and nutcracker containers have been added to the gateway pod
- use curl to confirm that rate limiting does not apply on any route
- use envoy's admin interface to check that no requests are being made to the ratelimit service
- Phase 1: enable manual testing of limit enforcement
- allow_client_headers: true to enable manual testing
- enable_x_ratelimit_headers: true to enable x-ratelimit headers in the response (T408839)
- apply_rate_limiting: true on some routes, to activate rate limiting
- editor-analytics: /api/rest_v1/metrics/editors/ and /api/rest_v1/registered-users/ (~ 1 req/sec)
- wikifeeds: /api/rest_v1/page/random/ and /api/rest_v1/page/feed/ (~200 req/sec, top user > 12k req/hour)
- Use curl to test that no rate limits are applied on /api/rest_v1/metrics/editors/ and /api/rest_v1/page/random/
- set the User-Agent header to something useful that points to this ticket.
- test that the anon_limit of 500 req/hour is not enforced
- check that there are no x-ratelimit headers in the response
- Use curl to test that rate limits are applied if the x-wmf-user-id and x-wmf-user-class headers are set
- test that the anon_limit of 500 req/hour is enforced for x-wmf-user-class: anon
- test that the default_limit of 5000 req/hour is enforced for x-wmf-user-class: cookie-user
- check the values of the x-ratelimit headers in the response
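The manual checks in Phase 1 could also be scripted. The following is a minimal sketch, not the procedure actually used: the helper names (`build_test_headers`, `ratelimit_headers`, `fetch`) are hypothetical, and the assumption that an enforced limit shows up as HTTP 429 should be verified against the gateway. Only the `x-wmf-user-id` and `x-wmf-user-class` header names and the `x-ratelimit` prefix come from the plan above.

```python
import urllib.error
import urllib.request


def build_test_headers(user_agent, user_id=None, user_class=None):
    """Build request headers for a manual test (hypothetical helper).

    user_agent should point to this ticket, as the plan asks.
    """
    headers = {"User-Agent": user_agent}
    if user_id is not None:
        headers["x-wmf-user-id"] = user_id
    if user_class is not None:
        headers["x-wmf-user-class"] = user_class
    return headers


def ratelimit_headers(response_headers):
    """Return only the x-ratelimit* headers, with lower-cased names."""
    return {
        name.lower(): value
        for name, value in response_headers.items()
        if name.lower().startswith("x-ratelimit")
    }


def fetch(url, headers):
    """Return (status, x-ratelimit headers) for one GET request.

    urllib raises HTTPError for 4xx/5xx responses; the error object still
    carries the status code and headers, so both cases are handled alike.
    Assumption: an enforced limit is reported as status 429.
    """
    request = urllib.request.Request(url, headers=headers)
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, ratelimit_headers(dict(response.headers))
    except urllib.error.HTTPError as err:
        return err.code, ratelimit_headers(dict(err.headers))
```

Without the two `x-wmf-*` headers, `fetch` should return an empty header dict on the opted-in routes. With `user_class="anon"`, the returned headers should reflect the 500 req/hour limit.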
- Phase 2: test shadow mode
- change shadow_mode to true to enable global shadow mode
- use curl to check that rate limits are no longer enforced on /api/rest_v1/metrics/editors/ and /api/rest_v1/page/random/
- confirm that we are still getting x-ratelimit headers in the response
- Phase 3: enable shadow mode limits for all users on certain routes
- user_id_cookie: "centralauth_User" so rate limiting is per user name (insecure)
- fallback_class: anon so rate limits are enforced for unauthenticated users
- allow_client_headers: false to prevent clients from overriding limits
- use curl to check that rate limits are not enforced on any route
- use envoy's admin interface to confirm that requests are being made to the rate limiter
- monitor ratelimiter metrics (T408183), confirm that we are seeing the "over limit" count go up (expect >10,000 per hour from the top user of /api/rest_v1/page/random/, compare Turnilo data)
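The ">10,000 per hour" expectation in Phase 3 can be checked with simple arithmetic. This is a rough sketch that assumes a fixed one-hour window, steady traffic, and that each request beyond the limit counts once as "over limit". The function name is hypothetical; the figures (12k req/hour for the top user, 500 and 5000 req/hour limits) are taken from the plan above.

```python
def expected_over_limit(requests_per_hour, limit_per_hour):
    """Requests per hour that would exceed the limit (never negative)."""
    return max(0, requests_per_hour - limit_per_hour)


# Top user of /api/rest_v1/page/random/ treated as anon (no cookie):
print(expected_over_limit(12_000, 500))    # 11500
# Same traffic if the user were classified as cookie-user instead:
print(expected_over_limit(12_000, 5_000))  # 7000
```

The expectation of more than 10,000 only holds if the top user falls into the anon class. A noticeably lower count may mean that traffic is being classified as cookie-user.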
- Phase 4: enable shadow mode limits on all routes
- require_opt_in: false: to turn on rate limiting for everything
- flip apply_rate_limiting to false on editor-analytics and wikifeeds routes.
- monitor redis resource consumption
- monitor ratelimit metrics
- confirm that the opt-out works and no rate limits are applied on wikifeeds routes (check headers).
- monitor redis resource consumption (eqiad/codfw). Should level off after one hour.
- Phase 5: disable x-ratelimit headers and remove opt-out.
- enable_x_ratelimit_headers: false: to disable rate limit headers
- remove apply_rate_limiting from all routes
- confirm that we are no longer sending x-ratelimit headers
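The phases change settings incrementally, so it can help to see the cumulative state at each step. The sketch below only restates the global settings listed above; `PHASE_CHANGES` and `settings_at` are hypothetical names. The per-route apply_rate_limiting flags are left out, and settings the plan has not yet mentioned in a given phase are simply absent, because their defaults are not stated here.

```python
# Global settings changed in each phase, as listed in the plan.
PHASE_CHANGES = [
    {  # Phase 0: prepare infrastructure
        "enabled": True,
        "shadow_mode": False,
        "default_policy": "experiment-2025",
        "user_id_cookie": "",
        "fallback_class": "no-limit",
        "require_opt_in": True,
    },
    {  # Phase 1: manual testing of limit enforcement
        "allow_client_headers": True,
        "enable_x_ratelimit_headers": True,
    },
    {  # Phase 2: test shadow mode
        "shadow_mode": True,
    },
    {  # Phase 3: shadow mode limits for all users on certain routes
        "user_id_cookie": "centralauth_User",
        "fallback_class": "anon",
        "allow_client_headers": False,
    },
    {  # Phase 4: shadow mode limits on all routes
        "require_opt_in": False,
    },
    {  # Phase 5: disable x-ratelimit headers
        "enable_x_ratelimit_headers": False,
    },
]


def settings_at(phase):
    """Merge the changes of phases 0..phase into one settings dict."""
    settings = {}
    for changes in PHASE_CHANGES[: phase + 1]:
        settings.update(changes)
    return settings


for phase in range(len(PHASE_CHANGES)):
    print(phase, settings_at(phase))
```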