4 simultaneous connections per client (IP + User-Agent) should be good.
Relatedly, this is done in nginx for dump downloads. See https://phabricator.wikimedia.org/diffusion/OPUP/browse/production/modules/dumps/templates/nginx.dumps.conf.erb
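A per-client cap like this maps naturally onto `ngx_http_limit_conn_module`. A minimal sketch (the zone name, zone size, and the inclusion of User-Agent in the key are illustrative assumptions, not taken from the dumps config linked above):

```nginx
# Key clients on IP + User-Agent, as suggested above.
limit_conn_zone "$binary_remote_addr$http_user_agent" zone=ores_conn:10m;

server {
    location / {
        # At most 4 simultaneous connections per client key.
        limit_conn ores_conn 4;
        # Reject over-limit requests with 429 instead of the default 503.
        limit_conn_status 429;
    }
}
```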
| Status | Subtype | Assigned | Task |
| --- | --- | --- | --- |
| Resolved | | Ladsgroup | T148997 Implement parallel connection limit for querying ORES |
| Resolved | | Ladsgroup | T160692 Use poolcounter to limit number of connections to ores uwsgi |
| Resolved | | Ladsgroup | T201823 Implement PoolCounter support in ORES |
| Resolved | | akosiaris | T203465 Site: 4 VM request for ORES poolcounter |
| Resolved | | akosiaris | T201824 Spin up a new poolcounter node for ores |
| Resolved | | Ladsgroup | T201825 Test poolcounter support for ores in beta cluster |
| Resolved | | Ladsgroup | T201826 Implement support for whitelisting and proxy requests for poolcounter in ORES |
| Declined | | Ladsgroup | T204897 Add Wiki Education Dashboard and Programs & Events Dashboard to ORES connection whitelist |
| Resolved | | Tgr | T161029 Forward request data in proxied Action API modules |
Presumably internal IPs should be exempt from this, and the API should set XFF headers when proxying requests?
Per T137962#2447946, "Generic ratelimiting (e.g. per client IP) and other similar protection measures for these clusters has been pushed off for post-varnish4" so that's not an option right now.
If we rely on some naive frontend implementation (e.g. ngx_http_limit_conn_module) on the web worker or MW API nodes, then we end up with a limit of N connections per node instead of a global one. Are IPs assigned to nodes via some deterministic hashing, or just round-robin? In the first case, what would happen when nodes are added / removed? (I.e., are we using consistent hashing?)
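The consistent-hashing question matters because with plain modulo hashing, adding or removing a node remaps almost every client IP to a different node, invalidating all per-node counters; a consistent-hash ring keeps most mappings stable. A minimal illustrative sketch (node names and parameters are made up):

```python
import bisect
import hashlib

def _hash(key: str) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class ConsistentHashRing:
    """Minimal consistent-hash ring: each node gets several points on the
    ring, and a client IP maps to the first node point at or after its hash."""

    def __init__(self, nodes, points_per_node=64):
        self.ring = sorted(
            (_hash(f"{node}:{i}"), node)
            for node in nodes
            for i in range(points_per_node)
        )
        self.keys = [h for h, _ in self.ring]

    def node_for(self, client_ip: str) -> str:
        idx = bisect.bisect(self.keys, _hash(client_ip)) % len(self.ring)
        return self.ring[idx][1]

# Adding a fourth node: with consistent hashing only roughly a quarter
# of client IPs move to a different node, rather than nearly all of them.
ips = [f"10.0.{i // 256}.{i % 256}" for i in range(1000)]
before = ConsistentHashRing(["w1", "w2", "w3"])
after = ConsistentHashRing(["w1", "w2", "w3", "w4"])
moved = sum(before.node_for(ip) != after.node_for(ip) for ip in ips)
print(moved)
```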
Maybe it could be done in the ORES load balancer (if it's based on proxying and not DNS lookups). That wouldn't be a proper parallel connection limit, but rather a limit on the number of connections initiated per time unit; still, under normal operation ORES does not have long-lived connections, so that should be close enough.
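A connections-per-time-unit limit of this kind is typically a token bucket; a small sketch (the rate and burst parameters are arbitrary examples):

```python
import time

class TokenBucket:
    """Per-client rate limit (connections started per second) that
    approximates a concurrency cap when requests are short-lived."""

    def __init__(self, rate: float, burst: int):
        self.rate = rate          # tokens added per second
        self.capacity = burst     # maximum burst size
        self.tokens = float(burst)
        self.last = time.monotonic()

    def allow(self) -> bool:
        # Refill proportionally to elapsed time, capped at the burst size.
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=10.0, burst=2)
```

One bucket would be kept per client key (IP + User-Agent); requests that find the bucket empty are rejected immediately.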
Alternatively, the throttling could be implemented in the app code, using some shared resource such as Redis. That's probably a performance hit, even if a very small one, so it's doable but less ideal than using some existing varnish/nginx/whatever functionality.
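The app-code variant amounts to an increment-check-decrement around each request against a shared counter. A sketch, using an in-memory stand-in for the shared store (in production the `incr`/`decr` calls would be atomic Redis `INCR`/`DECR` on a per-client key; all names here are hypothetical):

```python
import threading
from contextlib import contextmanager

class SharedCounter:
    """In-memory stand-in for a shared store such as Redis."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counts = {}

    def incr(self, key):
        with self._lock:
            self._counts[key] = self._counts.get(key, 0) + 1
            return self._counts[key]

    def decr(self, key):
        with self._lock:
            self._counts[key] = self._counts.get(key, 0) - 1
            return self._counts[key]

class TooManyConnections(Exception):
    pass

@contextmanager
def connection_slot(counter, client_key, limit=4):
    """Hold one of the client's parallel-connection slots, or fail fast."""
    # Increment first, then check: with an atomic INCR backend this avoids
    # the read-modify-write race of checking before incrementing.
    if counter.incr(client_key) > limit:
        counter.decr(client_key)  # undo our increment before rejecting
        raise TooManyConnections(client_key)
    try:
        yield
    finally:
        counter.decr(client_key)
```

Each request handler would wrap its work in `connection_slot(counter, client_ip + user_agent)` and map `TooManyConnections` to an HTTP 429.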