Ban clients of WDQS which don't follow throttling directives for some time
Closed, Resolved · Public

Description

We regularly observe clients of the Wikidata Query Service that stay far above our throttling limits for long periods. For example, we currently see what looks like a bot triggering HTTP 429 responses at a rate of ~300/minute, clearly ignoring both the rate limit and the "Retry-After" headers. This is not a major problem in itself, since throttled requests are cheap, but it is still a concern: our throttling mechanism does not share state across the cluster, so such a bot can max out its throttling limit on each node.
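For contrast with the bot described above, here is a minimal sketch of a client that does follow the throttling directives: it waits for the duration given in "Retry-After" before retrying after an HTTP 429. The names `retry_after_seconds` and `send_with_backoff`, the 60-second fallback, and the `send` callable returning `(status, headers)` are assumptions made for illustration; they are not part of WDQS or of any client library.

```python
import time
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(header_value, now=None, default=60.0):
    """Return how long to wait, in seconds, for a Retry-After header value.

    Retry-After may be a number of seconds or an HTTP date.
    `default` is an assumed fallback for a missing or unparseable header.
    """
    if header_value is None:
        return default
    value = header_value.strip()
    if value.isascii() and value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def send_with_backoff(send, max_attempts=5, sleep=time.sleep):
    """Call `send()` until it stops returning 429 or attempts run out.

    `send` is a hypothetical callable returning (status_code, headers_dict).
    """
    for _ in range(max_attempts):
        status, headers = send()
        if status != 429:
            return status
        sleep(retry_after_seconds(headers.get("Retry-After")))
    return 429
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; a real client would leave the default `time.sleep` in place.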

One proposed approach is to ban such a user entirely for a period of time when the behaviour is clearly abusive. For example, a bot generating more than 200 requests per minute for 1h would be banned for 24h.
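The example policy above could be sketched roughly as follows. This is only an illustration, not the code from the patches attached to this task: the class name `BanTracker`, its parameters, and the reading of "more than 200 requests per minute during 1h" as an average over a sliding one-hour window (more than 12000 requests) are all assumptions.

```python
from collections import defaultdict, deque


class BanTracker:
    """Hypothetical sliding-window ban policy; times are in seconds."""

    def __init__(self, max_per_minute=200, window_s=3600, ban_s=86400):
        # Total requests allowed within one window before a ban is issued.
        self.limit = max_per_minute * window_s / 60
        self.window_s = window_s
        self.ban_s = ban_s
        self.seen = defaultdict(deque)  # client -> recent request timestamps
        self.banned_until = {}          # client -> time at which the ban ends

    def is_banned(self, client, now):
        return self.banned_until.get(client, 0) > now

    def record(self, client, now):
        """Record one request at time `now`; return True if the client is banned."""
        if self.is_banned(client, now):
            return True
        timestamps = self.seen[client]
        timestamps.append(now)
        # Drop timestamps that have fallen out of the sliding window.
        while timestamps and timestamps[0] <= now - self.window_s:
            timestamps.popleft()
        if len(timestamps) > self.limit:
            self.banned_until[client] = now + self.ban_s
            timestamps.clear()
            return True
        return False
```

This sketch keeps its state in memory on a single node, so it has the same limitation as the throttling described above: nothing is shared across the cluster. Storing every timestamp is also wasteful at these rates; coarser per-minute counters would be a more realistic choice.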

Event Timeline

Restricted Application added a subscriber: Aklapper.

Change 433001 had a related patch set uploaded (by Gehel; owner: Gehel):
[wikidata/query/rdf@master] Extract bucketing from throttler

https://gerrit.wikimedia.org/r/433001

Change 433002 had a related patch set uploaded (by Gehel; owner: Gehel):
[wikidata/query/rdf@master] Ban users which don't follow the throttling directives.

https://gerrit.wikimedia.org/r/433002

Change 433001 merged by jenkins-bot:
[wikidata/query/rdf@master] Extract bucketing from throttler

https://gerrit.wikimedia.org/r/433001

Change 433002 merged by jenkins-bot:
[wikidata/query/rdf@master] Ban users which don't follow the throttling directives.

https://gerrit.wikimedia.org/r/433002

Change 437187 had a related patch set uploaded (by Gehel; owner: Gehel):
[operations/puppet@production] wdqs: cleanup declarration of blazegraph options

https://gerrit.wikimedia.org/r/437187

Change 437188 had a related patch set uploaded (by Gehel; owner: Gehel):
[operations/puppet@production] wdqs: reduce ban to a minimum on the internal cluster

https://gerrit.wikimedia.org/r/437188

Change 437200 had a related patch set uploaded (by Gehel; owner: Gehel):
[wikidata/query/rdf@master] Introduce enable-ban-if-header to mirror enable-throttling-if-header

https://gerrit.wikimedia.org/r/437200

Change 437200 merged by jenkins-bot:
[wikidata/query/rdf@master] Introduce enable-ban-if-header to mirror enable-throttling-if-header

https://gerrit.wikimedia.org/r/437200

Change 437187 merged by Gehel:
[operations/puppet@production] wdqs: cleanup declarration of blazegraph options

https://gerrit.wikimedia.org/r/437187

Change 437188 merged by Gehel:
[operations/puppet@production] wdqs: reduce ban to a minimum on the internal cluster

https://gerrit.wikimedia.org/r/437188

Smalyshev triaged this task as Medium priority.
Vvjjkkii renamed this task from Ban clients of WDQS which don't follow throttling directives for some time to a0caaaaaaa. Jul 1 2018, 1:10 AM
Vvjjkkii reopened this task as Open.
Vvjjkkii removed Gehel as the assignee of this task.
Vvjjkkii raised the priority of this task from Medium to High.
Vvjjkkii updated the task description.
Vvjjkkii removed subscribers: gerritbot, Aklapper.
CommunityTechBot renamed this task from a0caaaaaaa to Ban clients of WDQS which don't follow throttling directives for some time. Jul 2 2018, 3:36 PM
CommunityTechBot closed this task as Resolved.
CommunityTechBot assigned this task to Gehel.
CommunityTechBot lowered the priority of this task from High to Medium.
CommunityTechBot updated the task description.
CommunityTechBot added subscribers: gerritbot, Aklapper.