Rate limit requests in violation of User-Agent policy more aggressively
Open, High, Public

Description

Wikimedia's User-Agent policy specifically forbids using generic values for the User-Agent request header.

Apply stricter rate limiting to requests violating the policy.

Event Timeline

Restricted Application added a subscriber: Aklapper.
ema triaged this task as Medium priority. Jun 3 2019, 3:04 PM

Change 514017 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] cache_upload: return HTTP 403 to requests violating UA policy

https://gerrit.wikimedia.org/r/514017

Change 514017 merged by Ema:
[operations/puppet@production] cache_upload: return HTTP 403 to requests violating UA policy

https://gerrit.wikimedia.org/r/514017

For Tech News: Bots and other scripts that do not set an identifiable User-Agent may find their requests blocked until they identify themselves properly.

Not sure if it applies here, but please remember that we allow Api-User-Agent as an alternative to User-Agent for JavaScript solutions.
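
For script authors, here is a minimal sketch of what an identifiable User-Agent could look like in Python. The tool name, project URL, and contact address are hypothetical placeholders and do not come from this task; the general shape (tool name and version, followed by contact information) is assumed from the User-Agent policy.

```python
from urllib.request import Request

# Hypothetical tool name, project URL, and contact address: replace with your own.
USER_AGENT = "ExampleWikiBot/1.0 (https://example.org/examplewikibot; bot-owner@example.org)"


def build_request(url: str) -> Request:
    """Return a request object that carries an identifiable User-Agent header."""
    return Request(url, headers={"User-Agent": USER_AGENT})


# Building the request does not send anything over the network.
req = build_request(
    "https://commons.wikimedia.org/w/api.php?action=query&meta=siteinfo&format=json"
)
# urllib normalizes header names, so the stored key is "User-agent".
print(req.get_header("User-agent"))
```

Browser-based JavaScript that cannot override User-Agent can send the same string in the Api-User-Agent header instead, as noted in the comment above.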

ema renamed this task from Return HTTP 403 to requests in violation of User-Agent policy to Rate limit requests in violation of User-Agent policy more aggressively. Jun 5 2019, 2:48 PM
ema updated the task description.

We (Traffic) have decided to continue allowing requests violating the UA policy. Instead of blocking them, we will apply stricter rate limiting to those.
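
As an illustration only, the idea of "stricter rate limiting for violators" can be sketched as a token bucket with two tiers. This is not the Varnish configuration from the patches in this task: the prefixes, the limits, and all names below are assumptions chosen for the example.

```python
# Assumed examples of generic User-Agent values; the real list lives in the
# production configuration and may differ.
GENERIC_UA_PREFIXES = ("python-requests", "curl")

# Hypothetical limits as (requests per second, burst size).
DEFAULT_LIMIT = (10.0, 20.0)
STRICT_LIMIT = (1.0, 2.0)


def violates_ua_policy(user_agent):
    """Treat a missing, empty, or generic User-Agent as a policy violation."""
    ua = (user_agent or "").strip().lower()
    return ua == "" or ua.startswith(GENERIC_UA_PREFIXES)


class TokenBucket:
    def __init__(self, rate, capacity, now):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity    # start with a full bucket
        self.last = now

    def allow(self, now):
        # Refill according to elapsed time, capped at the burst size.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # the caller would answer with an error such as HTTP 429


class RateLimiter:
    def __init__(self):
        self.buckets = {}

    def allow(self, client_ip, user_agent, now):
        strict = violates_ua_policy(user_agent)
        key = (client_ip, strict)
        if key not in self.buckets:
            rate, burst = STRICT_LIMIT if strict else DEFAULT_LIMIT
            self.buckets[key] = TokenBucket(rate, burst, now)
        return self.buckets[key].allow(now)


limiter = RateLimiter()
# A generic client exhausts its small burst quickly.
curl_results = [limiter.allow("192.0.2.1", "curl/7.64.0", now=0.0) for _ in range(3)]
print(curl_results)
```

Requests violating the policy are still served in this sketch, but they draw from a much smaller bucket than clients that identify themselves.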

Change 513596 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] varnish: cache_upload rate limit

https://gerrit.wikimedia.org/r/513596

Change 513596 merged by Ema:
[operations/puppet@production] varnish: cache_upload miss/pass rate limit

https://gerrit.wikimedia.org/r/513596

TechNews: I've added it to the upcoming edition with this edit, that will be frozen for translation in about 18 hours. Please amend it before then if needed. (And thank you @Legoktm for writing the initial version!). Cheers!

Even with the current rate limiting, some crawlers regularly cause issues, wasting precious SRE time.

I'd like to revisit this task to be more strict on user agents, maybe progressively increasing the way we enforce our policy. For example:

  • Keep rate limiting for generic curl and other command line/testing tools
  • Forbid generic scripting UAs (e.g. python-requests, empty) from cloud providers
  • Ideally later on, forbid generic scripting UAs from the whole Internet (except WMCS)

A variant could be to apply the above only on the upload cluster, but the fewer exceptions the better.
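
The progressive enforcement proposed in the list above could be sketched as a single decision function. This is a hedged illustration, not the production logic: the network ranges are reserved documentation prefixes standing in for real cloud-provider and WMCS ranges, and the prefix lists, the `forbid_everywhere` flag, and all names are assumptions.

```python
import ipaddress
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    RATE_LIMIT = "rate_limit"
    FORBID = "forbid"


# Assumed example prefixes; the real lists would be maintained elsewhere.
CLI_TOOL_PREFIXES = ("curl",)
SCRIPTING_PREFIXES = ("python-requests",)

# Placeholder networks (documentation ranges), not real provider ranges.
CLOUD_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]
WMCS_NETWORKS = [ipaddress.ip_network("198.51.100.0/24")]


def decide(user_agent, client_ip, forbid_everywhere=False):
    """Map a request to an action following the three proposed steps."""
    ua = (user_agent or "").strip().lower()
    ip = ipaddress.ip_address(client_ip)

    # Step 1: generic command line and testing tools stay rate limited.
    if ua.startswith(CLI_TOOL_PREFIXES):
        return Action.RATE_LIMIT

    is_generic_script = ua == "" or ua.startswith(SCRIPTING_PREFIXES)
    if not is_generic_script:
        return Action.ALLOW

    # Step 2: generic scripting UAs from cloud providers are forbidden.
    if any(ip in net for net in CLOUD_NETWORKS):
        return Action.FORBID

    # Step 3 (later): forbid them from everywhere except WMCS.
    if forbid_everywhere and not any(ip in net for net in WMCS_NETWORKS):
        return Action.FORBID

    return Action.RATE_LIMIT


print(decide("python-requests/2.25.1", "203.0.113.5"))
```

Keeping the third step behind a flag mirrors the "ideally later on" wording: the stricter behaviour can be switched on once the earlier steps have been observed in practice.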

Agreed to all that, though I would not exempt WMCS. WMCS can generate significant amounts of traffic much faster by virtue of already being in the cluster, and people using WMCS are generally Wikimedians who should be more familiar with our policies than someone who just wants to scrape wiki pages.

I would also add that after a DoS ~2 months ago I spent a while working on advertising the UA policy and our general API usage guidelines: [1], [2].

We responded to another set of pages today. Most of the offending requests were coming from a public cloud with no User-Agent header, so we've banned those requests from the upload cluster: https://gerrit.wikimedia.org/r/702003

I'm not really sure who or which team needs to approve this or whether no one opposes it and someone just needs to do it.

ayounsi raised the priority of this task from Medium to High. Jul 1 2021, 7:45 AM

Change 702896 had a related patch set uploaded (by Ema; author: Ema):

[operations/puppet@production] varnish: use 403 instead of 429 where appropriate

https://gerrit.wikimedia.org/r/702896

Change 702896 merged by Ema:

[operations/puppet@production] varnish: use 403 instead of 429 where appropriate

https://gerrit.wikimedia.org/r/702896

BBlack subscribed.

The swap of Traffic for Traffic-Icebox in this ticket's set of tags was based on a bulk action for all tickets that are neither part of our current planned work nor clearly a recent, higher-priority emergent issue. This is simply one step in a larger task cleanup effort. Further triage of these tickets (and especially, organizing future potential project ideas from them into a new medium) will occur afterwards! For more detail, have a look at the extended explanation on the main page of Traffic-Icebox. Thank you!

Change 740818 had a related patch set uploaded (by Jbond; author: jbond):

[operations/puppet@production] R:varnish:instance: Add general public cloud rate limiting

https://gerrit.wikimedia.org/r/740818

Change 740828 had a related patch set uploaded (by Jbond; author: jbond):

[operations/puppet@production] R:varnish:instance: Add hiera key to control cloud ratelimits

https://gerrit.wikimedia.org/r/740828

Change 740818 merged by Jbond:

[operations/puppet@production] R:varnish:instance: Add general public cloud rate limiting

https://gerrit.wikimedia.org/r/740818

Change 740828 abandoned by Jbond:

[operations/puppet@production] R:varnish:instance: Add hiera key to control cloud ratelimits

Reason:

replaced by requestctl

https://gerrit.wikimedia.org/r/740828

@Pppery AFAIK, other than blocking empty User-Agent headers on upload (T224891#7182766), no further progress has been made to address the comments in T224891#6983370.