
FY 25/26 WE 5.4.3: CDN (text) filtering rationalization
Closed, Resolved, Public

Description

Right now, filtering of traffic at the edge follows two paths and two paths only:

  • If the request is in our web of trust, we skip all filtering
  • Otherwise, all filtering is applied

We want to move to a different model, where we have more classes of users, rather than just full trust / no trust:

  • We have at least 3 classes of users: trust / known / no trust where known can have various forms:
    • The request has a valid session token (see WE 5.1.2)
    • The request comes from an IP/UA combination we consider "trusted"
    • The request includes other identification methods
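The three-class model above could be sketched roughly as follows. This is only an illustration of the classification order, not the actual edge implementation: the `Request` fields and the `classify` function are hypothetical names, and each boolean stands in for a check (web of trust, session token validation, IP/UA match) that would be done elsewhere.

```python
from dataclasses import dataclass
from enum import Enum


class TrustClass(Enum):
    TRUSTED = "trust"
    KNOWN = "known"
    UNTRUSTED = "no trust"


@dataclass
class Request:
    # Hypothetical flags; each one represents the outcome of a real check.
    in_web_of_trust: bool = False
    has_valid_session_token: bool = False
    trusted_ip_ua: bool = False
    other_identification: bool = False


def classify(req: Request) -> TrustClass:
    """Map a request to one of the three classes, most trusted first."""
    if req.in_web_of_trust:
        return TrustClass.TRUSTED
    # Any one of the "known" forms is enough to leave the no-trust class.
    if req.has_valid_session_token or req.trusted_ip_ua or req.other_identification:
        return TrustClass.KNOWN
    return TrustClass.UNTRUSTED
```

Checking the web of trust first keeps today's behaviour for fully trusted traffic, while everything else falls into either the known or the no-trust class.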

So, we want to be able to apply requestctl rules at various points of the process, and to vary which rules we apply to which requests.

Moreover, we want to be able to detect if a request comes from something looking like a browser, and apply the stricter limits we declare in the Robot Policy for such requests.

In order to do this we need to:

  • Add the ability for requestctl rules to be selected via some additional tags, or maybe just use specific alternative tags to "cache-text/cache-upload"
  • Radically change how we do filtering at the edge. An early design schematic of how it will look is something like:
    image.png (2×2 px, 525 KB)
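One way to picture tag-based rule selection is the sketch below. It does not reflect requestctl's real data model: the `Rule` class, the rule names, and the `known` / `untrusted` tags are assumptions made up for illustration; only `cache-text` and `cache-upload` come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    tags: frozenset


def select_rules(rules, wanted_tags):
    """Return the names of the rules that carry every tag in wanted_tags."""
    wanted = frozenset(wanted_tags)
    return [rule.name for rule in rules if wanted <= rule.tags]


# Hypothetical rule set: the existing cluster tags plus a per-class tag.
RULES = [
    Rule("block_bad_scraper", frozenset({"cache-text", "untrusted"})),
    Rule("limit_known_bots", frozenset({"cache-text", "known"})),
    Rule("limit_upload_hotlinks", frozenset({"cache-upload", "untrusted"})),
]

rules_for_known_text_traffic = select_rules(RULES, {"cache-text", "known"})
```

With this shape, each filtering point at the edge would ask for the tag combination it cares about, instead of every rule being applied to every untrusted request.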

Related Objects

Status | Subtype | Assigned | Task
Resolved | | Joe |
Resolved | | Ottomata |
Resolved | Request | Kappakayala |
Open | | Joe |
Resolved | | Joe |
Open | | None |
Resolved | | Joe |
Resolved | | Joe |
Resolved | | Joe |
Resolved | | Tgr |
Open | | None |
Resolved | BUG REPORT | hashar |
Resolved | BUG REPORT | jnuche |
Resolved | BUG REPORT | Samwilson |
Resolved | | SLyngshede-WMF |
Resolved | | Vgutierrez |
Open | | Vgutierrez |
Resolved | | Vgutierrez |
Resolved | | ssingh |
Resolved | | ssingh |
Resolved | | Joe |
Open | | None |
Resolved | BUG REPORT | SLyngshede-WMF |
Resolved | | SLyngshede-WMF |

Event Timeline

Aklapper renamed this task from WE 5.4.3 FY 25/26: CDN (text) filtering rationalization to FY 25/26 WE 5.4.3: CDN (text) filtering rationalization. Jun 30 2025, 7:52 AM
Aklapper added a project: SRE.
Joe triaged this task as High priority.

Quite a bit of the rationalization will depend upon the results of another hypothesis, the one about trusted bots. There are, however, things we can build while that's still being designed.

What we should build for now is:

  • Allowlist approach both in varnish and haproxy. It can for now be based on x-provenance.
  • Add the ability (feature flag guarded) to validate JWTs (and maybe edge uniques?) in haproxy
  • Browser detection routines
  • Ability for the code to apply different sets of rules depending on allowlists
  • "Moat mode" rules - emergency rules we want to apply to all incoming traffic when we're under attack or in a traffic-induced outage. Ideally these rules will only remain active for 5 minutes, unless someone renews their activation

I'll open subtasks for all of the above.
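The auto-expiring behaviour described for "Moat mode" could look something like the sketch below. It is an assumption about the mechanism, not a description of how it is or will be implemented: the `MoatMode` class and its method names are hypothetical, and the injectable clock is only there so that the expiry can be exercised without waiting.

```python
import time

MOAT_TTL_SECONDS = 5 * 60  # the 5-minute lifetime mentioned above


class MoatMode:
    """Emergency switch that turns itself off unless someone renews it."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._expires_at = None  # None means never activated

    def activate(self):
        # Start (or restart) the 5-minute window from the current time.
        self._expires_at = self._clock() + MOAT_TTL_SECONDS

    def renew(self):
        # Renewal is simply a fresh activation.
        self.activate()

    def is_active(self):
        return self._expires_at is not None and self._clock() < self._expires_at
```

Storing an expiry time rather than an on/off flag means a forgotten activation lapses on its own; nobody has to remember to switch the emergency rules off after the incident.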

Change #1178866 had a related patch set uploaded (by Slyngshede; author: Slyngshede):

[operations/puppet@production] C:ip_reputation_vendors::datacenter_vendors: Known datacenter

https://gerrit.wikimedia.org/r/1178866

Change #1179136 had a related patch set uploaded (by Slyngshede; author: Slyngshede):

[operations/puppet@production] P:cache::haproxy add ASN lookup function

https://gerrit.wikimedia.org/r/1179136

Change #1180711 had a related patch set uploaded (by Giuseppe Lavagetto; author: Giuseppe Lavagetto):

[operations/puppet@production] varnish: add new requestctl file for deprecations

https://gerrit.wikimedia.org/r/1180711

Change #1180712 had a related patch set uploaded (by Giuseppe Lavagetto; author: Giuseppe Lavagetto):

[operations/puppet@production] Add deprecations to varnish

https://gerrit.wikimedia.org/r/1180712

Change #1181090 had a related patch set uploaded (by Slyngshede; author: Slyngshede):

[operations/puppet@production] P:puppetserver::volatile generate datacenter database

https://gerrit.wikimedia.org/r/1181090

Change #1182763 had a related patch set uploaded (by Slyngshede; author: Slyngshede):

[operations/puppet@production] P:cache:? datacenter lookup

https://gerrit.wikimedia.org/r/1182763

Change #1182782 had a related patch set uploaded (by Slyngshede; author: Slyngshede):

[operations/puppet@production] P:cache:haproxy add datacenter information to provenance

https://gerrit.wikimedia.org/r/1182782

Change #1181090 merged by Slyngshede:

[operations/puppet@production] P:puppetserver::volatile generate datacenter database

https://gerrit.wikimedia.org/r/1181090

Change #1183599 had a related patch set uploaded (by Slyngshede; author: Slyngshede):

[operations/puppet@production] P:puppetserver::volatile fix group name

https://gerrit.wikimedia.org/r/1183599

Change #1183599 merged by Slyngshede:

[operations/puppet@production] P:puppetserver::volatile fix group name

https://gerrit.wikimedia.org/r/1183599

Change #1183612 had a related patch set uploaded (by Slyngshede; author: Slyngshede):

[operations/puppet@production] P:puppetserver::volatile enable datacenter timer

https://gerrit.wikimedia.org/r/1183612

Change #1183612 merged by Slyngshede:

[operations/puppet@production] P:puppetserver::volatile enable datacenter timer

https://gerrit.wikimedia.org/r/1183612

Change #1184037 had a related patch set uploaded (by Slyngshede; author: Slyngshede):

[operations/puppet@production] P:cache::haproxy copy datacenter.mmdb

https://gerrit.wikimedia.org/r/1184037

Change #1184497 had a related patch set uploaded (by Slyngshede; author: Slyngshede):

[operations/puppet@production] P:cache::haproxy add datacenter information to provenance

https://gerrit.wikimedia.org/r/1184497

Change #1184037 merged by Slyngshede:

[operations/puppet@production] P:cache::haproxy copy datacenter.mmdb

https://gerrit.wikimedia.org/r/1184037

Change #1182763 merged by Slyngshede:

[operations/puppet@production] P:cache:haproxy add is_datacenter Lua action

https://gerrit.wikimedia.org/r/1182763

Change #1182782 merged by Slyngshede:

[operations/puppet@production] P:cache:haproxy add datacenter information to provenance

https://gerrit.wikimedia.org/r/1182782

Change #1179136 abandoned by Slyngshede:

[operations/puppet@production] P:cache::haproxy add ASN lookup function

Reason:

Replaced by https://gerrit.wikimedia.org/r/c/operations/puppet/+/1182782

https://gerrit.wikimedia.org/r/1179136

Change #1180711 merged by Giuseppe Lavagetto:

[operations/puppet@production] varnish: add new requestctl file for deprecations

https://gerrit.wikimedia.org/r/1180711

Change #1180712 merged by Giuseppe Lavagetto:

[operations/puppet@production] Add deprecations to varnish

https://gerrit.wikimedia.org/r/1180712

Change #1178866 abandoned by Slyngshede:

[operations/puppet@production] C:ip_reputation_vendors::datacenter_vendors: Known datacenters

Reason:

Done using mmdb instead.

https://gerrit.wikimedia.org/r/1178866

https://gerrit.wikimedia.org/r/c/operations/puppet/+/1180712 seems to have broken the varnish tests. A look through the code suggests this is because profile::cache::varnish::frontend::use_etcd_req_filters is set to "false" in hieradata/common/profile/cache/varnish/frontend.yaml, so the test gets a 200 instead of a 403.

Change #1192846 had a related patch set uploaded (by Slyngshede; author: Slyngshede):

[operations/puppet@production] P:cache::haproxy copy private repo data

https://gerrit.wikimedia.org/r/1192846

Change #1184497 abandoned by Slyngshede:

[operations/puppet@production] P:cache::haproxy add datacenter information to provenance

https://gerrit.wikimedia.org/r/1184497

Change #1192846 merged by Slyngshede:

[operations/puppet@production] P:cache::haproxy copy private repo data

https://gerrit.wikimedia.org/r/1192846

Change #1197231 had a related patch set uploaded (by Slyngshede; author: Slyngshede):

[operations/puppet@production] P::cache::haproxy enable x-is-browser everywhere

https://gerrit.wikimedia.org/r/1197231

Change #1197231 merged by Slyngshede:

[operations/puppet@production] P::cache::haproxy enable x-is-browser everywhere

https://gerrit.wikimedia.org/r/1197231

Should the broken tests mentioned in T398161#11227347 be raised in a new ticket?

Not resolved due to broken tests.

Change #1199792 had a related patch set uploaded (by Vgutierrez; author: Vgutierrez):

[operations/puppet@production] haproxy,varnish: Report X-Is-Brower back from varnish

https://gerrit.wikimedia.org/r/1199792

Change #1199792 merged by Vgutierrez:

[operations/puppet@production] haproxy,varnish: Report X-Is-Browser back from varnish

https://gerrit.wikimedia.org/r/1199792