ipblocks support for other "entities" (not clouds, not abuse nets)
Open, Needs Triage, Public

Description

One example of this: the Googlebot IP space, plus other Google-owned IP space that belongs to neither Googlebot nor Google Cloud customers.

Event Timeline

Change 777899 had a related patch set uploaded (by RLazarus; author: RLazarus):

[operations/puppet@production] external_clouds_vendors: Support entity types besides "cloud"

https://gerrit.wikimedia.org/r/777899

Ipblock, per se, supports arbitrary scope names.

What we need is to add support for these other scopes in VCL.

My proposal would be to ditch the X-Public-Cloud header going forward, add an X-SRE header (or whatever other name we want to give it), and collect all this information as tags in it. So, for example, a request from an IP in Google Cloud will have X-SRE: cloud=gcp as a fake header, while a request coming from a known crawler IP space may have a combined value, like X-SRE: cloud=aws;crawler=wikiband.

This has the disadvantage of making our VCL clunkier (every new category will need the same piece of code handling appending values to a header) but is future-proof and gives us a lot more flexibility.
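To make the combined-value format concrete, here is a minimal Python sketch of how such a header value could be assembled and read back. It is not VCL and not the actual implementation; the function names are hypothetical, and the `key=value;key=value` syntax is taken only from the examples above.

```python
# Sketch only: models the proposed "X-SRE: cloud=aws;crawler=wikiband" value.
# build_sre_header / parse_sre_header are hypothetical names for illustration.

def build_sre_header(tags: dict[str, str]) -> str:
    """Join category=value tags with ';', in insertion order."""
    return ";".join(f"{key}={value}" for key, value in tags.items())


def parse_sre_header(value: str) -> dict[str, str]:
    """Split a combined header value back into a category -> value mapping."""
    tags: dict[str, str] = {}
    for part in value.split(";"):
        if not part:
            continue  # tolerate empty segments such as a trailing ';'
        key, _, val = part.partition("=")
        tags[key.strip()] = val.strip()
    return tags


if __name__ == "__main__":
    header = build_sre_header({"cloud": "aws", "crawler": "wikiband"})
    print("X-SRE:", header)
    print(parse_sre_header(header))
```

Every consumer of the header has to do this kind of parsing, and every new category has to append to the same value, which is the clunkiness mentioned above.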

I suggest something simpler:

Use a common prefix in the header name, with the name of the ipblock group as the suffix.

X-SRE-Ipblock-Cloud
X-SRE-Ipblock-CrawlerEntity

etc etc
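As a rough illustration of this naming scheme, the Python sketch below derives one header name per ipblock scope. The helper names and the `crawler-entity` scope are assumptions made for the example; the camel-casing rule is inferred from the two header names listed above and is not a settled convention.

```python
# Sketch only: one header per ipblock scope, sharing a common prefix.
# scope_to_header / headers_for are hypothetical helpers, not existing code.

PREFIX = "X-SRE-Ipblock-"


def scope_to_header(scope: str) -> str:
    """Map a scope name such as 'cloud' or 'crawler-entity' to a header name."""
    parts = scope.replace("_", "-").split("-")
    return PREFIX + "".join(part.capitalize() for part in parts if part)


def headers_for(matches: dict[str, str]) -> dict[str, str]:
    """Turn scope -> matched ipblock name into header name -> value."""
    return {scope_to_header(scope): name for scope, name in matches.items()}


if __name__ == "__main__":
    print(headers_for({"cloud": "aws", "crawler-entity": "wikiband"}))
```

With this layout each scope can be matched with a plain header comparison, at the cost of one header per scope instead of a single combined one.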

Change 777899 merged by RLazarus:

[operations/puppet@production] external_clouds_vendors: Support entity types besides "cloud"

https://gerrit.wikimedia.org/r/777899

Change 779149 had a related patch set uploaded (by RLazarus; author: RLazarus):

[operations/puppet@production] external_clouds_vendors: Remove migration shim for T305581

https://gerrit.wikimedia.org/r/779149

Change 779157 had a related patch set uploaded (by RLazarus; author: RLazarus):

[operations/puppet@production] external_cloud_vendors: Add a known-clients/Googlebot ipblock

https://gerrit.wikimedia.org/r/779157

Change 779444 had a related patch set uploaded (by Volans; author: Volans):

[operations/puppet@production] cloud vendors: force yaml output format

https://gerrit.wikimedia.org/r/779444

Change 779149 merged by RLazarus:

[operations/puppet@production] external_clouds_vendors: Remove migration shim for T305581

https://gerrit.wikimedia.org/r/779149

Change 779444 merged by Volans:

[operations/puppet@production] cloud vendors: force yaml output format

https://gerrit.wikimedia.org/r/779444

Change 779157 merged by RLazarus:

[operations/puppet@production] external_cloud_vendors: Add a known-clients/Googlebot ipblock

https://gerrit.wikimedia.org/r/779157

@RLazarus were you going to work on the rest of this? We still need more plumbing inside requestctl, correct?

Yeah -- I can do the implementation but I'm not sure if we've settled on what we want it to look like.

I don't have a strong opinion between the two header proposals above. Did you and @Joe settle on a preference?

Change 784761 had a related patch set uploaded (by RLazarus; author: RLazarus):

[operations/puppet@production] varnish: Rename public_clouds.json to ipblock_cloud.json

https://gerrit.wikimedia.org/r/784761

Change 784774 had a related patch set uploaded (by RLazarus; author: RLazarus):

[operations/puppet@production] varnish: Allow using netmapper with multiple requestctl ipblocks

https://gerrit.wikimedia.org/r/784774

Change 784798 had a related patch set uploaded (by RLazarus; author: RLazarus):

[operations/puppet@production] cache: Support multiple requestctl ipblock types in netmapper confd template

https://gerrit.wikimedia.org/r/784798