x-provenance header: identify WMCS
Open, Needs Triage, Public

Description

For metrics and rate limiting purposes, it would be useful to have a straightforward way to distinguish traffic coming from WMCS from other internal traffic. It seems like it might currently be tagged with net=wikimedia-trust or net=internal. It would be nice to have net=wmcs. Or maybe cloud=wmcs?

Ideally, we could further distinguish between WMCS and some of the special projects on it like Toolforge and Beta.
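To illustrate how a consumer could use such a tag, here is a minimal sketch. It assumes the x-provenance value is a list of key=value pairs separated by semicolons or whitespace; that wire format, the `net=wmcs` value (which is only proposed in this task), and the helper names `parse_provenance` and `is_wmcs` are all assumptions for illustration, not the actual header specification.

```python
def parse_provenance(header_value: str) -> dict[str, str]:
    """Parse a list of key=value pairs into a dict.

    Assumption: pairs are separated by ';' and/or whitespace. The real
    x-provenance syntax may differ.
    """
    tags = {}
    for part in header_value.replace(";", " ").split():
        key, sep, value = part.partition("=")
        if sep:  # skip tokens that are not key=value pairs
            tags[key.strip()] = value.strip()
    return tags


def is_wmcs(header_value: str) -> bool:
    """Return True if the (proposed, hypothetical) net=wmcs tag is present."""
    return parse_provenance(header_value).get("net") == "wmcs"
```

With a dedicated tag, a rate limiter or metrics pipeline would only need a check like `is_wmcs(...)` instead of its own copy of the IP ranges.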

Event Timeline

daniel added a subscriber: taavi.

For the record, I asked @taavi about including information about the tool or user in requests coming from WMCS. He said it's not possible: the tools use HTTPS, and we can't modify the encrypted traffic.

I don't think we currently have any places outside of https://wikitech.wikimedia.org/wiki/Help:Cloud_VPS_IP_space that publish our IP space. Would it be helpful if we published the same information in some machine-readable format?

Probably... maybe this could use the same mechanism we use for "known clients" like googlebot? Curious what @CDanis thinks.

I think this would make sense on both counts.

We already support the Googlebot JSON format, which has become something of a de facto standard.

Both cloud IP ranges and known clients (identified bots) are maintained by fetch_external_clouds_vendors_nets.py in the puppet repo.
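For readers unfamiliar with the format, here is a rough sketch of reading a Googlebot-style prefix list and checking an address against it. It assumes the layout Google publishes for Googlebot (a top-level `prefixes` array whose entries carry either `ipv4Prefix` or `ipv6Prefix`). The sample data uses documentation-only ranges, and `load_prefixes` and `ip_in_prefixes` are hypothetical helper names, not functions from fetch_external_clouds_vendors_nets.py.

```python
import ipaddress
import json

# Sample document in the assumed Googlebot-style layout.
# The prefixes are reserved documentation ranges, not real WMCS ranges.
SAMPLE = """
{
  "creationTime": "2025-01-01T00:00:00.000000",
  "prefixes": [
    {"ipv4Prefix": "192.0.2.0/24"},
    {"ipv6Prefix": "2001:db8::/32"}
  ]
}
"""


def load_prefixes(document: str):
    """Return the IPv4/IPv6 networks listed in a Googlebot-style JSON string."""
    data = json.loads(document)
    networks = []
    for entry in data.get("prefixes", []):
        cidr = entry.get("ipv4Prefix") or entry.get("ipv6Prefix")
        if cidr:
            networks.append(ipaddress.ip_network(cidr))
    return networks


def ip_in_prefixes(ip: str, networks) -> bool:
    """Check whether an address falls inside any of the given networks."""
    address = ipaddress.ip_address(ip)
    # Membership tests across IP versions simply evaluate to False.
    return any(address in net for net in networks)
```

For example, `ip_in_prefixes("192.0.2.10", load_prefixes(SAMPLE))` evaluates to `True` for the sample data above.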

Do you have a link to an example or spec?

Ok, we now publish https://meta.wmcloud.org/cloudvps-ips-all.json (which is now documented at https://wikitech.wikimedia.org/wiki/Help:Cloud_VPS_IP_space#Machine-readable_data). I can't quite tell from the comments here - would a similar file for the Toolforge workers be helpful as well?

Wow, that was quick, thank you!

> I can't quite tell from the comments here - would a similar file for the Toolforge workers be helpful as well?

Yes, please!

Getting the IP ranges documented is a great first step -- thanks, @taavi! @KCVelaga_WMF -- this step might be useful for your current API traffic analysis work, especially if anything differs from the values you're already using.

Just chiming in here -- the original intent of this request is very important to the overall API goals for rate limiting, improving observability, and ultimately decision making. Although the A category within the x-trusted-request flag includes WMCS (as well as other Wikimedia network traffic), it would indeed be valuable to better differentiate internal processes from true WMCS tools and services. Specifically, knowing what is Toolforge vs VPS vs other internal traffic will also help us understand true utilization and what is actually happening.

From a longer term ownership and consistency perspective, it probably makes more sense for that classification to come from the edge as part of x-provenance instead of having the Gateway apply additional classifications using the IP ranges directly, though. For more context, we are interested in bringing the header categories into the data lake for additional and more scalable analysis, so it would be very nice if it was consistently categorized instead of relying on additional post-processing where logic may ultimately diverge between pipelines and data sources.
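As a rough sketch of what a single, shared classification step could look like, the snippet below labels an address by its most specific matching range, so that a narrower range (e.g. Toolforge) takes precedence over the broader range that contains it. The labels, the ranges (reserved documentation and private blocks), and the `classify` helper are all hypothetical; the real values would come from the published IP data, and the real logic would live wherever the edge computes x-provenance.

```python
import ipaddress

# Hypothetical label/range pairs, for illustration only.
LABELED_RANGES = [
    ("internal", "10.0.0.0/8"),
    ("wmcs", "198.51.100.0/24"),
    ("toolforge", "198.51.100.128/25"),  # nested inside the "wmcs" range
]

NETWORKS = [(label, ipaddress.ip_network(cidr)) for label, cidr in LABELED_RANGES]


def classify(ip: str, default: str = "external") -> str:
    """Return the label of the most specific (longest-prefix) matching range."""
    address = ipaddress.ip_address(ip)
    matches = [
        (net.prefixlen, label)
        for label, net in NETWORKS
        if address.version == net.version and address in net
    ]
    if not matches:
        return default
    # The longest prefix wins, so nested ranges override their parents.
    return max(matches)[1]
```

Computing the label once and passing it along in a header means the Gateway, the data lake pipelines, and any dashboards all see the same category, rather than each re-deriving it from IP ranges.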

Is there anything y'all need from us that would help get this work prioritized? @JTweed-WMF can confirm, but I believe this will be a key dependency for the Q3 WE5.2 hypothesis work on rate limiting and improving API observability (i.e., getting more data into web_requests and other data lake tables). We unfortunately don't have the specific hypothesis numbers quite yet, though.

> Although the A category within the x-trusted-request flag includes WMCS (as well as other Wikimedia network traffic), it would indeed be valuable to better differentiate internal processes from true WMCS tools and services.

Ftr, the A category currently also includes Enterprise, but that can already be distinguished based on the net=wme tag in x-provenance.

Change #1217466 had a related patch set uploaded (by Slyngshede; author: Slyngshede):

[operations/puppet@production] C:external_clouds_vendors add WMCS

https://gerrit.wikimedia.org/r/1217466