Page MenuHomePhabricator

haproxy: strip x-wmf-* headers from responses
Closed, ResolvedPublic

Description

Headers using the prefix x-wmf-* are used internally be the API/REST gateway. They may be included in the response for analytics purposes, but should not be exposed to the client.

Details

Related Changes in Gerrit:
Related Changes in GitLab:
TitleReferenceAuthorSource BranchDest Branch
Introduce x_ratelimit_class new variablerepos/sre/haproxykafka!105fabfurx-wmf-ratelimit-classmain
Customize query in GitLab

Event Timeline

Change #1243870 had a related patch set uploaded (by Fabfur; author: Fabfur):

[operations/puppet@production] cache::haproxy: save x-wmf-ratelimit-class content for webrequest

https://gerrit.wikimedia.org/r/1243870

could I suggest using a dedicated prefix for API/REST gateway headers? We already use internally X-WMF as a prefix on other layers that aren't related to API/REST gateways.

Also in general I'm worried that if we establish that any backend can emit arbitrary x-wmf-* headers and the CDN will silently drop them, we create an incentive for services to overuse this namespace without coordination. That leads to header bloat on internal hops and makes it harder to reason about which headers are meaningful vs accidental.

I don't want us to make anything under x-wmf be stripped automatically. It limits ourselves in the future.

If you want to go with a prefix approach, for which I have similar doubts to what @Vgutierrez just expressed, I'd rather use something like x-internal- or similar so it's clear, even from the header name, that it should be cleaned and not exposed to the world.

But if we go this way, then we should adopt it for every header we use that has only internal usage, so we can simplify our edge cleanup code.

The gateway is adding x-wmf-ratelimit-class and x-wmf-ratelimit-policy and x-wmf-user-id to request headers it forwards to backend services. The primary purpose is internal to Envoy, but it seemed useful to have this information available in backend services. We could stop doing that if this is too much bloat.

It seemed like a good idea to use the same header names in both request and response, to avoid confusion.

Furthermore, using just x-ratelimit as a prefix would conflict with a different kind of ratelimit response headers which are used by Envoy internally (but we filter out).

We can change the prefix to x-internal or x-restgw or x-api for sure, we'd just have to update the envoy config and some index definitions for logstash. I don't think the current header names are referenced anywhere else.

We can also just strip the one header for now, and postpone the discussion...

Change #1247034 had a related patch set uploaded (by Fabfur; author: Fabfur):

[operations/puppet@production] varnish: add headers to x-analytics

https://gerrit.wikimedia.org/r/1247034

Change #1243870 abandoned by Fabfur:

[operations/puppet@production] cache::haproxy: save x-ratelimit-class content for webrequest

Reason:

superseded by 1247034

https://gerrit.wikimedia.org/r/1243870

For now, let's remove the individual headers at the edge; we will have time to come up with a prefix naming (if any) that we want to use going forward, and to convert the headers to those values.

Change #1247034 merged by Fabfur:

[operations/puppet@production] varnish: add trusted_req and rl_class fields to x-analytics

https://gerrit.wikimedia.org/r/1247034

Raine moved this task from "Should I do it?" triage to Needs Info / Blocked on the User-Raine board.
Raine moved this task from Needs Info / Blocked to Radar on the User-Raine board.
Raine subscribed.
daniel claimed this task.

Done for x-wmf-ratelimit-class for now. Will revisit for a more general solution.