
envoy overwrites the server header
Closed, Resolved (Public)

Description

envoy is overwriting the server header in some scenarios:

(eqsin) $ curl -v https://en.wikipedia.org/api/rest_v1/page/summary/Tremont_Street_Subway 2>&1 |grep server:
< server: envoy
(codfw) $ curl --resolve en.wikipedia.org:443:208.80.153.224 -v https://en.wikipedia.org/api/rest_v1/page/summary/Tremont_Street_Subway 2>&1 |grep server:
< server: restbase2014

Event Timeline

Vgutierrez triaged this task as Medium priority. Nov 12 2019, 9:50 AM

Change 550436 had a related patch set uploaded (by Vgutierrez; owner: Vgutierrez):
[operations/puppet@production] envoyproxy: Avoid overwriting existing server header

https://gerrit.wikimedia.org/r/550436

Change 550436 merged by Vgutierrez:
[operations/puppet@production] envoyproxy: Avoid overwriting existing server header

https://gerrit.wikimedia.org/r/550436

Vgutierrez claimed this task.

Fixed with 2428105:

(eqsin) $ curl -v https://en.wikipedia.org/api/rest_v1/page/summary/Tremont_Street_Subway 2>&1 |grep server:
< server: restbase1017
mobrovac added a project: RESTBase.
mobrovac subscribed.

This doesn't seem to be working as expected. On the client, I always get Server: envoy:

$ curl https://test.wikipedia.org/api/rest_v1/page/html/Testparsoidphp -v
< HTTP/2 200
< server: envoy

Ok, a bit of digging:

# From the public internet

$ for dc in eqiad codfw esams eqsin ulsfo; do echo -n "$dc: "; curl --resolve test.wikipedia.org:443:$(dig +short text-lb.$dc.wikimedia.org) "https://test.wikipedia.org/api/rest_v1/page/html/Testparsoidphp" -v 2>&1 | grep -F server:; done
eqiad: < server: restbase1018
codfw: < server: restbase2012
esams: < server: envoy
eqsin: < server: envoy
ulsfo: < server: envoy

So this seems to be mysteriously tied to having an ATS-BE making the request to restbase.

Not very mysteriously: the edge sites use ATS-BE, so they call envoy, while the main DCs are still contacting restbase directly. Meh.

And indeed it seems things are not working as expected:

restbase2015:~$ curl restbase2015:7231/de.wikipedia.org/v1/page/references/Der_Junge_mit_dem_gro%C3%9Fen_schwarzen_Hund -Is | grep -F server:
server: restbase2015
restbase2015:~$ curl -k https://restbase2015:7443/de.wikipedia.org/v1/page/references/Der_Junge_mit_dem_gro%C3%9Fen_schwarzen_Hund -Is | grep -F server:
server: envoy
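
The checks in this task grep for a lowercase `server:`, which works for the responses shown here but would miss a backend that sends `Server:`. As a rough sketch only, a small helper (the name `server_header` is hypothetical, not something used on the hosts) can extract the header case-insensitively from either `curl -Is` or `curl -v` output:

```shell
# Hypothetical helper: read HTTP response headers on stdin and print the
# value of the first Server header, regardless of letter case.
server_header() {
    # Drop carriage returns, strip the "< " prefix that `curl -v` adds,
    # then match the header name case-insensitively and print its value.
    tr -d '\r' | awk '
        { sub(/^< /, "") }
        tolower($1) == "server:" { sub(/^[^:]*:[ \t]*/, ""); print; exit }
    '
}
```

For example, `curl -k https://restbase2015:7443/... -Is | server_header` would print only the header value, which makes the direct and via-envoy results easy to compare in a loop.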

This is because the configuration directive we're trying to use was introduced in envoy 1.12.0, which we are not running yet.

See the docs about the server header here:
https://www.envoyproxy.io/docs/envoy/v1.11.2/configuration/http_conn_man/headers#server
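
The Puppet change itself is not quoted in this task. As an illustration only, assuming the directive in question is the HTTP connection manager's `server_header_transformation` field (available from Envoy 1.12.0), the relevant listener fragment might look roughly like this. The `stat_prefix` value is hypothetical, and routing and the rest of the listener are omitted:

```yaml
# Fragment of a listener filter chain (Envoy 1.12, v2 API); not a complete config.
filters:
- name: envoy.http_connection_manager
  typed_config:
    "@type": type.googleapis.com/envoy.config.filter.network.http_connection_manager.v2.HttpConnectionManager
    stat_prefix: ingress_http   # hypothetical name
    # OVERWRITE (the default) always sets "server: envoy".
    # APPEND_IF_ABSENT keeps a Server header already set by the upstream.
    # PASS_THROUGH leaves the header untouched.
    server_header_transformation: APPEND_IF_ABSENT
```

On versions before 1.12.0 the field is not available, which would match the `server: envoy` responses seen above on port 7443.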

Mentioned in SAL (#wikimedia-operations) [2019-11-22T14:49:48Z] <_joe_> disabling puppet on restbase2018, testing envoy upgrade T238050

Confirmed the upgrade fixes the Server: header output:

restbase2018:~$ curl -k https://restbase2018:7443/de.wikipedia.org/v1/page/references/Der_Junge_mit_dem_gro%C3%9Fen_schwarzen_Hund -Is | fgrep server:
server: restbase2018

@Vgutierrez I think you can just upgrade envoy across the fleet when you feel confident enough.

Mentioned in SAL (#wikimedia-operations) [2019-12-11T14:19:37Z] <rlazarus> updating envoyproxy to 1.12.2 on mwmaint, restbase T238050

Mentioned in SAL (#wikimedia-operations) [2019-12-11T14:43:26Z] <rlazarus> updating envoyproxy to 1.12.2 on all codfw T238050

Mentioned in SAL (#wikimedia-operations) [2019-12-11T14:45:01Z] <rlazarus> updating envoyproxy to 1.12.2 on all eqiad T238050

Change 556386 had a related patch set uploaded (by RLazarus; owner: RLazarus):
[operations/deployment-charts@master] blubberoid: Specify Envoy version 1.12.2-1

https://gerrit.wikimedia.org/r/556386

Change 556386 merged by jenkins-bot:
[operations/deployment-charts@master] blubberoid: Specify Envoy version 1.12.2-1

https://gerrit.wikimedia.org/r/556386

Joe reassigned this task from Vgutierrez to RLazarus.