Upgrade Envoy to v1.32.12
Closed, Resolved; Public

Description

As of this writing we're still concluding T403663 for the 1.26 -> 1.29 bump, but in parallel we can start planning the next step, 1.29 -> 1.32.

Release notes of potential interest (1.30, 1.31, 1.32):

Config changes post-upgrade

  • (1.30.0) listener: deprecated runtime key overload.global_downstream_max_connections in favor of downstream connections monitor.

We noted this in the last upgrade but left it unchanged because, although the runtime key was already deprecated, its replacement wasn't ready yet. The "work in progress" warning was removed from the downstream connections monitor docs as of 1.30, so we'll go ahead and switch to it.

(The new setting can't actually be changed at runtime, at least until we can do so via xDS message, but I believe we aren't taking advantage of runtime changes anywhere.)
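For reference, here is a minimal sketch of what the replacement setting could look like, written as a small Python helper that emits the bootstrap fragment as JSON. The monitor name matches the one in Envoy's deprecation warning; the helper name and the limit of 50000 are placeholders for illustration, not our actual values, and where the fragment gets merged into our templates is left open.

```python
import json

# Monitor name as it appears in Envoy's deprecation warning.
MONITOR_NAME = "envoy.resource_monitors.global_downstream_max_connections"
# Type URL of the downstream connections monitor config message.
MONITOR_TYPE = (
    "type.googleapis.com/envoy.extensions.resource_monitors."
    "downstream_connections.v3.DownstreamConnectionsConfig"
)


def downstream_connections_monitor(max_connections: int) -> dict:
    """Build an overload_manager fragment that caps active downstream connections.

    Hypothetical helper for illustration; the limit must be positive.
    """
    if max_connections <= 0:
        raise ValueError("max_connections must be greater than zero")
    return {
        "overload_manager": {
            "resource_monitors": [
                {
                    "name": MONITOR_NAME,
                    "typed_config": {
                        "@type": MONITOR_TYPE,
                        "max_active_downstream_connections": max_connections,
                    },
                }
            ]
        }
    }


if __name__ == "__main__":
    # 50000 is a placeholder value, not a recommendation.
    print(json.dumps(downstream_connections_monitor(50000), indent=2))
```

This fragment would replace the `overload.global_downstream_max_connections` entry in the runtime layer, so the two shouldn't need to coexist once the config change lands.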

Tracing updates

This list excludes the "New feature" entries and anything I could determine wouldn't cause compatibility issues for us. The remaining items need verifying with tracing experts, to confirm that we don't need to make any changes.

  • (1.32.0) tracers: Set status code based on gRPC status code for OpenTelemetry tracers (previously unset).
  • (1.31.0) tracers: Set status code for OpenTelemetry tracers (previously unset).
  • (1.31.0) tracing: Fix an issue where span id is missing from OpenTelemetry access log entries.
  • (1.30.0) tracers: use unary RPC calls for OpenTelemetry trace exports, rather than client-side streaming connections.

HTTP/1 and HTTP/2 parser changes

Probably no effect, especially since our Envoys receive no untrusted traffic, but documenting these in case of edge-case behavior changes. (Note that the 1.31.0 and 1.32.0 oghttp2 changes cancel out: the net effect is that oghttp2 is off by default, as it was previously.)

  • (1.32.0) http2: Changed the default value of envoy.reloadable_features.http2_use_oghttp2 to false. This changes the codec used for HTTP/2 requests and responses to address stability concerns. This behavior can be reverted by setting the feature to true.
  • (1.31.0) http2: Changes the default value of envoy.reloadable_features.http2_use_oghttp2 to true. This changes the codec used for HTTP/2 requests and responses. This behavior can be reverted by setting the feature to false.
  • (1.30.0) http: Enable obsolete line folding in BalsaParser (for behavior parity with http-parser, the previously used HTTP/1 parser).

This update also fixes the following security issues:

Fixed in 1.33.1 / 1.32.4 / 1.31.6 / 1.30.10:

Envoy crashes when HTTP ext_proc processes local replies (CVE-2025-30157)
https://github.com/envoyproxy/envoy/security/advisories/GHSA-cf3q-gqg7-3fm9

Fixed in 1.34.1 / 1.33.3 / 1.32.6 / 1.31.8:

Bypass of RBAC uri_template permission (CVE-2025-46821)
https://github.com/envoyproxy/envoy/security/advisories/GHSA-c7cm-838g-6g67

Fixed in 1.35.2 / 1.34.6 / 1.33.8 / 1.32.11:

Use after free in DNS cache (CVE-2025-54588)
https://github.com/envoyproxy/envoy/security/advisories/GHSA-g9vw-6pvx-7gmw

oAuth2 Filter Signout route will not clear cookies because of missing "secure;" flag (CVE-2025-55162)
https://github.com/envoyproxy/envoy/security/advisories/GHSA-95j4-hw7f-v2rh

Event Timeline

Change #1191768 had a related patch set uploaded (by RLazarus; author: RLazarus):

[operations/debs/envoyproxy@v1.32] Update to v1.32.12

https://gerrit.wikimedia.org/r/1191768

Change #1191768 merged by RLazarus:

[operations/debs/envoyproxy@v1.32] Update to v1.32.12

https://gerrit.wikimedia.org/r/1191768

Mentioned in SAL (#wikimedia-operations) [2025-09-26T22:41:29Z] <rzl> rzl@apt1002:~$ sudo -i reprepro -C component/envoy-future include bullseye-wikimedia /home/rzl/envoyproxy/envoyproxy_1.32.12-1_amd64.changes # T405808

Change #1191772 had a related patch set uploaded (by RLazarus; author: RLazarus):

[operations/docker-images/production-images@master] envoy-future: Update to v1.32.12

https://gerrit.wikimedia.org/r/1191772

Change #1191772 merged by RLazarus:

[operations/docker-images/production-images@master] envoy-future: Update to v1.32.12

https://gerrit.wikimedia.org/r/1191772

Change #1191787 had a related patch set uploaded (by RLazarus; author: RLazarus):

[integration/config@master] helm-linter: Bump for envoy 1.32.12

https://gerrit.wikimedia.org/r/1191787

Change #1191788 had a related patch set uploaded (by RLazarus; author: RLazarus):

[integration/config@master] jjb: Update to helm-linter:0.7.5 to pick up envoy-future 1.32.12

https://gerrit.wikimedia.org/r/1191788

Change #1191787 merged by jenkins-bot:

[integration/config@master] Docker: [helm-linter]: Bump for envoy 1.32.12

https://gerrit.wikimedia.org/r/1191787

Mentioned in SAL (#wikimedia-releng) [2025-09-30T02:11:24Z] <James_F> Docker: [helm-linter]: Bump for envoy 1.32.12, for T405808

Change #1191788 merged by jenkins-bot:

[integration/config@master] jjb: [helm-lint] Update image to pick up envoy-future 1.32.12

https://gerrit.wikimedia.org/r/1191788

While testing this in mw-debug, I saw two Envoy warnings in the logs on startup:

[2025-10-27 21:58:03.859][1][warning][main] [source/server/server.cc:852] Usage of the deprecated runtime key overload.global_downstream_max_connections, consider switching to `envoy.resource_monitors.global_downstream_max_connections` instead.This runtime key will be removed in future.

That one's expected, and we'll follow up with a config change to use the new config setting instead of the old runtime setting. But this one was surprising:

[2025-10-27 21:58:03.790][1][warning][main] [source/server/server.cc:948] There is no configured limit to the number of allowed active downstream connections. Configure a limit in `envoy.resource_monitors.global_downstream_max_connections` resource monitor.

That makes it sound like the runtime key isn't just deprecated, it's already ignored. (In that case, we'd need to coordinate the config update with the binary update.)

But after some source-diving, it's not so: TcpListenerImpl::rejectCxOverGlobalLimit uses whichever value is set, and that warning is logged (misleadingly, IMO) even if the runtime key is still set. So we're still good to do our usual "binary, then config" process here.
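To make that concrete, here is a rough Python model of the behavior as I understand it. It is not a transcription of the Envoy source. The function names are made up for illustration, and the choice to prefer the resource monitor when both limits are set is an assumption on my part; the only things taken from the observations above are that whichever value is set gets enforced, and that the startup warning depends only on the resource monitor.

```python
from typing import Optional


def effective_connection_limit(
    monitor_limit: Optional[int], runtime_key_limit: Optional[int]
) -> Optional[int]:
    """Return the connection limit that applies, or None if there is no limit.

    Assumption: when both are set, the resource monitor wins.
    """
    if monitor_limit is not None:
        return monitor_limit
    return runtime_key_limit


def warns_no_configured_limit(monitor_limit: Optional[int]) -> bool:
    """Model of the startup warning: it only looks at the resource monitor,
    so it fires even when the deprecated runtime key is still set."""
    return monitor_limit is None


def reject_connection(
    active_connections: int,
    monitor_limit: Optional[int],
    runtime_key_limit: Optional[int],
) -> bool:
    """Reject a new connection once the active count reaches the limit."""
    limit = effective_connection_limit(monitor_limit, runtime_key_limit)
    return limit is not None and active_connections >= limit


if __name__ == "__main__":
    # Our current state after the binary upgrade: runtime key only.
    print(warns_no_configured_limit(None))       # the misleading warning fires
    print(reject_connection(50000, None, 50000))  # but the limit is still enforced
```

In this model the "binary, then config" order is safe: with only the runtime key set, the warning is logged but connections over the limit are still rejected.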

Change #1199079 had a related patch set uploaded (by RLazarus; author: RLazarus):

[operations/deployment-charts@master] mathoid: Upgrade to envoy-future:1.32.12 for validation

https://gerrit.wikimedia.org/r/1199079

Change #1199079 merged by jenkins-bot:

[operations/deployment-charts@master] mathoid: Upgrade to envoy-future:1.32.12 for validation

https://gerrit.wikimedia.org/r/1199079

Mentioned in SAL (#wikimedia-operations) [2025-10-27T23:03:18Z] <rzl> rzl@apt1002:~$ sudo -i reprepro -C main includedeb bullseye-wikimedia /srv/wikimedia/pool/component/envoy-future/e/envoyproxy/envoyproxy_1.32.12-1_amd64.deb # T405808

Mentioned in SAL (#wikimedia-operations) [2025-10-27T23:03:34Z] <rzl> rzl@apt1002:~$ sudo -i reprepro copy bookworm-wikimedia bullseye-wikimedia envoyproxy # T405808

Mentioned in SAL (#wikimedia-operations) [2025-10-27T23:03:42Z] <rzl> rzl@apt1002:~$ sudo -i reprepro copy trixie-wikimedia bullseye-wikimedia envoyproxy # T405808

Change #1199082 had a related patch set uploaded (by RLazarus; author: RLazarus):

[operations/docker-images/production-images@master] envoy: Update to v1.32.12

https://gerrit.wikimedia.org/r/1199082

Change #1199082 merged by RLazarus:

[operations/docker-images/production-images@master] envoy: Update to v1.32.12

https://gerrit.wikimedia.org/r/1199082

Change #1199085 had a related patch set uploaded (by RLazarus; author: RLazarus):

[operations/deployment-charts@master] {api,rest}-gateway: Update to Envoy 1.32.12 in staging

https://gerrit.wikimedia.org/r/1199085

Change #1199085 merged by jenkins-bot:

[operations/deployment-charts@master] {api,rest}-gateway: Update to Envoy 1.32.12 in staging

https://gerrit.wikimedia.org/r/1199085

Change #1199519 had a related patch set uploaded (by RLazarus; author: RLazarus):

[operations/deployment-charts@master] mw-*: Upgrade to Envoy 1.32.12 in the MW canary releases and mw-debug

https://gerrit.wikimedia.org/r/1199519

Change #1199519 merged by jenkins-bot:

[operations/deployment-charts@master] mw-*: Upgrade to Envoy 1.32.12 in the MW canary releases and mw-debug

https://gerrit.wikimedia.org/r/1199519

Mentioned in SAL (#wikimedia-operations) [2025-10-28T23:40:41Z] <rzl@deploy2002> Finished scap sync-world: https://gerrit.wikimedia.org/r/1199519 T405808 (duration: 03m 34s)

Mentioned in SAL (#wikimedia-operations) [2025-10-29T13:31:11Z] <moritzm> upgrade Envoy on debmonitor* T405808

Mentioned in SAL (#wikimedia-operations) [2025-10-29T15:47:03Z] <mutante> upgrade Envoy on releases* T405808

Mentioned in SAL (#wikimedia-operations) [2025-10-29T16:10:06Z] <mutante> upgrade Envoy on stewards* T405808

Mentioned in SAL (#wikimedia-operations) [2025-10-29T16:11:56Z] <mutante> upgrade Envoy on etherpad* T405808

Mentioned in SAL (#wikimedia-operations) [2025-10-29T17:36:48Z] <mutante> upgrade envoy on phab2002, vrts2002, contint2002 T405808

Mentioned in SAL (#wikimedia-operations) [2025-10-31T01:15:06Z] <mutante> upgraded envoyproxy on lists2001, aphlict1002, aphlict2001 T405808

Change #1201730 had a related patch set uploaded (by RLazarus; author: RLazarus):

[operations/deployment-charts@master] {api,rest}-gateway: Update to Envoy 1.32.12 in production

https://gerrit.wikimedia.org/r/1201730

Change #1201731 had a related patch set uploaded (by RLazarus; author: RLazarus):

[operations/deployment-charts@master] mw-*: Update to Envoy 1.32.12

https://gerrit.wikimedia.org/r/1201731

Change #1201732 had a related patch set uploaded (by RLazarus; author: RLazarus):

[operations/deployment-charts@master] mw-videoscaler: Update to Envoy 1.32.12

https://gerrit.wikimedia.org/r/1201732

Change #1201730 merged by jenkins-bot:

[operations/deployment-charts@master] {api,rest}-gateway: Update to Envoy 1.32.12 in production

https://gerrit.wikimedia.org/r/1201730

Change #1201731 merged by jenkins-bot:

[operations/deployment-charts@master] mw-*: Update to Envoy 1.32.12

https://gerrit.wikimedia.org/r/1201731

Mentioned in SAL (#wikimedia-operations) [2025-11-04T22:19:31Z] <rzl@deploy2002> Finished scap sync-world: https://gerrit.wikimedia.org/r/1201731 T405808 (duration: 05m 39s)

Change #1201732 merged by jenkins-bot:

[operations/deployment-charts@master] mw-videoscaler: Update to Envoy 1.32.12

https://gerrit.wikimedia.org/r/1201732

Tracing updates section LGTM, no config changes needed. Thanks!

Change #1207289 had a related patch set uploaded (by RLazarus; author: RLazarus):

[operations/puppet@production] kubernetes: Set default Envoy version to 1.32.12

https://gerrit.wikimedia.org/r/1207289

Change #1207289 merged by RLazarus:

[operations/puppet@production] kubernetes: Set default Envoy version to 1.32.12

https://gerrit.wikimedia.org/r/1207289

Mentioned in SAL (#wikimedia-operations) [2025-11-25T07:26:07Z] <moritzm> upgrade Envoy on puppet servers T405808

Mentioned in SAL (#wikimedia-operations) [2025-11-25T11:28:42Z] <Emperor> depool / upgrade / restart envoy / repool on thanos frontends T405808

Mentioned in SAL (#wikimedia-operations) [2025-11-25T11:50:31Z] <Emperor> depool / upgrade / restart envoy / repool on ms frontends T405808

Mentioned in SAL (#wikimedia-operations) [2025-11-25T12:19:58Z] <Emperor> depool / upgrade / restart envoy / repool on Apus frontends T405808

Mentioned in SAL (#wikimedia-operations) [2025-11-25T14:34:18Z] <moritzm> upgrade Envoy on webperf* T405808

Mentioned in SAL (#wikimedia-operations) [2025-11-25T18:41:40Z] <urandom> upgrading envoyproxy to v1.32.12, restbase1031 & restbase2024— T405808

Mentioned in SAL (#wikimedia-operations) [2025-11-25T19:07:17Z] <urandom> upgrading restbase cluster to envoyproxy v1.32.12 — T405808

Mentioned in SAL (#wikimedia-operations) [2025-11-26T00:20:13Z] <denisse> Upgrading envoy on Grafana hosts - T405808

Mentioned in SAL (#wikimedia-operations) [2025-11-26T00:20:56Z] <denisse> Upgrading envoy on prometheus1005.eqiad.wmnet - T405808

Mentioned in SAL (#wikimedia-operations) [2025-11-26T00:23:06Z] <denisse> Upgrading envoy on prometheus hosts - T405808

Mentioned in SAL (#wikimedia-operations) [2025-11-26T00:24:18Z] <denisse> Upgrading envoy on prometheus::pop hosts - T405808

Mentioned in SAL (#wikimedia-operations) [2025-11-26T00:26:11Z] <denisse> Upgrading envoy on Graphite hosts - T405808

Mentioned in SAL (#wikimedia-operations) [2025-11-26T00:28:27Z] <denisse> Upgrading envoy on 'logstash1023.eqiad.wmnet' - T405808

Mentioned in SAL (#wikimedia-operations) [2025-11-26T00:30:42Z] <denisse> Upgrading envoy on logstash hosts - T405808

Mentioned in SAL (#wikimedia-operations) [2025-11-26T00:32:38Z] <denisse> Upgrading envoy on 'titan1001.eqiad.wmnet' - T405808

Mentioned in SAL (#wikimedia-operations) [2025-11-26T00:33:50Z] <denisse> Upgrading envoy on titan hosts - T405808

Mentioned in SAL (#wikimedia-operations) [2025-11-26T06:42:09Z] <moritzm> upgrade Envoy on puppetboard* T405808

Mentioned in SAL (#wikimedia-operations) [2025-11-27T09:21:22Z] <moritzm> upgrade Envoy on cloudweb* T405808

Mentioned in SAL (#wikimedia-operations) [2025-11-27T16:17:26Z] <moritzm> upgrade Envoy on chartmuseum* T405808

Mentioned in SAL (#wikimedia-operations) [2025-12-01T08:50:33Z] <moritzm> upgrade Envoy on config-master* T405808

Mentioned in SAL (#wikimedia-operations) [2025-12-01T10:47:48Z] <moritzm> upgrade Envoy on matomo1001 T405808

Mentioned in SAL (#wikimedia-operations) [2025-12-01T11:29:04Z] <btullis> restarting envoyproxy process on cephosd100[1-5] for T405808

Mentioned in SAL (#wikimedia-operations) [2025-12-01T13:42:16Z] <moritzm> upgrade Envoy on deployment servers T405808

Mentioned in SAL (#wikimedia-operations) [2025-12-01T20:08:01Z] <mutante> upgrading envoyproxy on contint1002; phab1004; T405808

Mentioned in SAL (#wikimedia-operations) [2025-12-02T09:38:24Z] <moritzm> upgrade Envoy on parsoidtest/testreduce T405808

Mentioned in SAL (#wikimedia-operations) [2025-12-04T09:20:05Z] <arnoldokoth> upgrade envoyproxy on vrts T405808

Mentioned in SAL (#wikimedia-operations) [2025-12-04T09:22:16Z] <arnoldokoth> upgrade envoyproxy on lists T405808

Mentioned in SAL (#wikimedia-operations) [2025-12-04T09:48:58Z] <moritzm> upgrade Envoy on an-launcher T405808