Right now we're running three versions of Envoy in production: 1.15.4 and 1.15.5 (mixed across appservers and misc services hosts), and 1.18.3 (on cp hosts for T271421). On Kubernetes we're using 1.15.4 and 1.18.3.
None of these versions is currently supported; we should update to either 1.20.x or 1.21.x. (All else being equal, we might as well get fully up to date and use 1.21, but if there are feature or compatibility issues, or simply a preference for the version with more miles on it, 1.20 is still viable.)
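As a rough illustration of the gap, here is a minimal sketch that checks the versions listed above against the two candidate release lines. The host-group labels and the helper names (`parse_version`, `is_supported`, `unsupported_versions`) are made up for this example, and the supported set is simply the 1.20 and 1.21 lines named above, not something queried from upstream.

```python
# Versions taken from the description above; the group labels are informal.
RUNNING = {
    "appservers / misc services hosts": ["1.15.4", "1.15.5"],
    "cp hosts": ["1.18.3"],
    "Kubernetes": ["1.15.4", "1.18.3"],
}

# Assumption: only the 1.20 and 1.21 release lines count as supported,
# per the statement above.
SUPPORTED_MINORS = {(1, 20), (1, 21)}


def parse_version(version):
    """Turn a string such as '1.18.3' into the tuple (1, 18, 3)."""
    return tuple(int(part) for part in version.split("."))


def is_supported(version):
    """True if the version's major.minor line is in SUPPORTED_MINORS."""
    return parse_version(version)[:2] in SUPPORTED_MINORS


def unsupported_versions(running):
    """Return the sorted, de-duplicated unsupported versions across all groups."""
    found = {
        version
        for versions in running.values()
        for version in versions
        if not is_supported(version)
    }
    return sorted(found, key=parse_version)


if __name__ == "__main__":
    print(unsupported_versions(RUNNING))
```

Versions are compared as integer tuples rather than strings, so that a line such as 1.9 would sort before 1.15.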
Prereqs:
[x] Use v3 configuration API everywhere (done in https://gerrit.wikimedia.org/r/754460)
[ ] Check all the intermediate [[ https://www.envoyproxy.io/docs/envoy/latest/version_history/version_history | release notes ]] for any other compatibility issues in our config that need to be resolved before we begin
[ ] Choose a target version, 1.20.x or 1.21.x. I'll inquire with Traffic and with the API Gateway folks about any preference between the two.
Here's one way the rollout might go (exact plan still TBD):
[x] Update everything to 1.15.5, the current master version at [[ https://gerrit.wikimedia.org/g/operations/debs/envoyproxy | operations/debs/envoyproxy ]]; that is, clean up 1.15.4 first
[ ] Test 1.18.3 in the [[ https://gerrit.wikimedia.org/r/plugins/gitiles/integration/config/+/refs/heads/master/dockerfiles/helm-linter/ | helm-linter ]] image, to verify the config is compatible and check for deprecation warnings
[ ] Advance the master branch to 1.18.3 (the current envoy-future version)
[ ] Roll out 1.18.3 to all Envoy environments
[ ] Test the target version (TBD) in the helm-linter image
[ ] Advance the envoy-future branch to the target version
[ ] Roll out the target version to all environments that use envoy-future
[ ] Advance the master branch to the target version
[ ] Roll out the target version everywhere
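For the "check for deprecation warnings" part of the test steps, a minimal sketch of the kind of filter that could be run over captured Envoy output (for example, from running Envoy in validation mode with `--mode validate`). This is not what the helm-linter image actually does. The function name is hypothetical, the sample log lines are invented, and the sketch assumes deprecation messages contain some form of the word "deprecated", which may not hold for every Envoy version.

```python
import re

# Assumption: deprecation messages contain some form of the word "deprecated";
# the exact wording Envoy uses may differ between versions.
DEPRECATION_PATTERN = re.compile(r"deprecat", re.IGNORECASE)


def find_deprecation_warnings(output):
    """Return the lines of captured Envoy output that mention a deprecation."""
    return [line for line in output.splitlines() if DEPRECATION_PATTERN.search(line)]


if __name__ == "__main__":
    # Invented sample output, for illustration only.
    sample = "\n".join([
        "[info] configuration loaded",
        "[warning] Using deprecated option 'example.field'",
        "[info] all clusters initialized",
    ])
    for line in find_deprecation_warnings(sample):
        print(line)
```

An empty result only means no line matched the pattern; it does not prove the config is free of deprecated fields, so the release notes still need to be checked.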