Unexpected helmfile changes when attempting a k8s deployment for a miscweb site
Closed, Resolved (Public)

Description

While attempting to deploy some minor changes for security.wikimedia.org (T372570), I came across some unexpected helmfile changes on deploy1003. It looks like the envoy image used for TLS proxying is being changed. I'm not sure whether it's safe or advisable to deploy these to production miscweb sites. Here is the output of helmfile -e codfw diff --context 5:

helmfile.yaml: basePath=.
skipping missing values file matching "/etc/helmfile-defaults/private/main_services/miscweb/codfw.yaml"
skipping missing values file matching "values-codfw.yaml"
skipping missing values file matching "values-bugzilla-codfw.yaml"
skipping missing values file matching "/etc/helmfile-defaults/private/main_services/miscweb/codfw.yaml"
skipping missing values file matching "values-codfw.yaml"
skipping missing values file matching "values-design-landing-page-codfw.yaml"
skipping missing values file matching "/etc/helmfile-defaults/private/main_services/miscweb/codfw.yaml"
skipping missing values file matching "values-codfw.yaml"
skipping missing values file matching "values-research-landing-page-codfw.yaml"
skipping missing values file matching "/etc/helmfile-defaults/private/main_services/miscweb/codfw.yaml"
skipping missing values file matching "values-codfw.yaml"
skipping missing values file matching "values-design-strategy-codfw.yaml"
skipping missing values file matching "/etc/helmfile-defaults/private/main_services/miscweb/codfw.yaml"
skipping missing values file matching "values-codfw.yaml"
skipping missing values file matching "values-statictendril-codfw.yaml"
skipping missing values file matching "/etc/helmfile-defaults/private/main_services/miscweb/codfw.yaml"
skipping missing values file matching "values-codfw.yaml"
skipping missing values file matching "values-wikiworkshop-codfw.yaml"
skipping missing values file matching "/etc/helmfile-defaults/private/main_services/miscweb/codfw.yaml"
skipping missing values file matching "values-codfw.yaml"
skipping missing values file matching "values-annualreport-codfw.yaml"
skipping missing values file matching "/etc/helmfile-defaults/private/main_services/miscweb/codfw.yaml"
skipping missing values file matching "values-codfw.yaml"
skipping missing values file matching "values-static-codereview-codfw.yaml"
skipping missing values file matching "/etc/helmfile-defaults/private/main_services/miscweb/codfw.yaml"
skipping missing values file matching "values-codfw.yaml"
skipping missing values file matching "values-bienvenida-codfw.yaml"
skipping missing values file matching "/etc/helmfile-defaults/private/main_services/miscweb/codfw.yaml"
skipping missing values file matching "values-codfw.yaml"
skipping missing values file matching "values-design-style-guide-codfw.yaml"
skipping missing values file matching "/etc/helmfile-defaults/private/main_services/miscweb/codfw.yaml"
skipping missing values file matching "values-codfw.yaml"
skipping missing values file matching "values-design-blog-codfw.yaml"
skipping missing values file matching "/etc/helmfile-defaults/private/main_services/miscweb/codfw.yaml"
skipping missing values file matching "values-codfw.yaml"
skipping missing values file matching "values-transparencyreport-codfw.yaml"
skipping missing values file matching "/etc/helmfile-defaults/private/main_services/miscweb/codfw.yaml"
skipping missing values file matching "values-codfw.yaml"
skipping missing values file matching "values-security-landing-page-codfw.yaml"
Comparing release=design-landing-page, chart=wmf-stable/miscweb
Comparing release=bugzilla, chart=wmf-stable/miscweb
Comparing release=research-landing-page, chart=wmf-stable/miscweb
Comparing release=design-strategy, chart=wmf-stable/miscweb
Comparing release=annualreport, chart=wmf-stable/miscweb
Comparing release=statictendril, chart=wmf-stable/miscweb
Comparing release=design-blog, chart=wmf-stable/miscweb
Comparing release=wikiworkshop, chart=wmf-stable/miscweb
Comparing release=design-style-guide, chart=wmf-stable/miscweb
Comparing release=transparencyreport, chart=wmf-stable/miscweb
Comparing release=security-landing-page, chart=wmf-stable/miscweb
Comparing release=bienvenida, chart=wmf-stable/miscweb
Comparing release=static-codereview, chart=wmf-stable/miscweb
miscweb, miscweb-design-blog, Deployment (apps) has changed:
...
                 - ALL
              runAsNonRoot: true
              seccompProfile:
                type: RuntimeDefault        
          - name: miscweb-design-blog-tls-proxy
-           image: docker-registry.discovery.wmnet/envoy:1.23.10-2-s4-20231203
+           image: docker-registry.discovery.wmnet/envoy:1.23.10-3
            imagePullPolicy: IfNotPresent
            env:
              - name: SERVICE_NAME
                value: design-blog
              - name: SERVICE_ZONE
...

miscweb, miscweb-design-landing-page, Deployment (apps) has changed:
...
                 - ALL
              runAsNonRoot: true
              seccompProfile:
                type: RuntimeDefault        
          - name: miscweb-design-landing-page-tls-proxy
-           image: docker-registry.discovery.wmnet/envoy:1.23.10-2-s4-20231203
+           image: docker-registry.discovery.wmnet/envoy:1.23.10-3
            imagePullPolicy: IfNotPresent
            env:
              - name: SERVICE_NAME
                value: design-landing-page
              - name: SERVICE_ZONE
...

miscweb, miscweb-static-codereview, Deployment (apps) has changed:
...
                 - ALL
              runAsNonRoot: true
              seccompProfile:
                type: RuntimeDefault        
          - name: miscweb-static-codereview-tls-proxy
-           image: docker-registry.discovery.wmnet/envoy:1.23.10-2-s4-20231203
+           image: docker-registry.discovery.wmnet/envoy:1.23.10-3
            imagePullPolicy: IfNotPresent
            env:
              - name: SERVICE_NAME
                value: static-codereview
              - name: SERVICE_ZONE
...

miscweb, miscweb-security-landing-page, Deployment (apps) has changed:
...
          envoyproxy.io/port: "9361"
      spec:
        containers:        
          # The main application container
          - name: miscweb-security-landing-page
-           image: "docker-registry.discovery.wmnet/repos/sre/miscweb/security-landing-page:2024-06-17-163318"
+           image: "docker-registry.discovery.wmnet/repos/sre/miscweb/security-landing-page:2024-08-16-095955"
            imagePullPolicy: IfNotPresent
            ports:
              - containerPort: 8080
            livenessProbe:
              tcpSocket:
...
                 - ALL
              runAsNonRoot: true
              seccompProfile:
                type: RuntimeDefault        
          - name: miscweb-security-landing-page-tls-proxy
-           image: docker-registry.discovery.wmnet/envoy:1.23.10-2-s4-20231203
+           image: docker-registry.discovery.wmnet/envoy:1.23.10-3
            imagePullPolicy: IfNotPresent
            env:
              - name: SERVICE_NAME
                value: security-landing-page
              - name: SERVICE_ZONE
...

miscweb, miscweb-design-style-guide, Deployment (apps) has changed:
...
                 - ALL
              runAsNonRoot: true
              seccompProfile:
                type: RuntimeDefault        
          - name: miscweb-design-style-guide-tls-proxy
-           image: docker-registry.discovery.wmnet/envoy:1.23.10-2-s4-20231203
+           image: docker-registry.discovery.wmnet/envoy:1.23.10-3
            imagePullPolicy: IfNotPresent
            env:
              - name: SERVICE_NAME
                value: design-style-guide
              - name: SERVICE_ZONE
...

miscweb, miscweb-statictendril, Deployment (apps) has changed:
...
                 - ALL
              runAsNonRoot: true
              seccompProfile:
                type: RuntimeDefault        
          - name: miscweb-statictendril-tls-proxy
-           image: docker-registry.discovery.wmnet/envoy:1.23.10-2-s4-20231203
+           image: docker-registry.discovery.wmnet/envoy:1.23.10-3
            imagePullPolicy: IfNotPresent
            env:
              - name: SERVICE_NAME
                value: statictendril
              - name: SERVICE_ZONE
...

miscweb, miscweb-annualreport, Deployment (apps) has changed:
...
                 - ALL
              runAsNonRoot: true
              seccompProfile:
                type: RuntimeDefault        
          - name: miscweb-annualreport-tls-proxy
-           image: docker-registry.discovery.wmnet/envoy:1.23.10-2-s4-20231203
+           image: docker-registry.discovery.wmnet/envoy:1.23.10-3
            imagePullPolicy: IfNotPresent
            env:
              - name: SERVICE_NAME
                value: annualreport
              - name: SERVICE_ZONE
...

miscweb, miscweb-transparencyreport, Deployment (apps) has changed:
...
                 - ALL
              runAsNonRoot: true
              seccompProfile:
                type: RuntimeDefault        
          - name: miscweb-transparencyreport-tls-proxy
-           image: docker-registry.discovery.wmnet/envoy:1.23.10-2-s4-20231203
+           image: docker-registry.discovery.wmnet/envoy:1.23.10-3
            imagePullPolicy: IfNotPresent
            env:
              - name: SERVICE_NAME
                value: transparencyreport
              - name: SERVICE_ZONE
...

miscweb, miscweb-bugzilla, Deployment (apps) has changed:
...
                 - ALL
              runAsNonRoot: true
              seccompProfile:
                type: RuntimeDefault        
          - name: miscweb-bugzilla-tls-proxy
-           image: docker-registry.discovery.wmnet/envoy:1.23.10-2-s4-20231203
+           image: docker-registry.discovery.wmnet/envoy:1.23.10-3
            imagePullPolicy: IfNotPresent
            env:
              - name: SERVICE_NAME
                value: bugzilla
              - name: SERVICE_ZONE
...

miscweb, miscweb-bienvenida, Deployment (apps) has changed:
...
                 - ALL
              runAsNonRoot: true
              seccompProfile:
                type: RuntimeDefault        
          - name: miscweb-bienvenida-tls-proxy
-           image: docker-registry.discovery.wmnet/envoy:1.23.10-2-s4-20231203
+           image: docker-registry.discovery.wmnet/envoy:1.23.10-3
            imagePullPolicy: IfNotPresent
            env:
              - name: SERVICE_NAME
                value: bienvenida
              - name: SERVICE_ZONE
...

helmfile.yaml: basePath=.
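
When a diff touches this many releases, it can help to reduce it to a list of image changes per container, so that TLS proxy image bumps are easy to tell apart from application image changes. The following is a minimal sketch, assuming the diff layout shown above and assuming that TLS proxy containers carry a `-tls-proxy` name suffix (as they do in this output). The function name, regexes, and `SAMPLE` excerpt are hypothetical helpers for illustration, not part of helmfile.

```python
import re
from collections import defaultdict

# Assumption: these patterns match the helmfile diff layout pasted above.
HEADER_RE = re.compile(r"^[\w-]+, (?P<name>[\w-]+), \w+ \(\w+\) has changed:")
CONTAINER_RE = re.compile(r"^\s*-\s+name:\s+(?P<container>\S+)\s*$")
IMAGE_RE = re.compile(r"^(?P<sign>[+-])\s+image:\s+\"?(?P<image>[^\"\s]+)\"?\s*$")


def summarize_image_changes(diff_text):
    """Map (release, container) -> (old_image, new_image)."""
    changes = defaultdict(dict)
    release = container = None
    for line in diff_text.splitlines():
        if (m := HEADER_RE.match(line)):
            release, container = m["name"], None
        elif (m := CONTAINER_RE.match(line)):
            # Caveat: env entries also look like "- name: ...". This works here
            # because the container name directly precedes its image lines.
            container = m["container"]
        elif (m := IMAGE_RE.match(line)) and release:
            changes[(release, container)][m["sign"]] = m["image"]
    return {key: (v.get("-"), v.get("+")) for key, v in changes.items()}


# Trimmed excerpt of the diff above, used only as sample input.
SAMPLE = """\
miscweb, miscweb-security-landing-page, Deployment (apps) has changed:
...
          - name: miscweb-security-landing-page
-           image: "docker-registry.discovery.wmnet/repos/sre/miscweb/security-landing-page:2024-06-17-163318"
+           image: "docker-registry.discovery.wmnet/repos/sre/miscweb/security-landing-page:2024-08-16-095955"
            imagePullPolicy: IfNotPresent
...
          - name: miscweb-security-landing-page-tls-proxy
-           image: docker-registry.discovery.wmnet/envoy:1.23.10-2-s4-20231203
+           image: docker-registry.discovery.wmnet/envoy:1.23.10-3
            imagePullPolicy: IfNotPresent
            env:
              - name: SERVICE_NAME
"""

for (release, container), (old, new) in summarize_image_changes(SAMPLE).items():
    # Assumption: the "-tls-proxy" suffix identifies the envoy sidecar.
    kind = "tls-proxy" if container and container.endswith("-tls-proxy") else "app"
    print(f"{release} [{kind}] {container}: {old} -> {new}")
```

The full diff output could be piped into such a script in place of `SAMPLE`; the parsing would need adjusting if the diff context or chart templates change.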

Event Timeline

I guess these seem fine per @elukey's email to the ops list from July 8, pointing to T368366: Upgrade K8s docker images running in Wikimedia production on Buster to either Bullseye or Bookworm.

I'm unsure whether deploys happen frequently enough across all our services to ensure that these are picked up in a timely way.

Just FYI: I applied all of the changes to staging, codfw and eqiad, but only deployed the one change I cared about to security-landing-page: https://sal.toolforge.org/log/4GppbJEBFFSCpsJzN8fC

Clement_Goubert subscribed.

Changes to sidecar images are generally fine to deploy. If in doubt, you can ask on IRC in either #wikimedia-operations or #wikimedia-serviceops, and someone should be able to answer. Thanks for deploying all of them :)

@sbassett Hi! Please subscribe to the ops mailing list so you get notified of these changes; we usually post a message there to warn users. In this case it is safe to deploy, since it is just a rebuild to use a new OS (Bookworm) and skip Buster.

> Just FYI: I applied all of the changes to staging, codfw and eqiad, but only deployed the one change I cared about to security-landing-page: https://sal.toolforge.org/log/4GppbJEBFFSCpsJzN8fC

I didn't understand whether anything was left undeployed for security-landing-page; if so, let me know and we can check together. After that I think we can close, lemme know :)

sbassett closed this task as Resolved. Aug 27 2024, 3:31 PM
sbassett claimed this task.
sbassett triaged this task as Low priority.

> @sbassett Hi! Please subscribe to the ops mailing list so you get notified of these changes; we usually post a message there to warn users. In this case it is safe to deploy, since it is just a rebuild to use a new OS (Bookworm) and skip Buster.

Ok. I've been on ops-l for a while; I guess I just missed or didn't fully understand any relevant emails sent there. At least in the context of miscweb.

> I didn't understand whether anything was left undeployed for security-landing-page; if so, let me know and we can check together. After that I think we can close, lemme know :)

The changes I had made to security-landing-page were successfully deployed, yes. While I helm applied all of the changes to all of the other miscweb config yamls (the envoy image changes), I only actively deployed (via helmfile -e codfw/eqiad -i --selector apply) the changes for security-landing-page. Does that make sense?

Anyhow, I think this task can likely be resolved now.