Migrate Docker images running in Production away from Bullseye
Open, Medium, Public

Description

Bullseye EOL: August 2026

A lot of common images have already been migrated to Bookworm as part of the k8s 1.31 upgrade. The following common pre-1.31 images are still based on Bullseye:

  • docker-registry.discovery.wmnet/cert-manager/cainjector:1.10.1-2
  • docker-registry.discovery.wmnet/cert-manager/controller:1.10.1-2
  • docker-registry.discovery.wmnet/cert-manager/webhook:1.10.1-2
  • docker-registry.discovery.wmnet/istio/pilot:1.15.7-2
  • docker-registry.discovery.wmnet/istio/proxyv2:1.15.7-2

The above are running in all clusters except Wikikube, which has already migrated. The MediaWiki Docker images are also out of scope for this task, since they need more work on the Service Ops side.
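One possible way to confirm which Debian release an image is built on is to look at its `/etc/os-release` file. The sketch below is only an illustration and is not taken from this task: the helper names `debian_codename` and `is_bullseye` are hypothetical, and it assumes the file contents have already been extracted from the image by some means (for example by running `cat /etc/os-release` inside a container, which will not work for images that ship without a shell or coreutils).

```python
from typing import Optional


def debian_codename(os_release_text: str) -> Optional[str]:
    """Return VERSION_CODENAME from the contents of /etc/os-release, if present."""
    for line in os_release_text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key == "VERSION_CODENAME":
            # os-release values may be wrapped in single or double quotes.
            return value.strip().strip("\"'")
    return None


def is_bullseye(os_release_text: str) -> bool:
    """True if the os-release contents identify the base as Debian Bullseye."""
    return debian_codename(os_release_text) == "bullseye"
```

An image whose file contains `VERSION_CODENAME=bullseye` would be a candidate for this task, while `bookworm` or `trixie` would not.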

Wikikube

  • docker-registry.discovery.wmnet/haproxy:2.4.18-2-20240630
  • docker-registry.discovery.wmnet/wikimedia/generated-data-platform-aqs-device-analytics:2024-06-05-094107-production
  • docker-registry.discovery.wmnet/repos/mediawiki/services/kask:v1.0.12
  • docker-registry.discovery.wmnet/wikimedia/generated-data-platform-aqs-editor-analytics:2024-06-10-191953-production
  • docker-registry.discovery.wmnet/flink-kubernetes-operator:1.12.1-wmf0
  • docker-registry.discovery.wmnet/wikimedia/generated-data-platform-aqs-geo-analytics:2024-06-05-110455-production
  • docker-registry.discovery.wmnet/wikimedia/generated-data-platform-aqs-media-analytics:2024-06-10-192150-production
  • docker-registry.discovery.wmnet/rsyslog:8.2102.0-3
  • docker-registry.discovery.wmnet/repos/data-engineering/mediawiki-event-enrichment:v1.43.0
  • docker-registry.discovery.wmnet/otelcol:0.102.0-1
  • docker-registry.discovery.wmnet/wikimedia/generated-data-platform-aqs-page-analytics:2024-06-10-192500-production
  • docker-registry.discovery.wmnet/repos/search-platform/flink-rdf-streaming-updater:flink-1.17.1-rdf-0.3.154
  • docker-registry.discovery.wmnet/wikimedia/operations-software-tegola:2024-09-13-081439-publish
  • docker-registry.discovery.wmnet/wikimedia/wikimedia-toolhub:2025-02-19-214003-production

Aux

Nothing except the common pre-1.31 images above.

DSE

  • docker-registry.discovery.wmnet/repos/data-engineering/spark/kyuubi:1.10.2-2025-12-23-173037-ad4447d949a8fbb2e5283bfbe02e62483922c10d - analytics-test ns
  • docker-registry.discovery.wmnet/repos/data-engineering/spark:3.5.7-2025-12-23-173037-ad4447d949a8fbb2e5283bfbe02e62483922c10d - analytics-test ns
  • docker-registry.discovery.wmnet/flink-kubernetes-operator:1.12.1-wmf0 - flink-operator ns
  • docker-registry.discovery.wmnet/repos/data-engineering/mediawiki-event-enrichment:v1.43.0 - mw-content-history-reconcile-enrich
  • docker-registry.discovery.wmnet/repos/data-engineering/mediawiki-event-enrichment:v1.43.0 - mw-content-history-reconcile-enrich-next
  • docker-registry.discovery.wmnet/repos/data-engineering/spark/spark3.4-history@sha256:12710ef51b29dca966dc9a5e6405a28076ca609b4e13498f24816e2d6e8bb38d - spark-history ns
  • docker-registry.discovery.wmnet/repos/data-engineering/spark/spark3.4-history@sha256:12710ef51b29dca966dc9a5e6405a28076ca609b4e13498f24816e2d6e8bb38d - spark-history-test
  • docker-registry.discovery.wmnet/repos/data-engineering/superset/superset-backend@sha256:bd672d5aa86194d24eebc6836f3805f19f777160633842ee3410e59959b9ef8e - superset ns
  • docker-registry.discovery.wmnet/repos/data-engineering/superset/superset-backend@sha256:bd672d5aa86194d24eebc6836f3805f19f777160633842ee3410e59959b9ef8e - superset-next ns

ML

Most of the KServe pods use docker-registry.discovery.wmnet/istio/proxyv2:1.15.7-2 as a sidecar, which should go away with the Kubernetes 1.31 migration.
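The per-cluster lists above could be regenerated with something along these lines. This is a minimal sketch, not the method actually used for this task: it assumes the input is the JSON printed by `kubectl get pods --all-namespaces -o json`, and the function name `images_by_namespace` is hypothetical.

```python
import json
from collections import defaultdict


def images_by_namespace(pod_list_json: str) -> dict[str, set[str]]:
    """Map each namespace to the set of container images its pods reference."""
    pod_list = json.loads(pod_list_json)
    result: dict[str, set[str]] = defaultdict(set)
    for pod in pod_list.get("items", []):
        namespace = pod["metadata"]["namespace"]
        spec = pod.get("spec", {})
        # Init containers pull images too, so include them alongside
        # the regular containers (which also covers injected sidecars).
        containers = (spec.get("containers") or []) + (spec.get("initContainers") or [])
        for container in containers:
            result[namespace].add(container["image"])
    return dict(result)
```

Printing each image followed by its namespace gives lines in the same shape as the DSE list. The output still has to be cross-checked against the base release of each image, since the pod spec alone does not say whether an image is Bullseye-based.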

Event Timeline

Change #1236673 had a related patch set uploaded (by Elukey; author: Elukey):

[operations/deployment-charts@master] services: upgrade thumbor's haproxy container to Bookworm and 2.8.3

https://gerrit.wikimedia.org/r/1236673

Change #1236673 merged by Elukey:

[operations/deployment-charts@master] services: upgrade thumbor's haproxy container to Bookworm and 2.8.3

https://gerrit.wikimedia.org/r/1236673

@CDanis Hi! I saw your name for otelcol and this is why I am reaching out :) IIUC it is a golang binary so it should be ok to create a component for trixie-wikimedia and copy over the otelcol-contrib package right? Or should we follow another process? I know there is a bookworm variant but I'd love to target stable if possible. Lemme know!

Exactly right. Targeting trixie sounds great, thanks!

The cert-manager 1.10.x images are probably okay to ignore: the remaining clusters will have to upgrade to k8s >= 1.31 this year anyway, and that requires upgrading cert-manager to 1.16.x, which uses a Bookworm base.

elukey triaged this task as Medium priority. Mon, Feb 9, 3:41 PM