- Ensure PKI intermediates have been created
- Downtime: etcd, master, nodes
- Reimage etcd nodes with bullseye
- Merge hiera changes for 1.23: https://gerrit.wikimedia.org/r/c/operations/puppet/+/877990/2
- Reimage master
- Reimage nodes
- Verify basic k8s stuff working (nodes joining the cluster)
- Merge deployment-charts changes for 1.23: https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/868389
- Deploy admin_ng & istio
- Deploy services (only miscweb so far); see the sketch after this list
- Lift downtimes
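For the service deployment step above, a rough sketch of how a single service (miscweb) would be applied from the deployment server; the path and environment name are assumptions based on the usual deployment-charts layout:
cd /srv/deployment-charts/helmfile.d/services/miscweb
helmfile -e staging-codfw -i apply  # adjust the environment name to whatever the service's helmfile.yaml defines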
Update staging-codfw
sudo cookbook sre.hosts.downtime -r 'Reinitialize staging-codfw with k8s 1.23' -t T326340 -H 24 'A:wikikube-staging-etcd-codfw or A:wikikube-staging-master-codfw or A:wikikube-staging-worker-codfw'
sudo cumin 'A:wikikube-staging-etcd-codfw or A:wikikube-staging-master-codfw or A:wikikube-staging-worker-codfw' "disable-puppet 'Reinitialize staging-codfw with k8s 1.23 - T326340 - ${USER}'"
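For later, the counterpart command once the work is done (assuming the standard enable-puppet helper, which expects the same message that was used to disable):
sudo cumin 'A:wikikube-staging-etcd-codfw or A:wikikube-staging-master-codfw or A:wikikube-staging-worker-codfw' "enable-puppet 'Reinitialize staging-codfw with k8s 1.23 - T326340 - ${USER}'"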
Reimage etcd hosts
Change dhcp pxe config to bullseye: https://gerrit.wikimedia.org/r/c/operations/puppet/+/878047
sudo cookbook -c spicerack_config.yaml sre.ganeti.reimage --no-downtime --os bullseye kubestagetcd2001
sudo cookbook -c spicerack_config.yaml sre.ganeti.reimage --no-downtime --os bullseye kubestagetcd2002
sudo cookbook -c spicerack_config.yaml sre.ganeti.reimage --no-downtime --os bullseye kubestagetcd2003
Note: etcd needed a manual restart (on kubestagetcd2003 at least) to pick up the new certificates.
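If that happens again, a minimal sketch for restarting etcd on the affected host so it picks up the renewed certificates (assuming the unit is simply named etcd):
sudo systemctl restart etcd
sudo systemctl status etcd  # then re-run the health checks below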
Check that etcd v2 and v3 are healthy:
etcdctl -C https://$(hostname -f):2379 cluster-health
ETCDCTL_API=3 etcdctl --endpoints https://$(hostname -f):2379 member list
Reimage master & nodes
https://gerrit.wikimedia.org/r/877990
sudo cookbook -c spicerack_config.yaml sre.ganeti.reimage --no-downtime --os bullseye kubestagemaster2001
sudo cookbook sre.hosts.reimage --os bullseye --no-downtime kubestage2001
sudo cookbook sre.hosts.reimage --os bullseye --no-downtime kubestage2002
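A rough way to confirm the reimaged master and nodes actually rejoined the cluster (with admin credentials, e.g. via kube_env admin staging-codfw as used below):
kubectl get nodes -o wide  # all hosts should show up; they will only go Ready once calico is deployed below
kubectl get pods -A --field-selector spec.nodeName=kubestage2001.codfw.wmnet  # spot-check that workloads get scheduled again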
In-cluster components
https://gerrit.wikimedia.org/r/868389
https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/878190
https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/878184 (no longer required; included in the new coredns chart version)
# Label master(s)
kube_env admin staging-codfw
kubectl label nodes kubestagemaster2001.codfw.wmnet node-role.kubernetes.io/master=""
helmfile -e staging-codfw -l name=rbac-rules -i apply
helmfile -e staging-codfw -l name=pod-security-policies -i apply
helmfile -e staging-codfw -l name=namespaces -i apply
helmfile -e staging-codfw -l name=calico-crds -i apply
helmfile -e staging-codfw -l name=calico -i apply
kubectl -n kube-system delete svc calico-typha # it had blocked the IP reserved for CoreDNS
helmfile -e staging-codfw -l name=coredns -i apply
helmfile -e staging-codfw -l name=calico -i apply # to get the calico-typha service back; coredns should probably go before calico
helmfile -e staging-codfw -l name=istio-gateways-networkpolicies -i apply
istioctl-1.15.3 manifest apply -f /srv/deployment-charts/custom_deploy.d/istio/main/config_k8s_1.23.yaml
helmfile -e staging-codfw -l name=eventrouter -i apply
helmfile -e staging-codfw -l name=cert-manager-networkpolicies -i apply
helmfile -e staging-codfw -l name=cert-manager -i apply
helmfile -e staging-codfw -l name=cfssl-issuer-crds -i apply
helmfile -e staging-codfw -l name=cfssl-issuer -i apply
helmfile -e staging-codfw -i apply
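A rough post-deploy sanity check (only assumes the istioctl binary name already used above):
kubectl get pods --all-namespaces | grep -viE 'running|completed'  # anything listed here (besides the header) needs a look
kubectl -n kube-system get svc  # calico-typha and the CoreDNS service should both be present with their reserved IPs
istioctl-1.15.3 proxy-status  # the ingress gateway proxies should show SYNCED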
Todos:
- Jan 10 18:33:42 kubestagemaster2001 kube-apiserver[1833]: E0110 18:33:42.162058 1833 fieldmanager.go:211] "[SHOULD NOT HAPPEN] failed to update managedFields" VersionKind="/, Kind=" namespace="" name="kubestage2001.codfw.wmnet"
- https://wikitech.wikimedia.org/wiki/Kubernetes/Clusters/New#Label_Kubernetes_Masters
- New default namespaces apart from kube-system (kube-node-lease, kube-public, default): do they need to be protected in admin_ng? kube- prefixed namespaces are already protected.
- Istio: "! values.global.jwtPolicy is deprecated; use Values.global.jwtPolicy=third-party-jwt instead. See http://istio.io/latest/docs/ops/best-practices/security/#configure-third-party-service-account-tokens for more information"
- coredns: spec.template.spec.nodeSelector[beta.kubernetes.io/os]: deprecated since v1.14; use "kubernetes.io/os" instead: https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/878935
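To see which OS labels the nodes actually carry while that patch rolls out (on 1.23 the deprecated beta label should still be applied alongside the new one), something like:
kubectl get nodes -L kubernetes.io/os -L beta.kubernetes.io/os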
Alerts that were still firing
- [10.01.23 19:06] <icinga-wm> PROBLEM - PyBal backends health check on lvs2009 is CRITICAL: PYBAL CRITICAL - CRITICAL - k8s-ingress-staging_30443: Servers kubestage2002.codfw.wmnet are marked down but pooled https://wikitech.wikimedia.org/wiki/PyBal
- [10.01.23 19:07] <icinga-wm> PROBLEM - PyBal backends health check on lvs2010 is CRITICAL: PYBAL CRITICAL - CRITICAL - k8s-ingress-staging_30443: Servers kubestage2002.codfw.wmnet are marked down but pooled https://wikitech.wikimedia.org/wiki/PyBal
- [10.01.23 19:07] <jinxer-wm> (KubernetesCalicoDown) firing: kubestage2001.codfw.wmnet:9091 is not running calico-node Pod - https://wikitech.wikimedia.org/wiki/Calico#Operations - https://grafana.wikimedia.org/d/G8zPL7-Wz/?var-dc=codfw%20prometheus%2Fk8s-staging&var-instance=kubestage2001.codfw.wmnet - https://alerts.wikimedia.org/?q=alertname%3DKubernetesCalicoDown
- Various JobUnavailable alerts; I ended up creating a silence with the matchers source="prometheus", prometheus="k8s-staging", site="codfw" (see the amtool sketch below).
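For reference, roughly how such a silence can be created from the CLI with amtool; the alertmanager URL, duration, and comment here are assumptions:
amtool --alertmanager.url=http://localhost:9093 silence add 'source="prometheus"' 'prometheus="k8s-staging"' 'site="codfw"' --comment='Reinitialize staging-codfw with k8s 1.23 - T326340' --duration=24h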