
Increase vcpus on K8s control plane VMs
Closed, Resolved (Public)

Description

In the past we have been paged when kube-api was restarted on control plane nodes for TLS cert renewal. Ben wrote an excellent analysis in T389720. We should make the VM specs consistent across the clusters.

  • ml-serve-ctrl - nproc: 2 => 4, memory: 4GB => 6GB
  • dse-k8s-ctrl
  • aux-k8s-ctrl - nproc: 1 => 4, memory: 4GB => 6GB
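The resize for each VM can be sketched as a dry-run loop that only prints the `gnt-instance modify` command, using the 4 vcpus / 6GB target from the list above. The helper name `build_modify_cmd` and the two-host list are illustrative assumptions, not an existing script; the printed command is meant to be reviewed and then run by hand on a Ganeti node of the matching cluster.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) the Ganeti resize command per VM.
# MEMORY and VCPUS mirror the target spec in this task.
# build_modify_cmd is a hypothetical helper introduced for illustration.
MEMORY="6g"
VCPUS="4"

build_modify_cmd() {
    # $1: fully qualified instance name
    printf 'sudo gnt-instance modify -B memory=%s,vcpus=%s %s\n' \
        "$MEMORY" "$VCPUS" "$1"
}

# Example host list (assumption: adjust to the cluster being resized).
for vm in aux-k8s-ctrl2002.codfw.wmnet aux-k8s-ctrl2003.codfw.wmnet; do
    build_modify_cmd "$vm"
done
```

The new backend parameters only take effect once the VM has been rebooted, so each control plane VM needs a reboot after the modify step.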

Event Timeline

Mentioned in SAL (#wikimedia-operations) [2025-04-19T16:30:17Z] <elukey> sudo gnt-instance modify -B memory=6g,vcpus=4 aux-k8s-ctrl2003.codfw.wmnet - T392289

Mentioned in SAL (#wikimedia-operations) [2025-04-19T16:30:23Z] <elukey> sudo gnt-instance modify -B memory=6g,vcpus=4 aux-k8s-ctrl2002.codfw.wmnet - T392289

VM aux-k8s-ctrl2002.codfw.wmnet rebooted by elukey@cumin1002 with reason: Increase vcores and memory

VM aux-k8s-ctrl2003.codfw.wmnet rebooted by elukey@cumin1002 with reason: Increase vcores and memory

Mentioned in SAL (#wikimedia-operations) [2025-04-19T16:38:09Z] <elukey> sudo gnt-instance modify -B memory=6g,vcpus=4 aux-k8s-ctrl1003.eqiad.wmnet - T392289

Mentioned in SAL (#wikimedia-operations) [2025-04-19T16:38:15Z] <elukey> sudo gnt-instance modify -B memory=6g,vcpus=4 aux-k8s-ctrl1002.eqiad.wmnet - T392289

VM aux-k8s-ctrl1003.eqiad.wmnet rebooted by elukey@cumin1002 with reason: Increase vcores and memory

VM aux-k8s-ctrl1002.eqiad.wmnet rebooted by elukey@cumin1002 with reason: Increase vcores and memory

Mentioned in SAL (#wikimedia-operations) [2025-04-23T07:26:57Z] <elukey> elukey@ganeti2032:~$ sudo gnt-instance modify -B memory=6g,vcpus=4 ml-serve-ctrl2001.codfw.wmnet - T392289

Mentioned in SAL (#wikimedia-operations) [2025-04-23T07:27:02Z] <elukey> elukey@ganeti2032:~$ sudo gnt-instance modify -B memory=6g,vcpus=4 ml-serve-ctrl2002.codfw.wmnet - T392289

Mentioned in SAL (#wikimedia-operations) [2025-04-23T07:27:44Z] <elukey> elukey@ganeti1048:~$ sudo gnt-instance modify -B memory=6g,vcpus=4 ml-serve-ctrl1002.eqiad.wmnet - T392289

Mentioned in SAL (#wikimedia-operations) [2025-04-23T07:27:47Z] <elukey> elukey@ganeti1048:~$ sudo gnt-instance modify -B memory=6g,vcpus=4 ml-serve-ctrl1001.eqiad.wmnet - T392289

Mentioned in SAL (#wikimedia-operations) [2025-04-23T07:28:02Z] <elukey> reboot ml-serve-ctrl* VMs to pick up new cpu/memory settings - T392289

VM ml-serve-ctrl2002.codfw.wmnet rebooted by elukey@cumin1002 with reason: Increase vcores and memory

VM ml-serve-ctrl2001.codfw.wmnet rebooted by elukey@cumin1002 with reason: Increase vcores and memory

VM ml-serve-ctrl1001.eqiad.wmnet rebooted by elukey@cumin1002 with reason: Increase vcores and memory

VM ml-serve-ctrl1002.eqiad.wmnet rebooted by elukey@cumin1002 with reason: Increase vcores and memory

elukey claimed this task.
elukey updated the task description.

VM ml-staging-ctrl2002.codfw.wmnet rebooted by elukey@cumin1002 with reason: Increase vcores and memory

Mentioned in SAL (#wikimedia-operations) [2025-04-28T09:54:03Z] <elukey> increase vcores and memory available for ml-staging-ctrl2* - T392289#10771944

VM ml-staging-ctrl2001.codfw.wmnet rebooted by elukey@cumin1002 with reason: Increase vcores and memory