
EPIC: Build dse-k8s-codfw Kubernetes cluster
Closed, ResolvedPublic

Description

DPE SRE is providing a new OpenSearch service, "Mutualized OpenSearch," which will run on Kubernetes (more details in parent ticket).

The service needs to be available in both primary datacenters, but we don't have a DSE k8s cluster in CODFW yet.

Creating this ticket as an epic/parent ticket for all tasks needed to bring the dse-k8s-codfw cluster online.

Event Timeline

BTullis triaged this task as High priority. Jul 4 2025, 4:43 PM
BTullis moved this task from Epics to Quarterly Goals on the Data-Platform-SRE board.

Change #1184043 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/puppet@production] Upgrade the dse-k8s-codfw cluster to version 1.31

https://gerrit.wikimedia.org/r/1184043

Change #1184043 merged by Btullis:

[operations/puppet@production] Upgrade the dse-k8s-codfw cluster to version 1.31

https://gerrit.wikimedia.org/r/1184043

BTullis mentioned this in Unknown Object (Task). Sep 23 2025, 10:49 AM

The dse-k8s-codfw cluster is up and running, and the components can be verified as below.
A simple PVC definition as a raw block device:

root@deploy2002:~# kube_env admin dse-k8s-codfw
root@deploy2002:~# kubectl create namespace stevemunene-pvc-tests
namespace/stevemunene-pvc-tests created
root@deploy2002:~# kubectl get namespaces
NAME                    STATUS   AGE
analytics-test          Active   47h
cert-manager            Active   10d
default                 Active   10d
echoserver              Active   10d
external-services       Active   10d
istio-system            Active   10d
kube-node-lease         Active   10d
kube-public             Active   10d
kube-system             Active   10d
opensearch-ipoid        Active   10d
opensearch-ipoid-test   Active   10d
opensearch-operator     Active   10d
opensearch-test         Active   10d
sidecar-controller      Active   10d
stevemunene-pvc-tests   Active   13s


root@deploy2002:~# cat /home/stevemunene/raw-block-pvc.yaml 
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: raw-block-pvc
  namespace: stevemunene-pvc-tests
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Block
  resources:
    requests:
      storage: 1Gi
  storageClassName: ceph-rbd-ssd

Create a very simple pod which attempts to bind this PVC:

---
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-raw-block-volume
  namespace: stevemunene-pvc-tests
spec:
  containers:
    - name: do-nothing
      image: docker-registry.discovery.wmnet/bookworm:20240630
      command: ["/bin/sh", "-c"]
      args: ["tail -f /dev/null"]
      volumeDevices:
        - name: data
          devicePath: /dev/xvda
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: raw-block-pvc

Apply the manifests and watch the events:

root@deploy2002:~# kubectl apply -f /home/stevemunene/raw-block-pod.yaml
pod/pod-with-raw-block-volume created
root@deploy2002:~# kubectl -n stevemunene-pvc-tests get events -w
LAST SEEN   TYPE      REASON                 OBJECT                                MESSAGE
4s          Warning   FailedScheduling       pod/pod-with-raw-block-volume         0/4 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/4 nodes are available: 4 Preemption is not helpful for scheduling.
3m25s       Normal    ExternalProvisioning   persistentvolumeclaim/raw-block-pvc   Waiting for a volume to be created either by the external provisioner 'rbd.csi.ceph.com' or manually by the system administrator. If volume creation is delayed, please verify that the provisioner is running and correctly registered.
0s          Normal    ExternalProvisioning   persistentvolumeclaim/raw-block-pvc   Waiting for a volume to be created either by the external provisioner 'rbd.csi.ceph.com' or manually by the system administrator. If volume creation is delayed, please verify that the provisioner is running and correctly registered.

The pod has been Pending for a while now. To investigate, we still need to:

  • check if there are improvements in the Ceph plugin to see if it is worth upgrading the Ceph plugin or something else (kernel?), by reading the release notes
  • if needed, do the appropriate upgrades (create a subtask for it)
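While investigating, it is handy to spot any claims stuck in Pending across the tenant namespaces. A small filter over `kubectl get pvc` output can do this (a sketch: `pending_pvcs` is a hypothetical helper, and kubectl's default column layout with NAME in column 1 and STATUS in column 2 is assumed):

```shell
# Sketch: filter "kubectl get pvc" output down to claims stuck in Pending.
# Assumes kubectl's default table layout (NAME in $1, STATUS in $2).
pending_pvcs() {
  awk 'NR > 1 && $2 == "Pending" { print $1 }'
}

# Usage against a live cluster:
#   kubectl -n stevemunene-pvc-tests get pvc | pending_pvcs
```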

Going through the ceph-csi release notes for any change that might have impacted us in the 1.31 upgrade.

For now, I have cleared the current deployment, as it had been stuck in a waiting state for a while:

root@deploy2002:~# kubectl -n stevemunene-pvc-tests get events -w
LAST SEEN   TYPE      REASON                 OBJECT                                MESSAGE
6m30s       Warning   FailedScheduling       pod/pod-with-raw-block-volume         0/4 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/4 nodes are available: 4 Preemption is not helpful for scheduling.
2m32s       Normal    ExternalProvisioning   persistentvolumeclaim/raw-block-pvc   Waiting for a volume to be created either by the external provisioner 'rbd.csi.ceph.com' or manually by the system administrator. If volume creation is delayed, please verify that the provisioner is running and correctly registered.
0s          Normal    ExternalProvisioning   persistentvolumeclaim/raw-block-pvc   Waiting for a volume to be created either by the external provisioner 'rbd.csi.ceph.com' or manually by the system administrator. If volume creation is delayed, please verify that the provisioner is running and correctly registered.
0s          Normal    ExternalProvisioning   persistentvolumeclaim/raw-block-pvc   Waiting for a volume to be created either by the external provisioner 'rbd.csi.ceph.com' or manually by the system administrator. If volume creation is delayed, please verify that the provisioner is running and correctly registered.
0s          Normal    ExternalProvisioning   persistentvolumeclaim/raw-block-pvc   Waiting for a volume to be created either by the external provisioner 'rbd.csi.ceph.com' or manually by the system administrator. If volume creation is delayed, please verify that the provisioner is running and correctly registered.

Delete the PVC and pod:

root@deploy2002:~# kubectl delete -f /home/stevemunene/raw-block-pvc.yaml 
persistentvolumeclaim "raw-block-pvc" deleted
root@deploy2002:~# kubectl delete  -f /home/stevemunene/raw-block-pod.yaml
pod "pod-with-raw-block-volume" deleted
root@deploy2002:~#

Change #1193190 had a related patch set uploaded (by Stevemunene; author: Stevemunene):

[operations/deployment-charts@master] Add test namespace to ceph tenantNamepsaces dse-k8s-codfw

https://gerrit.wikimedia.org/r/1193190

Change #1193190 merged by jenkins-bot:

[operations/deployment-charts@master] Add test namespace to ceph tenantNamepsaces dse-k8s-codfw

https://gerrit.wikimedia.org/r/1193190

Change #1193369 had a related patch set uploaded (by Stevemunene; author: Stevemunene):

[operations/deployment-charts@master] Fix typo in cephfs values namespaces

https://gerrit.wikimedia.org/r/1193369

Change #1193369 merged by jenkins-bot:

[operations/deployment-charts@master] Fix typo in cephfs values namespaces

https://gerrit.wikimedia.org/r/1193369
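The namespace allow-listing change referred to below might look roughly like this in the ceph-csi values for dse-k8s-codfw (a hedged sketch only; the exact key names and file layout in operations/deployment-charts are assumptions, so see the merged changes above for the real diff):

```yaml
# Hypothetical fragment of the dse-k8s-codfw ceph-csi values
# (key names assumed; not the actual committed diff):
cephcsi:
  tenantNamespaces:
    - opensearch-test
    - stevemunene-pvc-tests  # newly allow-listed test namespace
```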

The initial error was caused by the namespace not being defined in the list of available tenantNamespaces. Once it was added, we recreated the PVC in the namespace, and we are now seeing a different error message:

root@deploy2002:~# kubectl apply -f /home/stevemunene/raw-block-pvc.yaml 
persistentvolumeclaim/raw-block-pvc created
root@deploy2002:~# kubectl apply -f /home/stevemunene/raw-block-pod.yaml
pod/pod-with-raw-block-volume created
root@deploy2002:~# kubectl -n stevemunene-pvc-tests get events -w
LAST SEEN   TYPE      REASON                 OBJECT                                MESSAGE
9s          Warning   FailedScheduling       pod/pod-with-raw-block-volume         0/4 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/4 nodes are available: 4 Preemption is not helpful for scheduling.
11s         Normal    Provisioning           persistentvolumeclaim/raw-block-pvc   External provisioner is provisioning volume for claim "stevemunene-pvc-tests/raw-block-pvc"
14s         Normal    ExternalProvisioning   persistentvolumeclaim/raw-block-pvc   Waiting for a volume to be created either by the external provisioner 'rbd.csi.ceph.com' or manually by the system administrator. If volume creation is delayed, please verify that the provisioner is running and correctly registered.

Change #1193406 had a related patch set uploaded (by Brouberol; author: Brouberol):

[operations/alerts@master] Redirect helmfile_admin_ng_pending_changes alerts for both dse-k8s-eqiad/codfw to dpe sre

https://gerrit.wikimedia.org/r/1193406

Change #1193406 merged by Brouberol:

[operations/alerts@master] Redirect helmfile_admin_ng_pending_changes alerts for both dse-k8s-eqiad/codfw to dpe sre

https://gerrit.wikimedia.org/r/1193406

Still to do:

  • check if there are improvements in the Ceph plugin to see if it is worth upgrading the Ceph plugin or something else (kernel?), by reading the release notes
  • if needed, do the appropriate upgrades (create a subtask for it)

I have created: T407166: Upgrade the ceph-csi-plugin to the latest release compatible with kubernetes version 1.31

I think that we can wait until we have planned out the upgrade of dse-k8s-eqiad to Kubernetes version 1.31 before looking again at the version of the ceph-csi plugin.
What we have shown in T404576 is that version 3.7.2 of the ceph-csi plugin works with Kubernetes version 1.31, which is great!

Hopefully over the next couple of weeks we will get more confident in this version match and that will mean that we don't have to upgrade both k8s and the ceph-csi plugin together on dse-k8s-eqiad.
It would be ideal if we could upgrade k8s first, then work on the csi plugin afterwards.

Being bold and closing this epic.

Change #1216832 had a related patch set uploaded (by Brouberol; author: Brouberol):

[operations/deployment-charts@master] dse-k8s-codfw: enable pod-to-pod traffic cluster-wide

https://gerrit.wikimedia.org/r/1216832

Change #1216832 merged by Bking:

[operations/deployment-charts@master] dse-k8s-codfw: enable pod-to-pod traffic cluster-wide

https://gerrit.wikimedia.org/r/1216832

Change #1238441 had a related patch set uploaded (by Bking; author: Bking):

[operations/dns@master] dse-k8s: Enable active/active for dse-k8s clusters

https://gerrit.wikimedia.org/r/1238441

Change #1238832 had a related patch set uploaded (by Bking; author: Bking):

[operations/puppet@production] dse-k8s-ingress: Enable active-active

https://gerrit.wikimedia.org/r/1238832