
Integrate the (8) existing dse-k8s worker nodes
Closed, Resolved | Public | 5 Estimated Story Points

Event Timeline

BTullis renamed this task from Integrate the (4) existing dse-k8s worker nodes to Integrate the (8) existing dse-k8s worker nodes. Jul 14 2022, 11:18 AM

Change 826240 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/puppet@production] Enable the dse-k8s-worker nodes

https://gerrit.wikimedia.org/r/826240

Change 826241 had a related patch set uploaded (by Btullis; author: Btullis):

[labs/private@master] Add dummy tokens for dse_k8s_workers

https://gerrit.wikimedia.org/r/826241

Change 826241 merged by Btullis:

[labs/private@master] Add dummy tokens for dse_k8s_workers

https://gerrit.wikimedia.org/r/826241

Change 826240 merged by Btullis:

[operations/puppet@production] Enable the dse-k8s-worker nodes

https://gerrit.wikimedia.org/r/826240

Nodes 1-4 have been integrated, but for some reason nodes 5-8 have not yet registered with the controller.

root@deploy1002:/srv/deployment-charts/helmfile.d/admin_ng# kubectl get nodes
NAME                             STATUS   ROLES    AGE     VERSION
dse-k8s-ctrl1001.eqiad.wmnet     Ready    <none>   4m16s   v1.16.15
dse-k8s-ctrl1002.eqiad.wmnet     Ready    <none>   4m12s   v1.16.15
dse-k8s-worker1001.eqiad.wmnet   Ready    <none>   4m13s   v1.16.15
dse-k8s-worker1002.eqiad.wmnet   Ready    <none>   4m13s   v1.16.15
dse-k8s-worker1003.eqiad.wmnet   Ready    <none>   4m17s   v1.16.15
dse-k8s-worker1004.eqiad.wmnet   Ready    <none>   4m14s   v1.16.15
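
One way to spot which workers have not yet registered is to compare the expected host names against the `kubectl get nodes` output. This is only a minimal sketch, not something that was run on this task: the function name `missing_workers` is hypothetical, and the expected names assume the dse-k8s-worker1001 to dse-k8s-worker1008 naming seen here.

```shell
# Hypothetical helper: reads `kubectl get nodes --no-headers` output on stdin
# and prints every expected worker whose name does not appear in it.
missing_workers() {
  # First column of each line is the node name.
  registered="$(awk '{print $1}')"
  for n in 1 2 3 4 5 6 7 8; do
    host="dse-k8s-worker100${n}.eqiad.wmnet"
    # -x matches the whole line, -F treats the host name as a fixed string.
    printf '%s\n' "$registered" | grep -qxF "$host" || printf '%s\n' "$host"
  done
}

# Usage (on the deployment host):
#   kubectl get nodes --no-headers | missing_workers
```

Fed the listing above, this would be expected to print the four workers 1005 to 1008, since only 1001 to 1004 appear in it.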

I have begun labelling the worker nodes.

Starting with labels for the region:

root@deploy1002:/srv/deployment-charts/helmfile.d/admin_ng# kubectl label nodes dse-k8s-worker1001.eqiad.wmnet failure-domain.beta.kubernetes.io/region=eqiad
node/dse-k8s-worker1001.eqiad.wmnet labeled
root@deploy1002:/srv/deployment-charts/helmfile.d/admin_ng# kubectl label nodes dse-k8s-worker1002.eqiad.wmnet failure-domain.beta.kubernetes.io/region=eqiad
node/dse-k8s-worker1002.eqiad.wmnet labeled
root@deploy1002:/srv/deployment-charts/helmfile.d/admin_ng# kubectl label nodes dse-k8s-worker1003.eqiad.wmnet failure-domain.beta.kubernetes.io/region=eqiad
node/dse-k8s-worker1003.eqiad.wmnet labeled
root@deploy1002:/srv/deployment-charts/helmfile.d/admin_ng# kubectl label nodes dse-k8s-worker1004.eqiad.wmnet failure-domain.beta.kubernetes.io/region=eqiad
node/dse-k8s-worker1004.eqiad.wmnet labeled
root@deploy1002:/srv/deployment-charts/helmfile.d/admin_ng# kubectl label nodes dse-k8s-worker1005.eqiad.wmnet failure-domain.beta.kubernetes.io/region=eqiad
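
The repeated region commands above could also be written as a loop. The following is a sketch under assumptions rather than what was actually run: the `label_region` function and the `KUBECTL` variable are hypothetical names, and the script defaults to a dry run that only prints the commands.

```shell
# Dry run by default: KUBECTL expands to "echo kubectl", so the commands are
# printed instead of executed. Set KUBECTL=kubectl to apply them for real.
KUBECTL="${KUBECTL:-echo kubectl}"

# Hypothetical helper: label_region REGION NODE...
label_region() {
  region="$1"; shift
  for node in "$@"; do
    # $KUBECTL is intentionally unquoted so "echo kubectl" splits into two words.
    $KUBECTL label nodes "$node" "failure-domain.beta.kubernetes.io/region=${region}"
  done
}

# The four workers that had registered at this point.
label_region eqiad \
  dse-k8s-worker1001.eqiad.wmnet \
  dse-k8s-worker1002.eqiad.wmnet \
  dse-k8s-worker1003.eqiad.wmnet \
  dse-k8s-worker1004.eqiad.wmnet
```

Changing a label that already carries a different value additionally needs `kubectl label --overwrite`.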

Followed by labels for the rows:

root@deploy1002:/srv/deployment-charts/helmfile.d/admin_ng# kubectl label nodes dse-k8s-worker1001.eqiad.wmnet failure-domain.beta.kubernetes.io/zone=row-a
node/dse-k8s-worker1001.eqiad.wmnet labeled
root@deploy1002:/srv/deployment-charts/helmfile.d/admin_ng# kubectl label nodes dse-k8s-worker1002.eqiad.wmnet failure-domain.beta.kubernetes.io/zone=row-b
node/dse-k8s-worker1002.eqiad.wmnet labeled
root@deploy1002:/srv/deployment-charts/helmfile.d/admin_ng# kubectl label nodes dse-k8s-worker1003.eqiad.wmnet failure-domain.beta.kubernetes.io/zone=row-c
node/dse-k8s-worker1003.eqiad.wmnet labeled
root@deploy1002:/srv/deployment-charts/helmfile.d/admin_ng# kubectl label nodes dse-k8s-worker1004.eqiad.wmnet failure-domain.beta.kubernetes.io/zone=row-d
node/dse-k8s-worker1004.eqiad.wmnet labeled

Change 828052 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/puppet@production] Label the first four of the dse-k8s-worker nodes

https://gerrit.wikimedia.org/r/828052

The remaining four nodes just needed a reboot before the kubelet service started.
I have carried on with the labelling.

root@deploy1002:/srv/deployment-charts/helmfile.d/admin_ng# kubectl label nodes dse-k8s-worker1005.eqiad.wmnet failure-domain.beta.kubernetes.io/region=eqiad
node/dse-k8s-worker1005.eqiad.wmnet labeled
root@deploy1002:/srv/deployment-charts/helmfile.d/admin_ng# kubectl label nodes dse-k8s-worker1006.eqiad.wmnet failure-domain.beta.kubernetes.io/region=eqiad
node/dse-k8s-worker1006.eqiad.wmnet labeled
root@deploy1002:/srv/deployment-charts/helmfile.d/admin_ng# kubectl label nodes dse-k8s-worker1007.eqiad.wmnet failure-domain.beta.kubernetes.io/region=eqiad
node/dse-k8s-worker1007.eqiad.wmnet labeled
root@deploy1002:/srv/deployment-charts/helmfile.d/admin_ng# kubectl label nodes dse-k8s-worker1008.eqiad.wmnet failure-domain.beta.kubernetes.io/region=eqiad
node/dse-k8s-worker1008.eqiad.wmnet labeled

Note the rack-specific zone labels for the servers in rows E and F:

root@deploy1002:/srv/deployment-charts/helmfile.d/admin_ng# kubectl label nodes dse-k8s-worker1005.eqiad.wmnet failure-domain.beta.kubernetes.io/zone=row-e1
node/dse-k8s-worker1005.eqiad.wmnet labeled
root@deploy1002:/srv/deployment-charts/helmfile.d/admin_ng# kubectl label nodes dse-k8s-worker1006.eqiad.wmnet failure-domain.beta.kubernetes.io/zone=row-e3
node/dse-k8s-worker1006.eqiad.wmnet labeled
root@deploy1002:/srv/deployment-charts/helmfile.d/admin_ng# kubectl label nodes dse-k8s-worker1007.eqiad.wmnet failure-domain.beta.kubernetes.io/zone=row-f1
node/dse-k8s-worker1007.eqiad.wmnet labeled
root@deploy1002:/srv/deployment-charts/helmfile.d/admin_ng# kubectl label nodes dse-k8s-worker1008.eqiad.wmnet failure-domain.beta.kubernetes.io/zone=row-f3
node/dse-k8s-worker1008.eqiad.wmnet labeled
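
For reference, the zone assignments used across all eight workers can be collected into one mapping and checked afterwards. This is a hedged sketch, not the commands that were run: `ZONES`, `label_zones` and `KUBECTL` are hypothetical names, the node-to-zone pairs are copied from the commands in this task, and the script defaults to a dry run.

```shell
# Dry run by default: commands are printed, not executed.
# Set KUBECTL=kubectl to apply them for real.
KUBECTL="${KUBECTL:-echo kubectl}"

# node-number:zone pairs, taken from the kubectl commands in this task.
ZONES="1001:row-a 1002:row-b 1003:row-c 1004:row-d 1005:row-e1 1006:row-e3 1007:row-f1 1008:row-f3"

# Hypothetical helper: applies the zone label for every pair in ZONES.
label_zones() {
  for pair in $ZONES; do
    node="dse-k8s-worker${pair%%:*}.eqiad.wmnet"   # text before the colon
    zone="${pair#*:}"                              # text after the colon
    $KUBECTL label nodes "$node" "failure-domain.beta.kubernetes.io/zone=${zone}"
  done
}

label_zones

# Show both labels as extra columns to check the result.
$KUBECTL get nodes -L failure-domain.beta.kubernetes.io/region,failure-domain.beta.kubernetes.io/zone
```

Keeping the mapping in one place makes the rack-level zones in rows E and F easy to review next to the row-level zones in rows A to D.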

Change 828052 merged by Btullis:

[operations/puppet@production] Label the eight dse-k8s-worker nodes

https://gerrit.wikimedia.org/r/828052