
Error joining new worker node to Toolforge Kubernetes cluster
Closed, Resolved · Public · Security

Description

I built tools-k8s-worker-[6-14] and now I want to join them to the cluster.

$ ssh root@tools-k8s-control-1.tools.eqiad.wmflabs
$ kubeadm token create
54uehz.m8phs2y9tubxp92o
$ openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
1cbcba20a201006b0359d5884e94567a07a8d809adcc8ad4f8402a64f57ad45b
$ exit

$ ssh root@tools-k8s-worker-6.tools.eqiad.wmflabs
$ kubeadm join k8s.tools.eqiad1.wikimedia.cloud:6443 --token 54uehz.m8phs2y9tubxp92o --discovery-token-ca-cert-hash sha256:1cbcba20a201006b0359d5884e94567a07a8d809adcc8ad4f8402a64f57ad45b
[preflight] Running pre-flight checks
        [WARNING SystemVerification]: this Docker version is not on the list of validated versions: 19.03.5. Latest validated version: 18.09
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
error execution phase preflight: unable to fetch the kubeadm-config ConfigMap: failed to decode cluster configuration data: v1beta2.ClusterConfiguration.APIServer: v1beta2.APIServer.ControlPlaneComponent: ExtraVolumes: []v1beta2.HostPathMount: decode slice: expect [ or n, but found {, error found in #10 byte of ...|Volumes":{"hostPath"|..., bigger context ...|E_ECDSA_WITH_AES_256_GCM_SHA384"},"extraVolumes":{"hostPath":"/etc/kubernetes/admission","mountPath"|...

Event Timeline

bd808 created this task. · Tue, Jan 7, 5:44 AM
Restricted Application added a subscriber: Aklapper. · Tue, Jan 7, 5:44 AM
bd808 added a comment. · Edited · Tue, Jan 7, 5:54 AM

I wonder if this hunk of the config map (kubectl -n kube-system get cm kubeadm-config -oyaml):

extraVolumes:
  name: "/etc/kubernetes/admission"
  hostPath: "/etc/kubernetes/admission"
  mountPath: "/etc/kubernetes/admission"
  readOnly: true
  pathType: Directory

should really look like:

extraVolumes:
  - name: "/etc/kubernetes/admission"
    hostPath: "/etc/kubernetes/admission"
    mountPath: "/etc/kubernetes/admission"
    readOnly: true
    pathType: Directory
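
The difference between the two hunks is only one of shape: the first is a single mapping, the second is a list containing that mapping. A minimal sketch of that transformation follows, assuming the config has already been parsed into Python objects; `as_volume_list` is a hypothetical helper written for illustration and is not part of kubeadm.

```python
from typing import Any, Dict, List, Union

Volume = Dict[str, Any]


def as_volume_list(extra_volumes: Union[None, Volume, List[Volume]]) -> List[Volume]:
    """Return extraVolumes as a list of mount mappings (hypothetical helper).

    A bare mapping (the first hunk) is wrapped in a one-element list
    (the second hunk); a list is copied; None becomes an empty list.
    """
    if extra_volumes is None:
        return []
    if isinstance(extra_volumes, dict):
        return [extra_volumes]
    return list(extra_volumes)


# The mount from the hunks above, as a parsed mapping.
admission = {
    "name": "/etc/kubernetes/admission",
    "hostPath": "/etc/kubernetes/admission",
    "mountPath": "/etc/kubernetes/admission",
    "readOnly": True,
    "pathType": "Directory",
}

print(as_volume_list(admission))
```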
aborrero triaged this task as High priority. · Tue, Jan 7, 11:01 AM
aborrero moved this task from Inbox to Important on the cloud-services-team (Kanban) board.

I have never seen this error before, but yes, a syntax error would make sense. The next question is how it was accepted by the API in the first place, or how the API is producing invalid YAML.

Can this error be reproduced in toolsbeta?

Bstorm added a comment. · Tue, Jan 7, 3:17 PM

That'd be my fault. Fixing.

Bstorm added a comment. · Tue, Jan 7, 3:19 PM

That is not invalid YAML; it is an invalid config. The API doesn't care what goes in a ConfigMap, and the contents are only validated when they are passed through kubeadm. When I changed the cluster design, I had to update the ConfigMap by hand, and I clearly made a mistake.
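
To illustrate the two layers of checking, here is a rough sketch under stated assumptions. The embedded ClusterConfiguration is written as JSON rather than YAML only because the Python standard library has no YAML parser, and `api_accepts`, `decode_errors`, and `make_config_map` are hypothetical stand-ins, not real Kubernetes or kubeadm functions. The point is that the storage layer sees an opaque string, while the decode step is the first place the shape of `extraVolumes` matters.

```python
import json
from typing import Any, Dict, List


def api_accepts(config_map: Dict[str, Any]) -> bool:
    """Stand-in for the API-side view: every data value is just a string."""
    return all(isinstance(v, str) for v in config_map.get("data", {}).values())


def decode_errors(config_map: Dict[str, Any]) -> List[str]:
    """Stand-in for the kubeadm-side decode: check the shape of extraVolumes."""
    cluster = json.loads(config_map["data"]["ClusterConfiguration"])
    volumes = cluster.get("apiServer", {}).get("extraVolumes")
    if volumes is None or isinstance(volumes, list):
        return []
    return ["apiServer.extraVolumes: expected a list, found " + type(volumes).__name__]


def make_config_map(extra_volumes: Any) -> Dict[str, Any]:
    """Build a toy ConfigMap whose ClusterConfiguration is a serialized string."""
    cluster = {"apiServer": {"extraVolumes": extra_volumes}}
    return {"kind": "ConfigMap", "data": {"ClusterConfiguration": json.dumps(cluster)}}


mount = {
    "hostPath": "/etc/kubernetes/admission",
    "mountPath": "/etc/kubernetes/admission",
}
broken = make_config_map(mount)    # mapping where a list is expected
fixed = make_config_map([mount])   # list of mappings

for label, cm in (("broken", broken), ("fixed", fixed)):
    print(label, api_accepts(cm), decode_errors(cm))
```

Both variants pass the string-only check, and only the decode step flags the mapping form.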

Bstorm added a comment. · Tue, Jan 7, 3:21 PM

I'm checking whether extraVolumes takes an array or a hash in the source documentation (it is only documented there).

Bstorm added a comment. · Tue, Jan 7, 3:25 PM

Done. Please try again. https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2 <-- It is supposed to be a list/array there.

Bstorm added a comment. · Tue, Jan 7, 3:26 PM

Fixing in toolsbeta and puppet if it works :)

Change 562532 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] toolforge-k8s: switch extraVolumes to an array

https://gerrit.wikimedia.org/r/562532

bd808 added a comment. · Tue, Jan 7, 3:34 PM

$ kubeadm join k8s.tools.eqiad1.wikimedia.cloud:6443 --token 54uehz.m8phs2y9tubxp92o --discovery-token-ca-cert-hash sha256:1cbcba20a201006b0359d5884e94567a07a8d809adcc8ad4f8402a64f57ad45b
[preflight] Running pre-flight checks
        [WARNING SystemVerification]: this Docker version is not on the list of validated versions: 19.03.5. Latest validated version: 18.09
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.15" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

Mentioned in SAL (#wikimedia-cloud) [2020-01-07T15:35:35Z] <bstorm_> changed kubeadm-config to use a list instead of a hash for extravols on the apiserver in the new k8s cluster T242067

Bstorm added a comment. · Tue, Jan 7, 3:38 PM

Looks good to me. Sorry!

bd808 raised the priority of this task from High to Needs Triage. · Tue, Jan 7, 3:42 PM
bd808 set Security to Software security bug.
bd808 added a project: Security.
bd808 changed the visibility from "Public (No Login Required)" to "Custom Policy".
bd808 changed the subtype of this task from "Task" to "Security Issue".

Tokens are live

bd808 closed this task as Resolved. · Tue, Jan 7, 3:43 PM
bd808 assigned this task to Bstorm.
bd808 triaged this task as High priority.
bd808 removed a project: Patch-For-Review.
Bstorm added a comment. · Tue, Jan 7, 3:44 PM

Turns out deleting a bootstrap token is dead easy:

root@tools-k8s-control-1:~# kubeadm token delete 54uehz.m8phs2y9tubxp92o
bootstrap token "54uehz" deleted
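
For pasting join output into a public task while a token is still live, one option is to redact the secret half first. The sketch below relies on the documented bootstrap token format (a 6-character ID, a dot, and a 16-character secret, all from `[a-z0-9]`); `redact_tokens` is a hypothetical helper, not a kubeadm feature.

```python
import re

# kubeadm bootstrap tokens have the form "<6-char id>.<16-char secret>",
# with both parts drawn from [a-z0-9].
TOKEN_RE = re.compile(r"\b([a-z0-9]{6})\.([a-z0-9]{16})\b")


def redact_tokens(text: str) -> str:
    """Replace the secret half of every bootstrap token, keeping the ID."""
    return TOKEN_RE.sub(r"\1.<redacted>", text)


line = "kubeadm join k8s.tools.eqiad1.wikimedia.cloud:6443 --token 54uehz.m8phs2y9tubxp92o"
print(redact_tokens(line))
```

The token ID is kept because it is what `kubeadm token delete` reports, so the redacted log can still be matched to the token that was removed.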
bd808 changed the visibility from "Custom Policy" to "Public (No Login Required)".
Restricted Application added a project: Security. · Tue, Jan 7, 3:58 PM

Change 562532 merged by Bstorm:
[operations/puppet@production] toolforge-k8s: switch extraVolumes to an array

https://gerrit.wikimedia.org/r/562532