
`kubectl get pods` fails after switching to new k8s cluster
Closed, Invalid (Public)

Description

$ become fourohfour
$ webservice stop
$ kubectl config use-context toolforge
$ webservice --backend=kubernetes python3.5 start
$ kubectl get po
error: group map[authentication.k8s.io:0xc8203c2ee0 extensions:0xc820144e00 rbac.authorization.k8s.io:0xc820122070 federation:0xc8203c2930 :0xc8203c2e00 authorization.k8s.io:0xc8203c3030 autoscaling:0xc8203c30a0 policy:0xc820122000 storage.k8s.io:0xc8201220e0 events.k8s.io:0xc8203747e0 apps:0xc8203c2e70 batch:0xc8203c3260 certificates.k8s.io:0xc8203c32d0 componentconfig:0xc8203c3340] is already registered
$ which kubectl
/usr/local/bin/kubectl
$ kubectl config current-context
toolforge
$ webservice status
Your webservice of type python3.5 is running

This error seems to happen for any kubectl get or describe command.

Event Timeline

aborrero triaged this task as High priority.
aborrero moved this task from Inbox to Doing on the cloud-services-team (Kanban) board.
aborrero added a subscriber: aborrero.

The version of kubectl on the tools bastions is outdated:

tools.fourohfour@tools-sgebastion-09:~$ /usr/local/bin/kubectl version
Client Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.12", GitCommit:"19e81afecf5eb2b7838c35e2cbf776aff04dc34c", GitTreeState:"clean", BuildDate:"2017-04-20T21:01:06Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.6", GitCommit:"7015f71e75f670eb9e7ebd4b5749639d42e20079", GitTreeState:"clean", BuildDate:"2019-11-13T11:11:50Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}

We need to bump that.
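To put a number on how outdated the client is, here is a minimal sketch that compares the two minor versions from the output above. The values are copied by hand rather than parsed, and the one-minor-version threshold comes from the upstream Kubernetes version skew policy for kubectl, not from this task:

```shell
# Minor versions copied from the `kubectl version` output above.
client_minor=4
server_minor=15

# Absolute difference between the two minor versions.
skew=$(( server_minor - client_minor ))
if [ "$skew" -lt 0 ]; then
    skew=$(( -skew ))
fi

# Upstream policy: kubectl is supported within one minor version of the API server.
if [ "$skew" -gt 1 ]; then
    echo "kubectl is $skew minor versions away from the server: outside the supported skew"
else
    echo "kubectl is within the supported skew"
fi
```

With a 1.4 client against a 1.15 server the difference is 11 minor versions, which is consistent with the client failing on every get or describe call.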

As part of migrating to the new cluster, a bash alias should be created to use /usr/bin/kubectl. This is known to be part of the migration process.
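A minimal sketch of such an alias follows. Where it lives is an assumption (for example the tool account's `~/.bash_profile`); the task itself only says the alias should point at /usr/bin/kubectl:

```shell
# Assumption: this line is added to the tool account's bash startup file.
# It makes `kubectl` resolve to the newer client instead of /usr/local/bin/kubectl.
alias kubectl=/usr/bin/kubectl
```

After starting a new login shell, `type kubectl` should report the alias rather than the old binary.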

tools.fourohfour@tools-sgebastion-09:~$ /usr/bin/kubectl version
Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.6", GitCommit:"7015f71e75f670eb9e7ebd4b5749639d42e20079", GitTreeState:"clean", BuildDate:"2019-11-13T11:20:18Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.6", GitCommit:"7015f71e75f670eb9e7ebd4b5749639d42e20079", GitTreeState:"clean", BuildDate:"2019-11-13T11:11:50Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}

Another, more important issue is that the bastion limits the number of threads by means of systemd, and the newer kubectl requires more:

tools.fourohfour@tools-sgebastion-09:~$ /usr/bin/kubectl get pods
runtime: failed to create new OS thread (have 30 already; errno=11)
runtime: may need to increase max user processes (ulimit -u)
fatal error: newosproc

But this is probably a topic for another ticket.
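The limit named in the error message can be checked from a bastion shell. This is a minimal sketch assuming bash; it only reads the per-user process limit that `ulimit -u` reports, and any separate systemd-side setting is not shown here:

```shell
# Read the current soft limit on user processes (threads count against it on Linux).
nproc_limit=$(ulimit -u)
echo "max user processes: $nproc_limit"
```

Comparing this value before and after the nproc change is one way to confirm that the new default has reached a given session.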

Change 558523 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] toolforge: bastion: raise default value for nproc

https://gerrit.wikimedia.org/r/558523

Change 558523 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] toolforge: bastion: raise default value for nproc

https://gerrit.wikimedia.org/r/558523