`kubectl get pods` fails after switching to new k8s cluster
Closed, Invalid | Public

Description

$ become fourohfour
$ webservice stop
$ kubectl config use-context toolforge
$ webservice --backend=kubernetes python3.5 start
$ kubectl get po
error: group map[authentication.k8s.io:0xc8203c2ee0 extensions:0xc820144e00 rbac.authorization.k8s.io:0xc820122070 federation:0xc8203c2930 :0xc8203c2e00 authorization.k8s.io:0xc8203c3030 autoscaling:0xc8203c30a0 policy:0xc820122000 storage.k8s.io:0xc8201220e0 events.k8s.io:0xc8203747e0 apps:0xc8203c2e70 batch:0xc8203c3260 certificates.k8s.io:0xc8203c32d0 componentconfig:0xc8203c3340] is already registered
$ which kubectl
/usr/local/bin/kubectl
$ kubectl config current-context
toolforge
$ webservice status
Your webservice of type python3.5 is running

This error seems to happen for any kubectl get or describe command.

Event Timeline

bd808 created this task. Dec 17 2019, 5:58 AM
aborrero claimed this task. Dec 17 2019, 1:51 PM
aborrero triaged this task as High priority.
aborrero moved this task from Inbox to Doing on the cloud-services-team (Kanban) board.
aborrero added a subscriber: aborrero.

The version of kubectl on the tools bastions is outdated:

tools.fourohfour@tools-sgebastion-09:~$ /usr/local/bin/kubectl version
Client Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.12", GitCommit:"19e81afecf5eb2b7838c35e2cbf776aff04dc34c", GitTreeState:"clean", BuildDate:"2017-04-20T21:01:06Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.6", GitCommit:"7015f71e75f670eb9e7ebd4b5749639d42e20079", GitTreeState:"clean", BuildDate:"2019-11-13T11:11:50Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}

We need to bump that.
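The gap between the two versions above (client v1.4.12, server v1.15.6) is far outside the Kubernetes version skew policy, which supports kubectl within one minor version of the API server. A minimal sketch of that check follows; the helper names `minor_of` and `within_skew` are hypothetical, and the sketch assumes both versions share the same major number.

```shell
# Hypothetical helper: extract the minor number from a GitVersion
# string such as "v1.15.6" (prints "15").
minor_of() {
    printf '%s\n' "$1" | sed -E 's/^v?[0-9]+\.([0-9]+).*/\1/'
}

# Hypothetical helper: succeeds when the client and server minor
# versions differ by at most one. Assumes the major versions match.
within_skew() {
    client=$(minor_of "$1")
    server=$(minor_of "$2")
    diff=$((client - server))
    # Strip a leading minus sign to get the absolute difference.
    [ "${diff#-}" -le 1 ]
}
```

With the versions from this task, `within_skew v1.4.12 v1.15.6` fails and `within_skew v1.15.6 v1.15.6` succeeds.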

As part of migrating to the new cluster, a bash alias should be created to use /usr/bin/kubectl. This is known to be part of the migration process.
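The alias mentioned above could look roughly like this. It is a sketch only: the helper name `persist_kubectl_alias` is hypothetical, and which profile file the tool account's login shell reads is an assumption to verify against the migration instructions.

```shell
# Point kubectl at the newer client for the current shell session.
alias kubectl=/usr/bin/kubectl

# Hypothetical helper: append the alias to the given profile file,
# unless an identical line is already present.
persist_kubectl_alias() {
    profile="$1"
    line='alias kubectl=/usr/bin/kubectl'
    grep -qxF "$line" "$profile" 2>/dev/null || printf '%s\n' "$line" >> "$profile"
}
```

Calling `persist_kubectl_alias "$HOME/.profile"` once would keep the alias across logins, assuming `~/.profile` is the file that gets sourced.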

tools.fourohfour@tools-sgebastion-09:~$ /usr/bin/kubectl version
Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.6", GitCommit:"7015f71e75f670eb9e7ebd4b5749639d42e20079", GitTreeState:"clean", BuildDate:"2019-11-13T11:20:18Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.6", GitCommit:"7015f71e75f670eb9e7ebd4b5749639d42e20079", GitTreeState:"clean", BuildDate:"2019-11-13T11:11:50Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}

Another, more important issue is that the bastion limits the number of threads by means of systemd, and the newer kubectl requires more:

tools.fourohfour@tools-sgebastion-09:~$ /usr/bin/kubectl get pods
runtime: failed to create new OS thread (have 30 already; errno=11)
runtime: may need to increase max user processes (ulimit -u)
fatal error: newosproc

But this is probably a topic for another ticket.
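The Go runtime hint above points at the "max user processes" limit (`ulimit -u`). One way to inspect the soft limit currently in effect is to read it from `/proc/<pid>/limits` on Linux. The helper name `max_processes_soft` is hypothetical, and this task does not say whether the bastion enforces its cap through this limit or through a separate systemd tasks limit, so treat the sketch as a first diagnostic step only.

```shell
# Hypothetical helper: print the soft "Max processes" limit from a file
# in /proc/<pid>/limits format. Defaults to the limits of the calling
# process tree via /proc/self/limits (Linux only).
max_processes_soft() {
    # Line layout: "Max processes  <soft>  <hard>  processes"
    awk '/^Max processes/ { print $3 }' "${1:-/proc/self/limits}"
}
```

Running `max_processes_soft` with no argument on the bastion shows the value the shell and its children inherit. A value near 30 would match the "have 30 already" message above.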

aborrero closed this task as Invalid. Dec 17 2019, 2:00 PM

Change 558523 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] toolforge: bastion: raise default value for nproc

https://gerrit.wikimedia.org/r/558523

Change 558523 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] toolforge: bastion: raise default value for nproc

https://gerrit.wikimedia.org/r/558523