
Failure of "kubectl get pods" for the editgroups project
Closed, Resolved · Public

Description

When running kubectl get pods from the editgroups project on the bastion, I get the following:

tools.editgroups@tools-sgebastion-07:~$ kubectl get pods
runtime: failed to create new OS thread (have 12 already; errno=11)
fatal error: newosproc

runtime stack:
runtime.throw(0x2689df0, 0x9)
	/usr/local/go/src/runtime/panic.go:547 +0x90
runtime.newosproc(0xc8204a8800, 0xc8204b7fc0)
	/usr/local/go/src/runtime/os1_linux.go:149 +0x18c
runtime.newm(0x2aa1df0, 0x0)
	/usr/local/go/src/runtime/proc.go:1516 +0x135
runtime.startTheWorldWithSema()
	/usr/local/go/src/runtime/proc.go:1041 +0x206
runtime.systemstack(0xc820001840)
	/usr/local/go/src/runtime/asm_amd64.s:291 +0x79
runtime.mstart()
	/usr/local/go/src/runtime/proc.go:1051

goroutine 12 [running]:
runtime.systemstack_switch()
	/usr/local/go/src/runtime/asm_amd64.s:245 fp=0xc82002cd10 sp=0xc82002cd08
runtime.gcMarkTermination()
	/usr/local/go/src/runtime/mgc.go:1260 +0x3a9 fp=0xc82002cf20 sp=0xc82002cd10
runtime.gcMarkDone()
	/usr/local/go/src/runtime/mgc.go:1146 +0x29e fp=0xc82002cf40 sp=0xc82002cf20
runtime.gcBgMarkWorker(0xc82001b500)
	/usr/local/go/src/runtime/mgc.go:1480 +0x488 fp=0xc82002cfb8 sp=0xc82002cf40
runtime.goexit()
	/usr/local/go/src/runtime/asm_amd64.s:1998 +0x1 fp=0xc82002cfc0 sp=0xc82002cfb8
created by runtime.gcBgMarkStartWorkers
	/usr/local/go/src/runtime/mgc.go:1330 +0x92

goroutine 1 [runnable, locked to thread]:
regexp.(*bitState).reset(0xc8202afea0, 0x20, 0x2)
	/usr/local/go/src/regexp/backtrack.go:87 +0x101
regexp.(*machine).backtrack(0xc820362ea0, 0x7f138f260758, 0xc820362f78, 0x0, 0x20, 0x2, 0x0)
	/usr/local/go/src/regexp/backtrack.go:324 +0x106
regexp.(*Regexp).doExecute(0xc8201225a0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xc820247700, 0x20, 0x0, 0x2, ...)
	/usr/local/go/src/regexp/exec.go:449 +0x423
regexp.(*Regexp).replaceAll(0xc8201225a0, 0x0, 0x0, 0x0, 0xc820247700, 0x20, 0x2, 0xc820697d20, 0x0, 0x0, ...)
	/usr/local/go/src/regexp/regexp.go:501 +0x12d
regexp.(*Regexp).ReplaceAllString(0xc8201225a0, 0xc820247700, 0x20, 0x0, 0x0, 0x0, 0x0)
	/usr/local/go/src/regexp/regexp.go:460 +0x133
k8s.io/kubernetes/vendor/github.com/renstrom/dedent.Dedent(0xc820247700, 0x20, 0x0, 0x0)
	/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/renstrom/dedent/dedent.go:46 +0x239
k8s.io/kubernetes/pkg/kubectl/cmd.init()
	/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubectl/cmd/explain.go:43 +0xe60
k8s.io/kubernetes/cmd/kubectl/app.init()
	/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubectl/app/kubectl.go:39 +0x45
main.init()
	/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubectl/kubectl.go:30 +0x40

goroutine 9 [chan receive]:
k8s.io/kubernetes/vendor/github.com/golang/glog.(*loggingT).flushDaemon(0x3556da0)
	/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/golang/glog/glog.go:879 +0x67
created by k8s.io/kubernetes/vendor/github.com/golang/glog.init.1
	/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/golang/glog/glog.go:410 +0x297

goroutine 35 [syscall]:
os/signal.signal_recv(0x0)
	/usr/local/go/src/runtime/sigqueue.go:116 +0x132
os/signal.loop()
	/usr/local/go/src/os/signal/signal_unix.go:22 +0x18
created by os/signal.init.1
	/usr/local/go/src/os/signal/signal_unix.go:28 +0x37

The issue seems specific to this project: running the same command from other projects I have access to triggers no error.

Event Timeline

Pintoch created this task. Mar 23 2019, 2:08 PM
Restricted Application added a subscriber: Aklapper. Mar 23 2019, 2:08 PM
bd808 added a subscriber: bd808. Mar 24 2019, 5:29 AM
tools.editgroups@tools-sgebastion-07:~$ ps -U tools.editgroups
  PID TTY          TIME CMD
16795 pts/36   00:00:00 bash
27082 pts/45   00:00:00 bash
27119 pts/45   00:00:00 ps
31505 pts/36   00:00:00 webservice
31507 pts/36   00:00:10 kubectl

The pts/45 processes are mine, from running the check. The other three processes belong to pts/36:

$ ps -aef | grep pts/36
root     16789 28335  0 Mar23 pts/36   00:00:00 /usr/bin/sudo -niu tools.editgroups
tools.e+ 16795 16789  0 Mar23 pts/36   00:00:00 -bash
tools.e+ 27159 27082  0 02:27 pts/45   00:00:00 grep pts/36
pintoch  28335 28334  0 Mar22 pts/36   00:00:00 /bin/bash
tools.e+ 31505 16795  0 Mar23 pts/36   00:00:00 /usr/bin/python -Es /usr/local/bin/webservice --backend=kubernetes python shell
tools.e+ 31507 31505  0 Mar23 pts/36   00:00:10 /usr/bin/kubectl attach --tty --stdin interactive

There is a webservice shell process running as the tool:

pintoch  28334     1  0 Mar22 ?        00:00:03 SCREEN -S inspect
pintoch  28335 28334  0 Mar22 pts/36   00:00:00  \_ /bin/bash
root     16789 28335  0 Mar23 pts/36   00:00:00      \_ /usr/bin/sudo -niu tools.editgroups
tools.e+ 16795 16789  0 Mar23 pts/36   00:00:00          \_ -bash
tools.e+ 31505 16795  0 Mar23 pts/36   00:00:00              \_ /usr/bin/python -Es /usr/local/bin/webservice --backend=kubernetes python shell
tools.e+ 31507 31505  0 Mar23 pts/36   00:00:10                  \_ /usr/bin/kubectl attach --tty --stdin interactive
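
For reference, a process tree like the one above can be produced with ps's forest output (the exact command used is not recorded in the task, so this is just one way to get it):

$ ps -ef --forest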

The next part is not obvious. For several years we have had cgroups restrictions on the Toolforge bastions to try and keep them responsive for everyone. On the old Trusty bastions we used something called cgred to apply limits after processes were launched. The new Debian Stretch bastions (login.tools.wmflabs.org, stretch-dev.tools.wmflabs.org) are able to use systemd to enforce these limits instead, which does a more effective job than the old system.

The current limits are:

[Slice]
# each user can use max this % of one CPU
CPUQuota=30%
# each user can run max this number of tasks/threads
TasksMax=100
# slow down procs if they use more than this memory
MemoryHigh=512M
# if more than this memory is used, OOM killer will step in
MemoryMax=1024M
# users can't use swap memory
MemorySwapMax=0
# do accounting but don't limit anything for now
IOAccounting=yes
IPAccounting=yes
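
As a sanity check, the values systemd actually applied to the slice can be read back with systemctl (UID 10394 is the tools.editgroups session visible in the cgroup paths further down; the exact property names may vary slightly across systemd versions):

$ systemctl show user-10394.slice -p TasksMax -p CPUQuotaPerSecUSec -p MemoryHigh -p MemoryMax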

In this case there are not enough tasks left available in the cgroup to spawn the threads the kubectl command wants (runtime: failed to create new OS thread (have 12 already; errno=11); errno 11 is EAGAIN, which clone(2) returns when the cgroup's task limit is hit).

$ sudo cat /proc/31507/cgroup
10:cpuset:/
9:devices:/user.slice
8:blkio:/user.slice/user-10394.slice
7:memory:/user.slice/user-10394.slice
6:freezer:/
5:net_cls,net_prio:/
4:perf_event:/
3:cpu,cpuacct:/user.slice/user-10394.slice
2:pids:/user.slice/user-10394.slice/session-22384.scope
1:name=systemd:/user.slice/user-10394.slice/session-22384.scope
0::/user.slice/user-10394.slice/session-22384.scope
$ cat /sys/fs/cgroup/pids/user.slice/user-10394.slice/pids.current
25
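
For comparison, the limit applied to the same pids controller sits next to pids.current (same slice path, just a different file):

$ cat /sys/fs/cgroup/pids/user.slice/user-10394.slice/pids.max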

The currently resident processes seem to be consuming 25 of the 100 available tasks. Running kubectl get pods under gdb shows it wanting to create about 12 threads, which is far less than the theoretically remaining quota; yet I can still reproduce the original crash when running as the tools.editgroups user, so I think I'm missing something about how the actual usage is accounted.
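
(One way to cross-check the thread creation count without gdb, purely a suggestion and not something done in this task, is to count clone() calls with strace:)

$ strace -f -e trace=clone -c kubectl get pods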

I was about to give up and tell you "sorry, do not try to run so many things at once", but a bit of searching led me to GOMAXPROCS, an environment variable that limits the number of OS threads a Go binary uses to run goroutines in parallel. With this I can actually get the desired command to run to completion:

$ GOMAXPROCS=1 kubectl get pods
NAME                                      READY     STATUS    RESTARTS   AGE
editgroups-2937528593-1oxnz               1/1       Running   0          1d
editgroups.celery.sh-3671823902-za5bk     1/1       Running   0          2d
editgroups.listener.sh-1891604770-n2b3y   1/1       Running   0          8h
interactive                               1/1       Running   0          1d

I think there is still something I'm not getting right about the cgroup accounting, though.
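
If this keeps biting people, the variable can also be exported for the whole shell session instead of being prefixed to each command (a suggestion only; nothing is set up on the bastions for this):

$ export GOMAXPROCS=1
$ kubectl get pods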

Wow, thanks for the very thorough analysis! I cannot reproduce this anymore. Thanks for the GOMAXPROCS trick! I will add it to the docs.

Pintoch closed this task as Resolved.Mar 24 2019, 7:57 AM
Pintoch claimed this task.

Just confirming that the bug occurred again today and the proposed fix worked perfectly. Thanks again!