
Set up the webservice-related instances in toolsbeta
Closed, Resolved · Public

Description

Toolforge has these instance types:

  • tools-bastion-NN.tools.eqiad.wmflabs
  • tools-checker-NN.tools.eqiad.wmflabs
  • tools-clushmaster-NN.tools.eqiad.wmflabs
  • tools-cron-NN.tools.eqiad.wmflabs
  • tools-docker-builder-NN.tools.eqiad.wmflabs
  • tools-docker-registry-NN.tools.eqiad.wmflabs
  • tools-elastic-NN.tools.eqiad.wmflabs
  • tools-exec-NNNN.tools.eqiad.wmflabs
  • tools-exec-gift-trusty-NN.tools.eqiad.wmflabs
  • tools-flannel-etcd-NN.tools.eqiad.wmflabs
  • tools-grid-master.tools.eqiad.wmflabs
  • tools-grid-shadow.tools.eqiad.wmflabs
  • tools-k8s-etcd-NN.tools.eqiad.wmflabs
  • tools-k8s-master-NN.tools.eqiad.wmflabs
  • tools-logs-NN.tools.eqiad.wmflabs
  • tools-mail.tools.eqiad.wmflabs
  • tools-package-builder-NN.tools.eqiad.wmflabs
  • tools-paws-master-NN.tools.eqiad.wmflabs
  • tools-paws-worker-NNNN.tools.eqiad.wmflabs
  • tools-prometheus-NN.tools.eqiad.wmflabs
  • tools-proxy-NN.tools.eqiad.wmflabs
  • tools-puppetmaster-NN.tools.eqiad.wmflabs
  • tools-redis-NNNN.tools.eqiad.wmflabs
  • tools-services-NN.tools.eqiad.wmflabs
  • tools-static-NN.tools.eqiad.wmflabs
  • tools-webgrid-generic-NNNN.tools.eqiad.wmflabs
  • tools-webgrid-lighttpd-NNNN.tools.eqiad.wmflabs
  • tools-worker-NNNN.tools.eqiad.wmflabs

Of these:
PAWS:

  • tools-paws-master-NN.tools.eqiad.wmflabs
  • tools-paws-worker-NNNN.tools.eqiad.wmflabs

We are already using these to test PAWS.

Grid:

  • tools-grid-master.tools.eqiad.wmflabs
  • tools-grid-shadow.tools.eqiad.wmflabs

As toolsbeta-gridmaster-01.toolsbeta.eqiad.wmflabs

  • tools-exec-NNNN.tools.eqiad.wmflabs

As toolsbeta-grid-exec-1.toolsbeta.eqiad.wmflabs

  • tools-puppetmaster-NN.tools.eqiad.wmflabs

As toolsbeta-grid-puppetmaster.toolsbeta.eqiad.wmflabs?

  • tools-webgrid-generic-NNNN.tools.eqiad.wmflabs

Probably needed?

  • tools-webgrid-lighttpd-NNNN.tools.eqiad.wmflabs

As toolsbeta-grid-webgrid-lighttpd-1.toolsbeta.eqiad.wmflabs

Of the rest, these are probably unnecessary for webservice testing:

  • tools-clushmaster-NN.tools.eqiad.wmflabs
  • tools-checker-NN.tools.eqiad.wmflabs
  • tools-cron-NN.tools.eqiad.wmflabs
  • tools-elastic-NN.tools.eqiad.wmflabs
  • tools-logs-NN.tools.eqiad.wmflabs
  • tools-mail.tools.eqiad.wmflabs
  • tools-prometheus-NN.tools.eqiad.wmflabs
  • tools-redis-NNNN.tools.eqiad.wmflabs

Not sure:

  • tools-services-NN.tools.eqiad.wmflabs

This runs webservicemonitor

  • tools-docker-builder-NN.tools.eqiad.wmflabs
  • tools-docker-registry-NN.tools.eqiad.wmflabs

These host self-built k8s Docker containers. Use Toolforge's?

  • tools-package-builder-NN.tools.eqiad.wmflabs

This builds packages. Can we use the package releases from Toolforge's aptly?

  • tools-static-NN.tools.eqiad.wmflabs

This serves static content and shouldn't affect 'dynamic' webservices much.

Definitely needed:

  • tools-bastion-NN.tools.eqiad.wmflabs

The host from which the webservice command is executed.

  • tools-k8s-etcd-NN.tools.eqiad.wmflabs
  • tools-k8s-master-NN.tools.eqiad.wmflabs
  • tools-flannel-etcd-NN.tools.eqiad.wmflabs
  • tools-worker-NNNN.tools.eqiad.wmflabs

k8s related (https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Kubernetes#Components)

  • tools-proxy-NN.tools.eqiad.wmflabs

The dynamicproxy that webservice registers the service with; it also hosts the redis instance that dynamicproxy reads.

Event Timeline

zhuyifei1999 created this task.

Mentioned in SAL (#wikimedia-cloud) [2018-03-30T22:40:09Z] <zhuyifei1999_> copied over many prefix puppet configuration in horizon from toolforge T190893

Hello @zhuyifei1999. Thank you for helping with setting up a testing environment for the webservice command.

@zhuyifei1999, @bd808, I wish to explore toolsbeta as a testing environment this week and would love a tutorial on how toolsbeta functions; my search so far has been unsuccessful.

On the other hand, I have been playing around with the webservice from my tools account and reproducing some of the issues.

Mentioned in SAL (#wikimedia-cloud) [2018-04-19T21:43:15Z] <zhuyifei1999_> Start creating instances for webservice setup T190893

After much experimentation (P7069; why can't this be puppetized?), the grid now works (I hope).

etcd refuses to start (P7070), so I guess the k8s master can't start because of this.
The proxy's nginx refuses to start due to SSL cert issues (P7071). Apparently the public and private certs in labs/private (the public gerrit version) mismatch? I'm unable to set toollabs::proxy::ssl_certificate_name to false in hiera due to evaluation errors.

etcd refuses to start: P7070

Fixed this by applying "etcd::cluster_state": new instead of "etcd::cluster_state": existing in hiera
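In hiera terms, the change is just this one key (shown as a YAML fragment; the prefix it lives under in Horizon is not shown here):

```yaml
# Bootstrap a fresh etcd cluster instead of trying to join an existing one
"etcd::cluster_state": new
```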

maintain-kubeusers can't run, which might be the reason why the tokenauth file is missing:

zhuyifei1999@toolsbeta-k8s-master-01:~$ maintain-kubeusers 
Traceback (most recent call last):
  File "/usr/local/bin/maintain-kubeusers", line 17, in <module>
    import yaml
ImportError: No module named 'yaml'
zhuyifei1999@toolsbeta-k8s-master-01:~$ grep import `which maintain-kubeusers` | xargs -L1 -d'\n' -t python3 -c
python3 -c import logging 
python3 -c import argparse 
python3 -c import ldap3 
python3 -c import yaml 
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: No module named 'yaml'
python3 -c import json 
python3 -c import subprocess 
python3 -c import os 
python3 -c import string 
python3 -c import random 
python3 -c import time 
python3 -c import csv 
python3 -c import stat 
python3 -c         # but not to edit them. This is important, because
zhuyifei1999@toolsbeta-k8s-master-01:~$ apt-cache policy python3-yaml
python3-yaml:
  Installed: (none)
  Candidate: 3.11-2
  Version table:
     3.11-2 0
        500 http://httpredir.debian.org/debian/ jessie/main amd64 Packages

But on toolforge:

root@tools-k8s-master-01:~# ls /etc/kubernetes/tokenauth -l
-rw-r--r-- 1 kube kube 178770 May  2 13:18 /etc/kubernetes/tokenauth
root@tools-k8s-master-01:~# apt-cache policy python3-yaml
python3-yaml:
  Installed: 3.11-2
  Candidate: 3.11-2
  Version table:
 *** 3.11-2 0
        500 http://mirrors.wikimedia.org/debian/ jessie/main amd64 Packages
        100 /var/lib/dpkg/status

Change 430539 had a related patch set uploaded (by Zhuyifei1999; owner: Zhuyifei1999):
[operations/puppet@production] maintain_kubeusers.pp: use require_package and add python3-yaml

https://gerrit.wikimedia.org/r/430539

Got the proxy (hopefully) working, and then attempted to start the admin tool's webservice:

toolsbeta.admin@toolsbeta-bastion-01:~$ webservice start
Traceback (most recent call last):
  File "/usr/local/bin/webservice", line 8, in <module>
    from toollabs.webservice.backends import Backend, GridEngineBackend, KubernetesBackend
  File "/usr/lib/python2.7/dist-packages/toollabs/webservice/backends/__init__.py", line 3, in <module>
    from toollabs.webservice.backends.kubernetesbackend import KubernetesBackend
  File "/usr/lib/python2.7/dist-packages/toollabs/webservice/backends/kubernetesbackend.py", line 3, in <module>
    import pykube
  File "/usr/lib/python2.7/dist-packages/pykube/__init__.py", line 8, in <module>
    from .objects import (  # noqa
  File "/usr/lib/python2.7/dist-packages/pykube/objects.py", line 16, in <module>
    @six.python_2_unicode_compatible
AttributeError: 'module' object has no attribute 'python_2_unicode_compatible'

Oh, different versions again:

06:50:53 0 ✓ zhuyifei1999@tools-bastion-02: ~$ apt-cache policy python-six
python-six:
  Installed: 1.9.0-1~trusty1
  Candidate: 1.9.0-1~trusty1
  Version table:
 *** 1.9.0-1~trusty1 0
       1001 http://apt.wikimedia.org/wikimedia/ trusty-wikimedia/universe amd64 Packages
        100 /var/lib/dpkg/status
     1.5.2-1ubuntu1 0
        500 http://nova.clouds.archive.ubuntu.com/ubuntu/ trusty-updates/main amd64 Packages
     1.5.2-1 0
        500 http://nova.clouds.archive.ubuntu.com/ubuntu/ trusty/main amd64 Packages
toolsbeta.admin@toolsbeta-bastion-01:~$ apt-cache policy python-six
python-six:
  Installed: 1.5.2-1ubuntu1
  Candidate: 1.9.0-1~trusty1
  Version table:
     1.9.0-1~trusty1 0
       1001 http://apt.wikimedia.org/wikimedia/ trusty-wikimedia/universe amd64 Packages
 *** 1.5.2-1ubuntu1 0
        500 http://nova.clouds.archive.ubuntu.com/ubuntu/ trusty-updates/main amd64 Packages
        100 /var/lib/dpkg/status
     1.5.2-1 0
        500 http://nova.clouds.archive.ubuntu.com/ubuntu/ trusty/main amd64 Packages
N: Ignoring file '20auto-upgrades.ucf-dist' in directory '/etc/apt/apt.conf.d/' as it has an invalid filename extension

So this is fixed after $ sudo apt-get upgrade, but then:

toolsbeta.admin@toolsbeta-bastion-01:~$ webservice start
Traceback (most recent call last):
  File "/usr/local/bin/webservice", line 93, in <module>
    tool = Tool.from_currentuser()
  File "/usr/lib/python2.7/dist-packages/toollabs/common/tool.py", line 94, in from_currentuser
    return Tool.from_pwd(pwd_entry)
  File "/usr/lib/python2.7/dist-packages/toollabs/common/tool.py", line 103, in from_pwd
    'Tool username should begin with tools.')
toollabs.common.tool.InvalidToolException: Tool username should begin with tools.

Yuck. PREFIX = 'tools.' is hardcoded in the toollabs.common.Tool class found in toollabs/common/tool.py. Looks like we need to do some refactoring to introduce a configuration file for the webservice command so we can vary that by environment.
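A minimal sketch of what the refactor aims for: derive the tool-account prefix from /etc/wmcs-project rather than hardcoding 'tools.'. The function names and the fallback behavior here are my own illustration, not the actual patch:

```python
import os

# /etc/wmcs-project contains the Cloud VPS project name, e.g. "tools" or
# "toolsbeta" (per the patch description); the path is taken from the patch,
# everything else is a hypothetical sketch.
WMCS_PROJECT_FILE = '/etc/wmcs-project'


def project_prefix(path=WMCS_PROJECT_FILE):
    """Return the tool-account prefix, e.g. 'tools.' or 'toolsbeta.'."""
    try:
        with open(path) as f:
            project = f.read().strip()
    except IOError:
        project = 'tools'  # fall back to the historical hardcoded default
    return project + '.'


def tool_name(username, path=WMCS_PROJECT_FILE):
    """Strip the project prefix from a tool username, or raise."""
    prefix = project_prefix(path)
    if not username.startswith(prefix):
        raise ValueError('Tool username should begin with ' + prefix)
    return username[len(prefix):]
```

With this, the same webservice package works unmodified in both projects, since the prefix comes from the instance it runs on.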

Change 430647 had a related patch set uploaded (by Zhuyifei1999; owner: Zhuyifei1999):
[operations/software/tools-webservice@master] Load project name dynamically from /etc/wmcs-project

https://gerrit.wikimedia.org/r/430647

I installed the python3-yaml package directly (SAL entry; should have linked here), but it still refuses to start:

May 05 18:48:27 toolsbeta-k8s-master-01 maintain-kubeusers[15393]: Traceback (most recent call last):
May 05 18:48:27 toolsbeta-k8s-master-01 maintain-kubeusers[15393]: File "/usr/local/bin/maintain-kubeusers", line 356, in <module>
May 05 18:48:27 toolsbeta-k8s-master-01 maintain-kubeusers[15393]: cur_users = get_users_from_csv(args.tokenauth_output_path)
May 05 18:48:27 toolsbeta-k8s-master-01 maintain-kubeusers[15393]: File "/usr/local/bin/maintain-kubeusers", line 80, in get_users_from_csv
May 05 18:48:27 toolsbeta-k8s-master-01 maintain-kubeusers[15393]: with open(path, encoding='utf-8') as csvfile:
May 05 18:48:27 toolsbeta-k8s-master-01 maintain-kubeusers[15393]: FileNotFoundError: [Errno 2] No such file or directory: '/etc/kubernetes/tokenauth'

I guess I'll just bootstrap it with the same permission set as toolforge:

root@tools-k8s-master-01:~# stat /etc/kubernetes/tokenauth
  File: ‘/etc/kubernetes/tokenauth’
  Size: 179028    	Blocks: 352        IO Block: 4096   regular file
Device: fe03h/65027d	Inode: 393837      Links: 1
Access: (0644/-rw-r--r--)  Uid: (  116/    kube)   Gid: (  121/    kube)
Access: 2018-05-05 18:02:13.557787305 +0000
Modify: 2018-05-05 18:02:08.901707414 +0000
Change: 2018-05-05 18:02:08.901707414 +0000
 Birth: -
[...]
zhuyifei1999@toolsbeta-k8s-master-01:~$ sudo stat /etc/kubernetes/tokenauth
  File: ‘/etc/kubernetes/tokenauth’
  Size: 0         	Blocks: 0          IO Block: 4096   regular empty file
Device: fe03h/65027d	Inode: 1051477     Links: 1
Access: (0644/-rw-r--r--)  Uid: (  105/    kube)   Gid: (  110/    kube)
Access: 2018-05-05 18:51:33.342094774 +0000
Modify: 2018-05-05 18:51:33.342094774 +0000
Change: 2018-05-05 18:51:33.342094774 +0000
 Birth: -

Would it make sense to either make maintain-kubeusers ignore the file being missing, or have puppet create the file?
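For the first option, a hedged sketch of a missing-file-tolerant reader; the CSV column layout assumed here (token, username, id per row) is inferred from the traceback, not taken from the actual maintain-kubeusers code:

```python
import csv
import os


def get_users_from_csv(path):
    """Read provisioned users from a tokenauth-style CSV file.

    Treating a missing file as 'no users provisioned yet' (instead of
    crashing with FileNotFoundError) is the proposed behavior, not the
    current one.
    """
    if not os.path.exists(path):
        return {}  # first run on a fresh master: nothing provisioned yet
    users = {}
    with open(path, encoding='utf-8') as csvfile:
        for row in csv.reader(csvfile):
            if row:
                users[row[1]] = row  # keyed by username column
    return users
```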

Change 431110 had a related patch set uploaded (by Zhuyifei1999; owner: Zhuyifei1999):
[operations/puppet@production] maintain-kubeusers.systemd: get the project name from @labsproject

https://gerrit.wikimedia.org/r/431110

Mentioned in SAL (#wikimedia-cloud) [2018-05-05T19:51:36Z] <zhuyifei1999_> systemctl mask maintain-kubeusers because it's causing a mess, tries to get the tool list from toolforge T190893

Got the k8s apiserver up. After creating tokenauth it complained with May 05 20:06:21 toolsbeta-k8s-master-01 kube-apiserver[9888]: E0505 20:06:21.748829 9888 genericapiserver.go:742] Unable to listen for secure (crypto/tls: failed to parse private key); will try again. I applied "role::toollabs::k8s::master::use_puppet_certs": true and it seems to work now.

maintain-kubeusers seems to be able to run successfully if invoked with the correct project name (toolsbeta):

zhuyifei1999@toolsbeta-k8s-master-01:~$ sudo kubectl delete namespace test test2 admin toolschecker
namespace "test" deleted
namespace "test2" deleted
namespace "admin" deleted
namespace "toolschecker" deleted
zhuyifei1999@toolsbeta-k8s-master-01:~$ sudo truncate -s 0 /etc/kubernetes/abac
zhuyifei1999@toolsbeta-k8s-master-01:~$ sudo truncate -s 0 /etc/kubernetes/tokenauth
zhuyifei1999@toolsbeta-k8s-master-01:~$ sudo rm -rv /data/project/{test,test2,admin,toolschecker}/.kube/
removed ‘/data/project/test/.kube/config’
removed directory: ‘/data/project/test/.kube/’
removed ‘/data/project/test2/.kube/config’
removed directory: ‘/data/project/test2/.kube/’
removed ‘/data/project/admin/.kube/config’
removed directory: ‘/data/project/admin/.kube/’
removed ‘/data/project/toolschecker/.kube/config’
removed directory: ‘/data/project/toolschecker/.kube/’
zhuyifei1999@toolsbeta-k8s-master-01:~$ sudo /usr/local/bin/maintain-kubeusers --infrastructure-users /etc/kubernetes/infrastructure-users --project toolsbeta https://toolsbeta-k8s-master-01.toolsbeta.eqiad.wmflabs:6443 /etc/kubernetes/tokenauth /etc/kubernetes/abac
starting a run
Provisioned creds for infra user client-infrastructure
Provisioned creds for infra user proxy-infrastructure
Homedir already exists for /data/project/toolschecker
Wrote config in /data/project/toolschecker/.kube/config
(b'namespace "toolschecker" created\n', b'')
Provisioned creds for tool toolschecker
Homedir already exists for /data/project/test
Wrote config in /data/project/test/.kube/config
(b'namespace "test" created\n', b'')
Provisioned creds for tool test
Homedir already exists for /data/project/admin
Wrote config in /data/project/admin/.kube/config
(b'namespace "admin" created\n', b'')
Provisioned creds for tool admin
Homedir already exists for /data/project/test2
Wrote config in /data/project/test2/.kube/config
(b'namespace "test2" created\n', b'')
Provisioned creds for tool test2
Provisioned creds for infra user prometheus
finished run, wrote 7 new accounts
^CTraceback (most recent call last):
  File "/usr/local/bin/maintain-kubeusers", line 405, in <module>
    time.sleep(args.interval)
KeyboardInterrupt

Change 431285 had a related patch set uploaded (by Zhuyifei1999; owner: Zhuyifei1999):
[operations/puppet@production] toolforge k8s: allow /etc/wmcs-project to be mounted

https://gerrit.wikimedia.org/r/431285

After locally patching the relevant prefix-handling code in webservice, webservice on the grid seems to work: http://tools-beta.wmflabs.org/ (a nostalgic look; the response code is still 500, though). Finally!

K8s pods have the relevant webservice code baked into the container images, so they will fail until newer images with the patched webservice code are built. Also, toolsbeta-proxy-01 continues to spew error messages that make successfully running the webservice on k8s at its current stage even less likely (will debug later):

May 06 04:19:40 toolsbeta-proxy-01 flanneld[19093]: E0506 04:19:40.593105 19093 network.go:53] Failed to retrieve network config: 100: Key not found (/coreos.com) [3]

After copying the replica.my.cnf from tools.zhuyifei1999-test, https://tools-beta.wmflabs.org/ works perfectly :)

Now waiting for the above patches to be merged & deployed.

Change 430539 merged by Bstorm:
[operations/puppet@production] maintain_kubeusers.pp: use require_package and add python3-yaml

https://gerrit.wikimedia.org/r/430539

Change 431285 merged by Andrew Bogott:
[operations/puppet@production] toolforge k8s: allow /etc/wmcs-project to be mounted

https://gerrit.wikimedia.org/r/431285

Change 431110 merged by Andrew Bogott:
[operations/puppet@production] maintain-kubeusers.systemd: get the project name from @labsproject

https://gerrit.wikimedia.org/r/431110

Change 431618 had a related patch set uploaded (by Zhuyifei1999; owner: Zhuyifei1999):
[operations/puppet@production] Unbreak maintain_kubeusers.pp from dependency cycle

https://gerrit.wikimedia.org/r/431618

Change 431618 merged by Andrew Bogott:
[operations/puppet@production] Unbreak maintain_kubeusers.pp from dependency cycle

https://gerrit.wikimedia.org/r/431618

K8s is up!

toolsbeta.admin@toolsbeta-bastion-01:~$ webservice --backend kubernetes python shell
If you don't see a command prompt, try pressing enter.
toolsbeta.admin@interactive:~$ 
toolsbeta.admin@interactive:~$ echo Hello from Kubernetes\!
Hello from Kubernetes!
toolsbeta.admin@interactive:~$ logout
Session ended, resume using 'kubectl attach interactive -c interactive -i -t' command when the pod is running
Pod stopped. Session cannot be resumed.

Change 430647 merged by jenkins-bot:
[operations/software/tools-webservice@master] Mount & load project name dynamically from /etc/wmcs-project

https://gerrit.wikimedia.org/r/430647

Mentioned in SAL (#wikimedia-cloud) [2018-05-07T20:48:50Z] <zhuyifei1999_> building, signing, and publishing toollabs-webservice 0.39 T190893

Mentioned in SAL (#wikimedia-cloud) [2018-05-07T21:02:04Z] <zhuyifei1999_> re-building all docker images T190893

Got the k8s job running, but networking seems bugged out:

toolsbeta.admin@toolsbeta-bastion-01:~$ webservice --backend kubernetes php5.6 start
Starting webservice..
toolsbeta.admin@toolsbeta-bastion-01:~$ kubectl get pods
NAME                     READY     STATUS    RESTARTS   AGE
admin-1850377006-x00n5   1/1       Running   0          7s
toolsbeta.admin@toolsbeta-bastion-01:~$ kubectl log admin-1850377006-x00n5
W0507 21:52:41.743263    4014 cmd.go:345] log is DEPRECATED and will be removed in a future version. Use logs instead.
toolsbeta.admin@toolsbeta-bastion-01:~$ kubectl logs admin-1850377006-x00n5
toolsbeta.admin@toolsbeta-bastion-01:~$ webservice --backend kubernetes php5.6 shell
If you don't see a command prompt, try pressing enter.
toolsbeta.admin@interactive:~$
toolsbeta.admin@interactive:~$ webservice status
Traceback (most recent call last):
  File "/usr/bin/webservice", line 182, in <module>
    if job.get_state() != Backend.STATE_STOPPED:
  File "/usr/lib/python2.7/dist-packages/toollabs/webservice/backends/kubernetesbackend.py", line 402, in get_state
    pod = self._find_obj(pykube.Pod, self.webservice_label_selector)
  File "/usr/lib/python2.7/dist-packages/toollabs/webservice/backends/kubernetesbackend.py", line 210, in _find_obj
    selector=selector
  File "/usr/lib/python2.7/dist-packages/pykube/query.py", line 70, in get
    num = len(clone)
  File "/usr/lib/python2.7/dist-packages/pykube/query.py", line 122, in __len__
    return len(self.query_cache["objects"])
  File "/usr/lib/python2.7/dist-packages/pykube/query.py", line 115, in query_cache
    cache["response"] = self.execute().json()
  File "/usr/lib/python2.7/dist-packages/pykube/query.py", line 99, in execute
    r = self.api.get(**kwargs)
  File "/usr/lib/python2.7/dist-packages/pykube/http.py", line 125, in get
    return self.session.get(*args, **self.get_kwargs(**kwargs))
  File "/usr/lib/python2.7/dist-packages/requests/sessions.py", line 501, in get
    return self.request('GET', url, **kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/sessions.py", line 488, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/sessions.py", line 609, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/adapters.py", line 487, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='toolsbeta-k8s-master-01.toolsbeta.eqiad.wmflabs', port=6443): Max retries exceeded with url: /api/v1/namespaces/admin/pods?labelSelector=tools.wmflabs.org%2Fwebservice-version%3D1%2Cname%3Dadmin%2Ctools.wmflabs.org%2Fwebservice%3Dtrue (Caused by NewConnectionError('<requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x7fbe91f6ded0>: Failed to establish a new connection: [Errno -2] Name or service not known',))

flannel error? (T190893#4184614)

Mentioned in SAL (#wikimedia-cloud) [2018-05-08T02:18:30Z] <zhuyifei1999_> manually created flannel etcd key T190893

zhuyifei1999@toolsbeta-worker-1001:~$ curl https://toolsbeta-flannel-etcd-01.toolsbeta.eqiad.wmflabs:2379/v2/keys/?recursive=true; echo
{"action":"get","node":{"dir":true}}

[A long time of digging around...]
zhuyifei1999@toolsbeta-worker-1001:~$ curl https://toolsbeta-flannel-etcd-01.toolsbeta.eqiad.wmflabs:2379/v2/keys/coreos.com/network/config -XPUT -d value='{ "Network": "192.168.128.0/17", "Backend": { "Type": "vxlan" } }'
{"action":"set","node":{"key":"/coreos.com/network/config","value":"{ \"Network\": \"192.168.128.0/17\", \"Backend\": { \"Type\": \"vxlan\" } }","modifiedIndex":5,"createdIndex":5}}
zhuyifei1999@toolsbeta-worker-1001:~$ curl https://toolsbeta-flannel-etcd-01.toolsbeta.eqiad.wmflabs:2379/v2/keys/?recursive=true | jq .
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1267  100  1267    0     0  22778      0 --:--:-- --:--:-- --:--:-- 23036
{
  "action": "get",
  "node": {
    "dir": true,
    "nodes": [
      {
        "key": "/coreos.com",
        "dir": true,
        "nodes": [
          {
            "key": "/coreos.com/network",
            "dir": true,
            "nodes": [
              {
                "key": "/coreos.com/network/config",
                "value": "{ \"Network\": \"192.168.128.0/17\", \"Backend\": { \"Type\": \"vxlan\" } }",
                "modifiedIndex": 5,
                "createdIndex": 5
              },
              {
                "key": "/coreos.com/network/subnets",
                "dir": true,
                "nodes": [
                  {
                    "key": "/coreos.com/network/subnets/192.168.215.0-24",
                    "value": "{\"PublicIP\":\"10.68.18.110\",\"BackendType\":\"vxlan\",\"BackendData\":{\"VtepMAC\":\"ce:b0:30:4d:62:48\"}}",
                    "expiration": "2018-05-09T02:17:23.548978467Z",
                    "ttl": 86392,
                    "modifiedIndex": 6,
                    "createdIndex": 6
                  },
                  {
                    "key": "/coreos.com/network/subnets/192.168.130.0-24",
                    "value": "{\"PublicIP\":\"10.68.20.72\",\"BackendType\":\"vxlan\",\"BackendData\":{\"VtepMAC\":\"ba:57:9d:85:c2:e4\"}}",
                    "expiration": "2018-05-09T02:17:23.982088555Z",
                    "ttl": 86392,
                    "modifiedIndex": 7,
                    "createdIndex": 7
                  },
                  {
                    "key": "/coreos.com/network/subnets/192.168.131.0-24",
                    "value": "{\"PublicIP\":\"10.68.22.202\",\"BackendType\":\"vxlan\",\"BackendData\":{\"VtepMAC\":\"fa:3b:36:2a:31:49\"}}",
                    "expiration": "2018-05-09T02:17:24.453518564Z",
                    "ttl": 86393,
                    "modifiedIndex": 8,
                    "createdIndex": 8
                  }
                ],
                "modifiedIndex": 6,
                "createdIndex": 6
              }
            ],
            "modifiedIndex": 5,
            "createdIndex": 5
          }
        ],
        "modifiedIndex": 5,
        "createdIndex": 5
      }
    ]
  }
}

Let's see if I fixed flannel.
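For reference, the same config write can be scripted with the Python stdlib against etcd's v2 HTTP API. This is an untested sketch equivalent to the curl -XPUT above (the hostname is the one from the transcript; it needs network access to actually run):

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# etcd host from the transcript above
ETCD = 'https://toolsbeta-flannel-etcd-01.toolsbeta.eqiad.wmflabs:2379'


def flannel_config(network='192.168.128.0/17', backend='vxlan'):
    """Build the JSON value flannel expects under /coreos.com/network/config."""
    return json.dumps({'Network': network, 'Backend': {'Type': backend}})


def put_flannel_config(value, etcd=ETCD):
    """PUT the key via etcd's v2 keys API (form-encoded 'value=...' body),
    equivalent to the curl -XPUT in the transcript."""
    req = Request(etcd + '/v2/keys/coreos.com/network/config',
                  data=urlencode({'value': value}).encode('utf-8'))
    req.get_method = lambda: 'PUT'  # urllib defaults to POST when data is set
    return json.loads(urlopen(req).read().decode('utf-8'))
```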

kube-proxy is still broken:

May 08 02:21:44 toolsbeta-worker-1001 kube-proxy[1566]: E0508 02:21:44.886056    1566 reflector.go:203] pkg/proxy/config/api.go:33: Failed to list *api.Endpoints: Get http://127.0.0.1:8080/api/v1/endpoints?resourceVersion=0: dial tcp 127.0.0.1:8080: getsockopt: connection refused
May 08 02:21:44 toolsbeta-worker-1001 kube-proxy[1566]: E0508 02:21:44.886172    1566 reflector.go:203] pkg/proxy/config/api.go:30: Failed to list *api.Service: Get http://127.0.0.1:8080/api/v1/services?resourceVersion=0: dial tcp 127.0.0.1:8080: getsockopt: connection refuse

That 127.0.0.1:8080 is supposed to be toolsbeta-k8s-master-01.toolsbeta.eqiad.wmflabs:8080

I did a comparison with tools-worker-1001, and this file is missing on toolforge:

zhuyifei1999@toolsbeta-worker-1001:~$ cat /etc/kubernetes/config
###
# Kubernetes: common config for the following services:
##
#   kube-apiserver.service
#   kube-controller-manager.service
#   kube-scheduler.service
#   kubelet.service
#   kube-proxy.service
##

# logging to stderr means we get it in the systemd journal
KUBE_LOGTOSTDERR="--logtostderr=true"

# journal message level, 0 is debug
KUBE_LOG_LEVEL="--v=0"

# Should this cluster be allowed to run privileged docker containers
KUBE_ALLOW_PRIV="--allow-privileged=false"

# How the controller-manager, scheduler, and proxy find the apiserver
KUBE_MASTER="--master=http://127.0.0.1:8080"
[...]
root@tools-worker-1001:~# cat /etc/kubernetes/config
cat: /etc/kubernetes/config: No such file or directory
root@tools-worker-1001:~# ls -al /etc/kubernetes/
total 16
drwxr-xr-x   3 root root 4096 Aug  9  2017 .
drwxr-xr-x 104 root root 4096 May  5 21:30 ..
-r--------   1 root root  351 Aug  9  2017 kubeconfig
dr-xr-xr-x   2 root root 4096 Jun 28  2017 ssl

I moved that file away via $ sudo mv /etc/kubernetes/config /etc/kubernetes/config.bak, and kube-proxy stopped complaining, but with lots of warnings:

May 08 03:02:33 toolsbeta-worker-1001 sudo[21477]: zhuyifei1999 : TTY=pts/1 ; PWD=/mnt/nfs/labstore-secondary-home/zhuyifei1999 ; USER=root ; COMMAND=/bin/systemctl restart kube-proxy.service
May 08 03:02:33 toolsbeta-worker-1001 sudo[21477]: pam_unix(sudo:session): session opened for user root by zhuyifei1999(uid=0)
May 08 03:02:33 toolsbeta-worker-1001 systemd[1]: Stopping Kubernetes Kube-Proxy Server...
-- Subject: Unit kube-proxy.service has begun shutting down
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit kube-proxy.service has begun shutting down.
May 08 03:02:33 toolsbeta-worker-1001 systemd[1]: Starting Kubernetes Kube-Proxy Server...
-- Subject: Unit kube-proxy.service has begun with start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit kube-proxy.service has begun starting up.
May 08 03:02:33 toolsbeta-worker-1001 systemd[1]: Started Kubernetes Kube-Proxy Server.
-- Subject: Unit kube-proxy.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit kube-proxy.service has finished starting up.
-- 
-- The start-up result is done.
May 08 03:02:33 toolsbeta-worker-1001 sudo[21477]: pam_unix(sudo:session): session closed for user root
May 08 03:02:33 toolsbeta-worker-1001 kube-proxy[21481]: W0508 03:02:33.611583   21481 server.go:378] Flag proxy-mode="'iptables'" unknown, assuming iptables proxy
May 08 03:02:33 toolsbeta-worker-1001 kube-proxy[21481]: I0508 03:02:33.613700   21481 server.go:203] Using iptables Proxier.
May 08 03:02:33 toolsbeta-worker-1001 kube-proxy[21481]: W0508 03:02:33.767666   21481 server.go:436] Failed to retrieve node info: nodes "toolsbeta-worker-1001" not found
May 08 03:02:33 toolsbeta-worker-1001 kube-proxy[21481]: W0508 03:02:33.768331   21481 proxier.go:226] invalid nodeIP, initialize kube-proxy with 127.0.0.1 as nodeIP
May 08 03:02:33 toolsbeta-worker-1001 kube-proxy[21481]: I0508 03:02:33.768624   21481 server.go:215] Tearing down userspace rules.
May 08 03:02:33 toolsbeta-worker-1001 kube-proxy[21481]: I0508 03:02:33.791948   21481 conntrack.go:81] Set sysctl 'net/netfilter/nf_conntrack_max' to 131072
May 08 03:02:33 toolsbeta-worker-1001 kube-proxy[21481]: I0508 03:02:33.792479   21481 conntrack.go:66] Setting conntrack hashsize to 32768
May 08 03:02:33 toolsbeta-worker-1001 kube-proxy[21481]: I0508 03:02:33.792592   21481 conntrack.go:81] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_established' to 86400
May 08 03:02:33 toolsbeta-worker-1001 kube-proxy[21481]: I0508 03:02:33.792609   21481 conntrack.go:81] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_close_wait' to 3600

Networking is still broken inside the pod:

toolsbeta.admin@toolsbeta-bastion-01:~$ webservice --backend kubernetes php5.6 shell
If you don't see a command prompt, try pressing enter.
toolsbeta.admin@interactive:~$ curl www.google.com
curl: (6) Could not resolve host: www.google.com

I might be out of ideas (thinking of other difficult k8s networking bugs like T182722 T120561). Anyone got an idea?

I followed T182722#3834172 and saw a similar behavior of docker internal IPs being forwarded to DNS without NAT being applied:

zhuyifei1999@toolsbeta-worker-1001:~$ sudo tcpdump -i eth0 host labs-recursor0.wikimedia.org
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
23:32:07.240151 IP 172.17.0.2.46800 > labs-recursor0.wikimedia.org.domain: 63168+ A? www.google.com.toolsbeta.eqiad.wmflabs. (56)
23:32:07.240266 IP 172.17.0.2.46800 > labs-recursor0.wikimedia.org.domain: 62118+ AAAA? www.google.com.toolsbeta.eqiad.wmflabs. (56)

This IP is unexpectedly 172.17.0.2, instead of 192.168.*.* as in the linked comment. Indeed, the IP configuration for docker0 is broken:

root@tools-worker-1001:~# ifconfig docker0
docker0   Link encap:Ethernet  HWaddr 02:42:a5:2f:f4:e3  
          inet addr:192.168.168.1  Bcast:0.0.0.0  Mask:255.255.255.0
          inet6 addr: fe80::42:a5ff:fe2f:f4e3/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1450  Metric:1
          RX packets:44098063 errors:0 dropped:0 overruns:0 frame:0
          TX packets:47829857 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:28832505727 (26.8 GiB)  TX bytes:109367978093 (101.8 GiB)
[...]
zhuyifei1999@toolsbeta-worker-1001:~$ sudo ifconfig docker0
docker0   Link encap:Ethernet  HWaddr 02:42:b4:58:35:c6  
          inet addr:172.17.0.1  Bcast:0.0.0.0  Mask:255.255.0.0
          inet6 addr: fe80::42:b4ff:fe58:35c6/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:149 errors:0 dropped:0 overruns:0 frame:0
          TX packets:14 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:10276 (10.0 KiB)  TX bytes:1032 (1.0 KiB)

Googled for how the IP is configured, and grep-ed for the dockerd command-line args, and surprise, docker is started with different command-line args:

root@tools-worker-1001:~# ps auxfww | grep dockerd
root     11311  0.0  0.0  12728  2240 pts/0    S+   23:36   0:00          \_ grep dockerd
root      3495  1.3  0.8 2704132 66704 ?       Ssl  Mar19 984:28 dockerd -H fd:// --config-file=/etc/docker/daemon.json --bip=192.168.168.1/24 --mtu=1450
[...]
zhuyifei1999@toolsbeta-worker-1001:~$ ps auxfww | grep dockerd
zhuyife+  3607  0.0  0.0  12728  2252 pts/0    S+   23:36   0:00  |           \_ grep dockerd
root     27465  0.2  1.1 606276 48460 ?        Ssl  21:47   0:19 /usr/bin/dockerd -H fd://

So is the systemd unit the same? No...

root@tools-worker-1001:~# systemctl cat docker.service
# /lib/systemd/system/docker.service
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network.target docker.socket
Requires=docker.socket

[Service]
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/dockerd -H fd://
ExecReload=/bin/kill -s HUP $MAINPID
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
# Uncomment TasksMax if your systemd version supports it.
# Only systemd 226 and above support this version.
#TasksMax=infinity
TimeoutStartSec=0
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
# kill only the docker process, not all processes in the cgroup
KillMode=process

[Install]
WantedBy=multi-user.target

# /etc/systemd/system/docker.service.d/puppet-override.conf
# Docker override systemd for v1.11.2-0~jessie
[Unit]
After=network.target docker.socket flannel.service
Requires=docker.socket flannel.service

[Service]
EnvironmentFile=/run/flannel/subnet.env
# We need to clear ExecStart first before setting it again
ExecStart=
ExecStart=/usr/bin/docker daemon -H fd:// \
          --config-file=/etc/docker/daemon.json \
          --bip=${FLANNEL_SUBNET} \
          --mtu=${FLANNEL_MTU}
[...]
zhuyifei1999@toolsbeta-worker-1001:~$ systemctl cat docker.service
# /lib/systemd/system/docker.service
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network.target docker.socket
Requires=docker.socket

[Service]
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/dockerd -H fd://
ExecReload=/bin/kill -s HUP $MAINPID
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
# Uncomment TasksMax if your systemd version supports it.
# Only systemd 226 and above support this version.
#TasksMax=infinity
TimeoutStartSec=0
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
# kill only the docker process, not all processes in the cgroup
KillMode=process

[Install]
WantedBy=multi-user.target

/etc/systemd/system/docker.service.d/puppet-override.conf doesn't exist on toolsbeta but exists on toolforge.
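To see why the missing drop-in matters: systemd merges a unit's shipped file with any `.d/*.conf` drop-ins, and an empty `ExecStart=` line resets the command list so the override's command replaces the shipped one. A minimal Python sketch of that reset rule (not systemd's actual parser; `effective_execstart` is a hypothetical helper, and the fragments mirror the pastes above):

```python
# Sketch of systemd's "empty assignment resets the list" rule for ExecStart,
# applied across a shipped unit file plus a drop-in override fragment.
def effective_execstart(*unit_fragments):
    """Apply ExecStart lines from each fragment in order; an empty
    'ExecStart=' clears everything set so far, as systemd does."""
    commands = []
    for fragment in unit_fragments:
        for line in fragment.splitlines():
            line = line.strip()
            if line.startswith("ExecStart="):
                value = line[len("ExecStart="):].strip()
                if value == "":
                    commands = []  # ExecStart= on its own: reset the list
                else:
                    commands.append(value)
    return commands

shipped = "ExecStart=/usr/bin/dockerd -H fd://"
override = "ExecStart=\nExecStart=/usr/bin/docker daemon -H fd:// --bip=${FLANNEL_SUBNET}"

print(effective_execstart(shipped))            # only the shipped dockerd command
print(effective_execstart(shipped, override))  # the override replaces it
```

With the drop-in absent on toolsbeta, only the shipped `dockerd -H fd://` command ran, matching the `ps` output above.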

ok I think I figured out the cause:

How it should work: role::manifests::toollabs::k8s::worker -> profile::docker::flannel -> base::service_unit -> file

The regression happened in ops/puppet commit a504c49. In particular:

diff --git a/modules/profile/manifests/docker/flannel.pp b/modules/profile/manifests/docker/flannel.pp
index 4485001..89510b1 100644
--- a/modules/profile/manifests/docker/flannel.pp
+++ b/modules/profile/manifests/docker/flannel.pp
@@ -3,13 +3,12 @@ class profile::docker::flannel(
     # to the version in use.
     $docker_version = hiera('profile::flannel::docker_version'),
 ) {
+    # TODO: convert to systemd::service
     base::service_unit { 'docker':
         ensure           => present,
-        systemd          => true,
-        systemd_override => true,
+        systemd_override => init_template("docker/flannel/docker_${docker_version}", 'systemd_override'),
         # Restarts must always be manual, since restart
         # destroy all running containers. Fuck you, Docker.
         refresh          => false,
-        template_name    => "docker/flannel/docker_${docker_version}",
     }
 }

The systemd parameter is no longer set, and now base::service_unit calls pick_initscript with has_systemd set to false:

$initscript = pick_initscript(
    $name, $::initsystem, !empty($systemd), !empty($systemd_override), !empty($upstart),
    !empty($sysvinit), $strict)

pick_initscript then figures that 'we don't have custom scripts' and 'we use the system defaults':

has_custom = (has_systemd || has_upstart || has_sysvinit)
# if we don't have custom scripts, we use the system defaults
return false unless has_custom

The entire logic in base::service_unit is then skipped.
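The decision above can be modeled in a few lines of Python (the real pick_initscript is a Ruby Puppet function; this is a simplified sketch of just the branch in question, not its full signature):

```python
# Simplified model of pick_initscript's "custom scripts" check.
def pick_initscript(has_systemd, has_systemd_override, has_upstart, has_sysvinit):
    # Note: has_systemd_override is NOT consulted here -- which is exactly
    # what bites us: an override alone does not count as a custom script.
    has_custom = has_systemd or has_upstart or has_sysvinit
    if not has_custom:
        return False  # "use the system defaults": the override is never deployed
    return "systemd"

# Before a504c49: systemd => true, so the override gets deployed.
print(pick_initscript(True, True, False, False))   # "systemd"
# After a504c49: only systemd_override is set, so everything is skipped.
print(pick_initscript(False, True, False, False))  # False
```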

Given that, prior to the patch above, some services had systemd set to false but systemd_override set to true, I'm not sure that having pick_initscript count has_systemd_override toward has_custom wouldn't cause trouble. Any advice, @Joe @akosiaris?

The way pick_initscript is coded, it seems quite clear we need to set systemd => true in base::service_unit { 'docker': }. I guess this was just an error on my part when I made that huge conversion back in the day. Sorry for the inconvenience; definitely add systemd => true back.

This happened because that specific class is not applied anywhere in production, so I couldn't catch the error with a full production catalog compilation.


Now that, after the patch, the systemd parameter accepts a string (the contents of the systemd unit file) rather than a boolean, what should it be set to?

An empty string should do the trick; or (better) you could convert that whole thing to use systemd::service instead, as proposed in the TODO.

FWIW, I think this is a bug in pick_initscript, but I intend to deprecate/retire base::service_unit as soon as trusty is gone in production. It was always intended as a bridge solution while we moved to greener pastures, as it had to accommodate various different scenarios. I don't think we should invest time in fixing it at the moment.

Change 433101 had a related patch set uploaded (by Zhuyifei1999; owner: Zhuyifei1999):
[operations/puppet@production] profile::docker::flannel: Use systemd::service

https://gerrit.wikimedia.org/r/433101

Mentioned in SAL (#wikimedia-cloud) [2018-05-15T07:26:33Z] <zhuyifei1999_> applied rOPUP532423612fcd via toolsbeta-puppetmaster-01 T190893

The docker systemd unit now lgtm:

zhuyifei1999@toolsbeta-worker-1001:~$ sudo systemctl cat docker
# /lib/systemd/system/docker.service
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network.target docker.socket
Requires=docker.socket

[Service]
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/dockerd -H fd://
ExecReload=/bin/kill -s HUP $MAINPID
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
# Uncomment TasksMax if your systemd version supports it.
# Only systemd 226 and above support this version.
#TasksMax=infinity
TimeoutStartSec=0
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
# kill only the docker process, not all processes in the cgroup
KillMode=process

[Install]
WantedBy=multi-user.target

# /etc/systemd/system/docker.service.d/puppet-override.conf
# Docker override systemd for v1.11.2-0~jessie
[Unit]
After=network.target docker.socket flannel.service
Requires=docker.socket flannel.service

[Service]
EnvironmentFile=/run/flannel/subnet.env
# We need to clear ExecStart first before setting it again
ExecStart=
ExecStart=/usr/bin/docker daemon -H fd:// \
            --config-file=/etc/docker/daemon.json \
            --bip=${FLANNEL_SUBNET} \
            --mtu=${FLANNEL_MTU}

Let's see if it works

Uh no... it hates the certs from standalone puppetmaster:

May 15 07:36:52 toolsbeta-worker-1001 kube-proxy[27594]: E0515 07:36:52.476971   27594 reflector.go:203] pkg/proxy/config/api.go:33: Failed to list *api.Endpoints: Get https://toolsbeta-k8s-master-01.toolsbeta.eqiad.wmflabs:6443/api/v1/endpoints?resourceVersion=0: x509: certificate signed by unknown authority


Fixed by restarting kube-proxy across affected instances.

After restarting many services across various instances (mostly to update the certs), networking is now working inside k8s pods:

toolsbeta.test@interactive:~$ curl www.google.com
<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><head><meta content="Search the world's information, including webpages, images, videos and more. Google has many special features to help you find exactly what you're looking for." name="description">[...]
zhuyifei1999 closed this task as Resolved. Edited May 15 2018, 8:41 AM

On k8s (once the above patch is merged, we shouldn't need a standalone puppetmaster for newly built instances):
https://tools-beta.wmflabs.org/
https://tools-beta.wmflabs.org/test/hello.txt

zhuyifei1999@toolsbeta-k8s-master-01:~$ kubectl get pods --all-namespaces
NAMESPACE   NAME                     READY     STATUS    RESTARTS   AGE
admin       admin-1850377006-8fwn2   1/1       Running   0          2m
test        test-1089897043-20n1d    1/1       Running   0          2m
zhuyifei1999@toolsbeta-k8s-master-01:~$ kubectl get deployments --all-namespaces
NAMESPACE   NAME      DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
admin       admin     1         1         1            1           9m
test        test      1         1         1            1           10m
zhuyifei1999@toolsbeta-k8s-master-01:~$ kubectl get services --all-namespaces
NAMESPACE   NAME         CLUSTER-IP        EXTERNAL-IP   PORT(S)    AGE
admin       admin        192.168.17.205    <none>        8000/TCP   9m
default     kubernetes   192.168.0.1       <none>        443/TCP    7d
test        test         192.168.124.202   <none>        8000/TCP   10m
zhuyifei1999@toolsbeta-k8s-master-01:~$ kubectl get nodes
NAME                                            STATUS    AGE       VERSION
toolsbeta-worker-1001.toolsbeta.eqiad.wmflabs   Ready     7d        v1.4.6+e569a27

On grid: (only configured to have one slot per queue though)
https://tools-beta.wmflabs.org/test2/hello.txt

zhuyifei1999@toolsbeta-grid-master:~$ qstat -f
queuename                      qtype resv/used/tot. load_avg arch          states
---------------------------------------------------------------------------------
continuous@toolsbeta-exec-1401 BIP   0/0/1          0.01     lx26-amd64    
---------------------------------------------------------------------------------
task@toolsbeta-exec-1401.tools BIP   0/0/1          0.01     lx26-amd64    
---------------------------------------------------------------------------------
webgrid-generic@toolsbeta-webg BIP   0/0/1          0.01     lx26-amd64    
---------------------------------------------------------------------------------
webgrid-lighttpd@toolsbeta-web BIP   0/1/1          0.01     lx26-amd64    
zhuyifei1999@toolsbeta-grid-master:~$ qhost -j -h toolsbeta-webgrid-lighttpd-1401
HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
-------------------------------------------------------------------------------
global                  -               -     -       -       -       -       -
toolsbeta-webgrid-lighttpd-1401.toolsbeta.eqiad.wmflabs lx26-amd64      2  0.01    3.9G  326.1M  488.0M     0.0
   job-ID  prior   name       user         state submit/start at     queue      master ja-task-ID 
   ----------------------------------------------------------------------------------------------
        12 0.50000 lighttpd-t toolsbeta.te r     05/09/2018 15:06:52 webgrid-li MASTER

Hooray!

Change 433101 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] profile::docker::flannel: Use systemd::service

https://gerrit.wikimedia.org/r/433101