
legoktm can't deploy docker images on contint1001
Closed, Resolved (Public)

Description

I can't deploy docker images on contint1001 with docker-pkg. I first tried using the fab deploy_docker command, which tries to run it with sudo, except I don't have the right permissions for that:

legoktm@contint1001:~$ sudo /srv/deployment/docker-pkg/venv/bin/docker-pkg -c /etc/docker-pkg/integration.yaml /etc/zuul/wikimedia/dockerfiles

We trust you have received the usual lecture from the local System
Administrator. It usually boils down to these three things:

    #1) Respect the privacy of others.
    #2) Think before you type.
    #3) With great power comes great responsibility.

[sudo] password for legoktm:

Then I tried sudoing as the zuul user:

legoktm@contint1001:/tmp$ sudo -u zuul /srv/deployment/docker-pkg/venv/bin/docker-pkg -c /etc/docker-pkg/integration.yaml /etc/zuul/wikimedia/dockerfiles
Traceback (most recent call last):
  File "/srv/deployment/docker-pkg/venv/bin/docker-pkg", line 11, in <module>
    sys.exit(main())
  File "/srv/deployment/docker-pkg/venv/lib/python3.4/site-packages/docker_pkg/cli.py", line 51, in main
    config = read_config(args.configfile)
  File "/srv/deployment/docker-pkg/venv/lib/python3.4/site-packages/docker_pkg/cli.py", line 30, in read_config
    with open(configfile, 'rb') as fh:
PermissionError: [Errno 13] Permission denied: '/etc/docker-pkg/integration.yaml'

That file is only readable by contint-admins, ok... so I just tried running it as my own user:

legoktm@contint1001:~$ /srv/deployment/docker-pkg/venv/bin/docker-pkg -c /etc/docker-pkg/integration.yaml /etc/zuul/wikimedia/dockerfiles
Traceback (most recent call last):
  File "/srv/deployment/docker-pkg/venv/lib/python3.4/site-packages/urllib3/connectionpool.py", line 601, in urlopen
    chunked=chunked)
  File "/srv/deployment/docker-pkg/venv/lib/python3.4/site-packages/urllib3/connectionpool.py", line 357, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/lib/python3.4/http/client.py", line 1090, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python3.4/http/client.py", line 1128, in _send_request
    self.endheaders(body)
  File "/usr/lib/python3.4/http/client.py", line 1086, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python3.4/http/client.py", line 924, in _send_output
    self.send(msg)
  File "/usr/lib/python3.4/http/client.py", line 859, in send
    self.connect()
  File "/srv/deployment/docker-pkg/venv/lib/python3.4/site-packages/docker/transport/unixconn.py", line 33, in connect
    sock.connect(self.unix_socket)
PermissionError: [Errno 13] Permission denied

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/srv/deployment/docker-pkg/venv/lib/python3.4/site-packages/requests/adapters.py", line 440, in send
    timeout=timeout
  File "/srv/deployment/docker-pkg/venv/lib/python3.4/site-packages/urllib3/connectionpool.py", line 639, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/srv/deployment/docker-pkg/venv/lib/python3.4/site-packages/urllib3/util/retry.py", line 357, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/srv/deployment/docker-pkg/venv/lib/python3.4/site-packages/urllib3/packages/six.py", line 685, in reraise
    raise value.with_traceback(tb)
  File "/srv/deployment/docker-pkg/venv/lib/python3.4/site-packages/urllib3/connectionpool.py", line 601, in urlopen
    chunked=chunked)
  File "/srv/deployment/docker-pkg/venv/lib/python3.4/site-packages/urllib3/connectionpool.py", line 357, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/lib/python3.4/http/client.py", line 1090, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python3.4/http/client.py", line 1128, in _send_request
    self.endheaders(body)
  File "/usr/lib/python3.4/http/client.py", line 1086, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python3.4/http/client.py", line 924, in _send_output
    self.send(msg)
  File "/usr/lib/python3.4/http/client.py", line 859, in send
    self.connect()
  File "/srv/deployment/docker-pkg/venv/lib/python3.4/site-packages/docker/transport/unixconn.py", line 33, in connect
    sock.connect(self.unix_socket)
urllib3.exceptions.ProtocolError: ('Connection aborted.', PermissionError(13, 'Permission denied'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/srv/deployment/docker-pkg/venv/lib/python3.4/site-packages/docker/api/client.py", line 166, in _retrieve_server_version
    return self.version(api_version=False)["ApiVersion"]
  File "/srv/deployment/docker-pkg/venv/lib/python3.4/site-packages/docker/api/daemon.py", line 177, in version
    return self._result(self._get(url), json=True)
  File "/srv/deployment/docker-pkg/venv/lib/python3.4/site-packages/docker/utils/decorators.py", line 46, in inner
    return f(self, *args, **kwargs)
  File "/srv/deployment/docker-pkg/venv/lib/python3.4/site-packages/docker/api/client.py", line 189, in _get
    return self.get(url, **self._set_request_timeout(kwargs))
  File "/srv/deployment/docker-pkg/venv/lib/python3.4/site-packages/requests/sessions.py", line 521, in get
    return self.request('GET', url, **kwargs)
  File "/srv/deployment/docker-pkg/venv/lib/python3.4/site-packages/requests/sessions.py", line 508, in request
    resp = self.send(prep, **send_kwargs)
  File "/srv/deployment/docker-pkg/venv/lib/python3.4/site-packages/requests/sessions.py", line 618, in send
    r = adapter.send(request, **kwargs)
  File "/srv/deployment/docker-pkg/venv/lib/python3.4/site-packages/requests/adapters.py", line 490, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', PermissionError(13, 'Permission denied'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/srv/deployment/docker-pkg/venv/bin/docker-pkg", line 11, in <module>
    sys.exit(main())
  File "/srv/deployment/docker-pkg/venv/lib/python3.4/site-packages/docker_pkg/cli.py", line 52, in main
    build = builder.DockerBuilder(args.directory, config)
  File "/srv/deployment/docker-pkg/venv/lib/python3.4/site-packages/docker_pkg/builder.py", line 103, in __init__
    self.client = docker.from_env(version='auto', timeout=600)
  File "/srv/deployment/docker-pkg/venv/lib/python3.4/site-packages/docker/client.py", line 80, in from_env
    **kwargs_from_env(**kwargs))
  File "/srv/deployment/docker-pkg/venv/lib/python3.4/site-packages/docker/client.py", line 37, in __init__
    self.api = APIClient(*args, **kwargs)
  File "/srv/deployment/docker-pkg/venv/lib/python3.4/site-packages/docker/api/client.py", line 147, in __init__
    self._version = self._retrieve_server_version()
  File "/srv/deployment/docker-pkg/venv/lib/python3.4/site-packages/docker/api/client.py", line 174, in _retrieve_server_version
    'Error while fetching server API version: {0}'.format(e)
docker.errors.DockerException: Error while fetching server API version: ('Connection aborted.', PermissionError(13, 'Permission denied'))

So which permission am I missing? Or am I totally doing this wrong? :/
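
The last two attempts above fail at two different points: reading /etc/docker-pkg/integration.yaml, and connecting to the Docker daemon's Unix socket. A minimal Python sketch for checking both as the current user is below. The helper name check_access is hypothetical, and the socket path /var/run/docker.sock is an assumption (Docker's usual default); the task does not state which socket docker-pkg connected to.

```python
import os
import socket


def check_access(config_path, socket_path):
    """Report whether the current user can read a config file and
    connect to a Unix socket. Hypothetical helper, not part of docker-pkg."""
    report = {"config_readable": os.access(config_path, os.R_OK)}

    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.connect(socket_path)
        report["socket"] = "ok"
    except PermissionError:
        # Same errno 13 that surfaces in the traceback above.
        report["socket"] = "permission denied"
    except OSError:
        # Missing path, nothing listening, or not a socket at all.
        report["socket"] = "unavailable"
    finally:
        sock.close()
    return report


if __name__ == "__main__":
    # The config path comes from the task; the socket path is an assumption.
    print(check_access("/etc/docker-pkg/integration.yaml", "/var/run/docker.sock"))
```

If the config is readable but the socket reports "permission denied", the missing piece is most likely membership in whichever group owns the socket, not access to the config file.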

Event Timeline

Legoktm triaged this task as High priority. Feb 5 2018, 12:04 AM
Legoktm created this task.

I think we need to add other members of contint-admins to the contint-docker group to ensure that they are able to upload docker images created by docker-pkg on contint1001: https://github.com/wikimedia/puppet/blob/production/modules/admin/data/data.yaml#L657

This is related to: https://phabricator.wikimedia.org/T182860 -- adding @akosiaris for more context.

Change 408525 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] Add legoktm to contint-docker admins

https://gerrit.wikimedia.org/r/408525

Change 408525 merged by Alexandros Kosiaris:
[operations/puppet@production] Add legoktm to contint-docker admins

https://gerrit.wikimedia.org/r/408525

akosiaris claimed this task.

I think this is resolved now. Don't forget to log out and log back in to make sure you pick up the new group.
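
The reason for logging in again is that a process gets its supplementary groups at login, so a session opened before the change keeps the old list even after the group database is updated. A small sketch, assuming a Unix host, that compares the two views for the current user follows. The function names classify and group_status are hypothetical; contint-docker is the group named in this task.

```python
import grp
import os
import pwd


def classify(in_database, in_session):
    """Turn the two membership views into a short status string."""
    if not in_database:
        return "not a member"
    if not in_session:
        return "needs new login"
    return "active"


def group_status(group_name):
    """Compare the group database with the groups of the running process."""
    try:
        entry = grp.getgrnam(group_name)
    except KeyError:
        return "no such group"
    user = pwd.getpwuid(os.getuid())
    # Membership as recorded in the group database (what Puppet changed).
    in_database = user.pw_name in entry.gr_mem or user.pw_gid == entry.gr_gid
    # Membership as seen by this already-running process.
    in_session = entry.gr_gid in os.getgroups() or entry.gr_gid == os.getegid()
    return classify(in_database, in_session)


if __name__ == "__main__":
    print(group_status("contint-docker"))
```

A result of "needs new login" would mean the change has landed on the host but the current shell has not picked it up yet.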

@akosiaris thank you! Should we add the entire contint-admins team to contint-docker? I think people like @Addshore are also going to need to be able to deploy docker images for example.

> @akosiaris thank you! Should we add the entire contint-admins team to contint-docker? I think people like @Addshore are also going to need to be able to deploy docker images for example.

Indeed, I expect I am going to run into this exact issue sooner rather than later.

> @akosiaris thank you! Should we add the entire contint-admins team to contint-docker?

Let's not and err on the side of caution and on the principle of least privilege.

> I think people like @Addshore are also going to need to be able to deploy docker images for example.

I just noticed. Don't you mean build instead of deploy?

> Indeed, I expect I am going to run into this exact issue sooner rather than later.

Yes, this sounds fine.

>> I think people like @Addshore are also going to need to be able to deploy docker images for example.

> I just noticed. Don't you mean build instead of deploy?

I expect to be able to run docker-pkg which builds and pushes the images to docker-registry.wm.o ("deploy" in my mind).

>>> I think people like @Addshore are also going to need to be able to deploy docker images for example.

>> I just noticed. Don't you mean build instead of deploy?

> I expect to be able to run docker-pkg which builds and pushes the images to docker-registry.wm.o ("deploy" in my mind).

I stand corrected. I meant build AND push.

Technically it's not a deploy, though. A pushed image is in no way assumed to be executed, which AFAICT is what a "deploy" means.

Both @Legoktm and @Addshore already have the privileges to run privileged code on CI and can really run any random Docker image.

What they are missing, though, is the ability to get the image built on contint1001 and published to the Docker registry. For the rest of the chain (merging a Dockerfile template, changing CI to use a new image) they have all the privileges they need.

I am in favor of granting both of them the ability to build and push to the registry. I am assuming, though, that the Docker registry credentials are limited to the /releng/ registry namespace.

> Both @Legoktm and @Addshore already have the privileges to run privileged code on CI and can really run any random Docker image.

> What they are missing, though, is the ability to get the image built on contint1001 and published to the Docker registry. For the rest of the chain (merging a Dockerfile template, changing CI to use a new image) they have all the privileges they need.

Yes, fully agreed.

> I am in favor of granting both of them the ability to build and push to the registry.

That makes at least 2 of us.

> I am assuming, though, that the Docker registry credentials are limited to the /releng/ registry namespace.

That's an assumption we need to work a bit on so that it becomes reality as well.

Change 408823 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] Add addshore to contint-docker admins

https://gerrit.wikimedia.org/r/408823

Change 408823 merged by Alexandros Kosiaris:
[operations/puppet@production] Add addshore to contint-docker admins

https://gerrit.wikimedia.org/r/408823