
Replace the nslcd mount in containers from the old Toolforge cluster with something that will work with sssd in the new one
Closed, Resolved · Public

Description

Currently, nslcd sockets are mounted into Kubernetes containers in Toolforge to provide user and group lookups for NFS permissions and the like. That mechanism won't work under sssd, and it is somewhat fragile anyway (see T166949).

Find a way to replace either the mechanism or the need for it in the new cluster we are building.

As was discovered in T224558, the current Toolforge K8s nodes use LDAP plus some admission-controller restrictions to confine k8s users to their actual LDAP accounts. That isn't really workable with sssd, since it depends on mounting the nslcd socket. Since the new cluster will run Debian Buster, and we are trying not to use nscd/nslcd (and are using security contexts and PSPs instead of admission controllers), it may be possible to deliberately not use LDAP.


Event Timeline

Bstorm triaged this task as Medium priority. Jul 25 2019, 8:22 PM
Bstorm created this task.

This is partly to enable use of shared filesystems, which may just be a matter of specifying which user you connect to volumes as. Also, this could be useful (or maybe not): https://kubernetes.io/docs/tasks/configure-pod-container/security-context/

It is possible to access LDAP inside a container in general, but that sounds heavy as hell (likely where this will end up, though).

Interesting possibility: a numeric user ID doesn't have to exist in a container for it to be used. Docker will just accept it.

And from there, through PSPs we can force any pod to use a security context that sets the UID and GID inside the pod to the tool's actual ones. That might honestly take care of most of this, with a few env vars set. I'd rather remove parts of OSes from containers than add more of them.
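
A minimal sketch of what that could look like in a pod spec (the tool name, image tag, and UID/GID 52503 are invented for illustration):

apiVersion: v1
kind: Pod
metadata:
  name: example-tool-web
spec:
  securityContext:
    # Force the whole pod to run as the tool's LDAP UID/GID. The numeric
    # IDs don't need to exist inside the image for Docker to honor them.
    runAsUser: 52503
    runAsGroup: 52503
    fsGroup: 52503
  containers:
  - name: webservice
    image: docker-registry.tools.wmflabs.org/toollabs-python35-web:latest
    env:
    # With no passwd entry in the container, $HOME has to be set explicitly.
    - name: HOME
      value: /data/project/example-tool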

I can confirm that a persistent volume claim against an NFS server running outside of Kubernetes works great and does respect the runAs security contexts.

With volumes mounted this way, rebooting the NFS server didn't cause any noticeable failure in the pods. So that's good.

On the other hand, PHP functions that determine the current user will absolutely fail when using this method. It seems like other orgs deliberately avoid doing this kind of thing in Docker, precisely because of how Docker works.

> On the other hand, PHP functions that determine the current user will absolutely fail when using this method. It seems like other orgs deliberately avoid doing this kind of thing in Docker, precisely because of how Docker works.

This makes me think that we are going to need to put sssd into the containers. I can imagine lots of ways to avoid needing this in greenfield development, but in Toolforge I would guess that every webservice running inside Kubernetes today uses some variation of a getuid lookup to find its $HOME/replica.my.cnf file.

Yeah, I just figured out the general way to do it. It's just a different socket to mount into the containers. Hopefully it is the only host mount needed. Host mounts when not needed are the devil.
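
In pod terms, the swap is just pointing the hostPath at the sssd pipes directory instead of the nslcd socket. A rough sketch (pod and volume names invented; the image also needs the sssd client libraries installed, more on that below):

apiVersion: v1
kind: Pod
metadata:
  name: example-tool-web
spec:
  containers:
  - name: webservice
    image: docker-registry.tools.wmflabs.org/toollabs-python35-sssd-web:testing
    volumeMounts:
    # sssd's NSS sockets live under this directory on the host.
    - name: sssd-pipes
      mountPath: /var/lib/sss/pipes
  volumes:
  - name: sssd-pipes
    hostPath:
      path: /var/lib/sss/pipes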

> It's just a different socket to mount into the containers.

That doesn't give me great hope that this will also solve T166949: Homedir/UID info breaks after a while in Tools Kubernetes (can't read replica.my.cnf), but maybe it will. I guess that depends on what happens on an sssd process restart.

> Hopefully it is the only host mount needed. Host mounts when not needed are the devil.

There are several others today (as I'm sure you know):

Volumes:
  dumps:
    Type:       HostPath (bare host directory volume)
    Path:       /public/dumps/
  home:
    Type:       HostPath (bare host directory volume)
    Path:       /data/project/
  wmcs-project:
    Type:       HostPath (bare host directory volume)
    Path:       /etc/wmcs-project
  scratch:
    Type:       HostPath (bare host directory volume)
    Path:       /data/scratch/
  etcldap-conf-w4knf:
    Type:       HostPath (bare host directory volume)
    Path:       /etc/ldap.conf
  etcldap-yaml-tjaxc:
    Type:       HostPath (bare host directory volume)
    Path:       /etc/ldap.yaml
  etcnovaobserver-yaml-7fucj:
    Type:       HostPath (bare host directory volume)
    Path:       /etc/novaobserver.yaml
  varrunnslcdsocket-guyur:
    Type:       HostPath (bare host directory volume)
    Path:       /var/run/nslcd/socket

Those are all going away :)

After Kubernetes 1.10, the whole idea is to move to volume claim mounts. You make a ReadWriteMany persistent volume claim against NFS. The host doesn't even need to mount NFS.
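
A rough sketch of that pattern (the server address, export path, size, and names here are all invented):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: tools-project-home
spec:
  capacity:
    storage: 8Ti
  accessModes:
  - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    # Hypothetical NFS server; the kubelet mounts the export per-pod,
    # so the node needs no standing NFS mount of its own.
    server: nfs-tools-project.example.wmflabs
    path: /project/tools/home
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: tools-home
spec:
  accessModes:
  - ReadWriteMany
  # Empty storageClassName binds to a pre-provisioned PV like the one above.
  storageClassName: ""
  resources:
    requests:
      storage: 8Ti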

The novaobserver.yaml mount should be replaced with a service account, etc.

> The novaobserver.yaml mount should be replaced with a service account, etc.

/etc/novaobserver.yaml is there not to make anything in the container work per se. It's there to allow folks to write tools like openstack-browser (which ironically doesn't use that file yet). Similarly, /etc/wmcs-project and /etc/ldap.yaml are mounted from the host to expose data to webservice code that may or may not ever actually be used.

These might all be things that could somehow be replaced with ConfigMaps? Might need some symlinks baked into the base images to keep the same /etc/... paths working.

Ah ok! Yes, config maps that are mounted into the container would do it.

If we are clever, it might be possible to make them simulate the same paths.
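
Something like this could work, assuming (hypothetically) one ConfigMap per file, mounted with subPath so the expected /etc/... path keeps working without shadowing the rest of /etc:

apiVersion: v1
kind: ConfigMap
metadata:
  name: wmcs-project
data:
  # The file's only content is the project name; illustrative value.
  wmcs-project: |
    tools
---
apiVersion: v1
kind: Pod
metadata:
  name: example-tool-web
spec:
  containers:
  - name: webservice
    image: docker-registry.tools.wmflabs.org/toollabs-python35-web:latest
    volumeMounts:
    # subPath mounts just this one file at the expected path instead of
    # replacing the whole directory it lands in.
    - name: wmcs-project
      mountPath: /etc/wmcs-project
      subPath: wmcs-project
  volumes:
  - name: wmcs-project
    configMap:
      name: wmcs-project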

Huh, so one of the big issues is that /usr/bin/webservice-runner also basically expects to be running in a host-type environment. Overall, things look somewhat hastily ported from the grid to Kubernetes, which makes a lot of sense. I'm testing with our current images, and so far I think we'll need new images to work with sssd.

Yup. Confirmed that we need extra packages installed in the image to use sssd from inside the container. Needing different images to upgrade all this is an interesting new complication.

Change 527258 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/docker-images/toollabs-images@master] sssd: Add some new images to test sssd in containers

https://gerrit.wikimedia.org/r/527258

Change 528178 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/docker-images/toollabs-images@master] docker: add support for "stable" and "testing" tags in addition to latest

https://gerrit.wikimedia.org/r/528178

Change 527258 abandoned by Bstorm:
sssd: Add some new images to test sssd in containers

Reason:
I prefer the approach in I928ebe4fe4f8f0be5d85e95c56edbc99fe72058b

https://gerrit.wikimedia.org/r/527258

Change 528178 merged by jenkins-bot:
[operations/docker-images/toollabs-images@master] docker: add support for "testing" tags

https://gerrit.wikimedia.org/r/528178

So it appears the safest way to really test sssd is still to do something like https://gerrit.wikimedia.org/r/c/operations/docker-images/toollabs-images/+/527258, then build the image and tag it with testing. I can test things locally, but I suspect it is easier to do that and test on tools-worker-1029 (which is still in place as a jessie sssd test node). To test locally, I would have to make minikube work with sssd, which sounds like a lot of fussing.

Change 534704 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/docker-images/toollabs-images@master] sssd: Add some new images to test sssd in containers

https://gerrit.wikimedia.org/r/534704

Change 534846 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/docker-images/toollabs-images@master] tagging: Add the tag to the templates

https://gerrit.wikimedia.org/r/534846

Change 534846 merged by Bstorm:
[operations/docker-images/toollabs-images@master] tagging: Add the tag to the templates

https://gerrit.wikimedia.org/r/534846

Change 534704 merged by Bstorm:
[operations/docker-images/toollabs-images@master] sssd: Add some new images to test sssd in containers

https://gerrit.wikimedia.org/r/534704

The docker-registry.tools.wmflabs.org/toollabs-python35-sssd-web:testing image worked today in testing on tools-worker-1029.tools.eqiad.wmflabs (which is cordoned and runs sssd).

The test command I used to run the container was:

docker run -it --user=18713:500 --env HOME=/home/bstorm \
    -v /var/lib/sss/pipes/:/var/lib/sss/pipes:rw \
    -v /home/bstorm:/home/bstorm:rw \
    docker-registry.tools.wmflabs.org/toollabs-python35-sssd-web:testing /bin/bash

Tested getent lookups of tool groups and users, python3 os.getuid() and os.path.expanduser('~'), and basic id. The container is communicating with LDAP through sssd.

Change 536692 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/docker-images/toollabs-images@master] sssd: Add a whole duplicate hierarchy of sssd images

https://gerrit.wikimedia.org/r/536692

> The docker-registry.tools.wmflabs.org/toollabs-python35-sssd-web:testing image worked today in testing on tools-worker-1029.tools.eqiad.wmflabs (which is cordoned and runs sssd). [...]

Great work! Thanks @Bstorm!

Change 536692 merged by Bstorm:
[operations/docker-images/toollabs-images@master] sssd: Add a whole duplicate hierarchy of sssd images

https://gerrit.wikimedia.org/r/536692

Now there's the matter of webservice tooling for the new cluster.

I think this is done. We can re-open if we turn out to be wrong in the future.