
Setup NSS inside containers used in Tool Labs
Closed, Resolved · Public

Description

Containers need an NSS config that contacts the labs LDAP for user / group information. This is required because:

  1. Our cluster enforces that tools must run as the specific UID associated with their LDAP account. This both protects against issues when tools run as root inside containers and makes NFS permissions work correctly.
  2. There is no user entry for this UID / GID inside the container (/etc/passwd, /etc/group, etc.). This will cause programs that attempt to get the name of the current user (which is a lot of them) to crash.

Figure out the appropriate NSS configuration to use inside containers, as well as how best to refresh and redeploy it.

Options include:

  1. Bake it into the container. This is simplest, but rebuilding and redeploying can take a while when changes are needed.
  2. Write the config out with puppet on the k8s worker nodes, and mount it read-only into containers by default with an admission controller.
  3. Something else.

(1) might be the simplest / right thing to do, but it'll make our containers useless outside of the labs environment. (2) is a bit ugly but very effective, and decouples container building from our environment-specific stuff. (3) could be a ConfigMap or a similar alternative, but I am not sure those will work in a reasonably foolproof manner.
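
For reference, the container-side NSS configuration would look roughly like this (a sketch only; the exact lookup sources depend on which libnss module we end up using):

  # /etc/nsswitch.conf (sketch): resolve users and groups from local files
  # first, then from the labs LDAP via the ldap NSS module
  passwd:   files ldap
  group:    files ldap
  shadow:   files ldap
  hosts:    files dns
  netgroup: files ldap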

Event Timeline

This should also run without nscd / nslcd caching daemons.

I realized after reading through more documentation that I don't actually want PAM but NSS.

yuvipanda renamed this task from Setup PAM inside containers used in Tool Labs to Setup NSS inside containers used in Tool Labs.May 10 2016, 7:56 AM
yuvipanda updated the task description. (Show Details)

It's actually NSS, not PAM.

After some experimentation, libnss-ldapd, which is the recommended setup, works *almost flawlessly* out of the box, except that it requires nslcd to be running :( This is not ideal, since we'd like each container to have only one logical process. We'd have to have some form of init process in each container to manage nslcd, and resource accounting becomes problematic as well. I'd very much prefer not to do this.

The other option is libnss-ldap, which is older, buggier, and doesn't seem to support all the features of libnss-ldapd, but doesn't require a daemon. @MoritzMuehlenhoff explicitly recommended against it, so I'd prefer not to use it, but it does work without requiring a daemon in most cases (it doesn't seem to support multiple groups per user; maybe that's just a configuration issue?). We also use libnss-ldapd in labs already, so switching would be a confusing change-up.

Our options realistically are:

  1. Figure out a way to run libnss-ldapd in containers without requiring the nslcd daemon
  2. Run one nslcd per worker node via a Kubernetes DaemonSet (which were designed for use cases like this), and mount the socket into all containers with an admission controller. This should be ok if we are comfortable with the security implications.
  3. Something else not listed here, hackier or nicer.

Ideally we would do (1), but barring that, I'm leaning towards (2), mostly because I can't think of a (3).

If we go with (2), we'll have to somehow configure libnss-ldapd to not install nslcd itself, but just talk to the appropriate socket.

The data presented by nslcd is identical on all hosts, so exploring (2) seems best to me.
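
A rough sketch of what (2) could look like as a DaemonSet; the namespace, names, and image here are placeholders, not anything we actually have:

  # Sketch of option (2): one nslcd per worker node, sharing its socket
  # directory with the host so the admission controller can mount it into pods.
  apiVersion: extensions/v1beta1
  kind: DaemonSet
  metadata:
    name: nslcd
    namespace: kube-system
  spec:
    template:
      metadata:
        labels:
          app: nslcd
      spec:
        containers:
        - name: nslcd
          image: docker-registry.tools.wmflabs.org/nslcd:latest  # placeholder image
          volumeMounts:
          - name: nslcd-socket
            mountPath: /var/run/nslcd   # nslcd creates its socket in this directory
        volumes:
        - name: nslcd-socket
          hostPath:
            path: /var/run/nslcd        # shared with the host / other pods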

After more discussion with @MoritzMuehlenhoff, the options for pursuing (2) are:

  1. Patch the libnss-ldapd source package to build a libnss-ldapd-plain binary package, and remember to keep forward-porting this to all new releases. This can just go into one of our deb repos.
  2. Make a custom local build of libnss-ldapd without the nslcd dependency, and just dpkg -i it in the container. This is simpler than (1) but still complex.
  3. Just install libnss-ldapd as-is in all containers. nslcd will be installed but won't start anyway. We'll pay the disk cost of the nslcd binary, but that probably isn't much.

(3) is the one with the least amount of custom long-term sustaining effort needed from us, so I think we should try to do that :D
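
A sketch of what (3) could look like in the image build, assuming a Debian jessie base and the standard policy-rc.d trick to suppress daemon autostart (the real base container is in the Gerrit change below, and may differ):

  # Dockerfile sketch for option (3); the sed rule and layout are illustrative.
  FROM debian:jessie

  # Prevent maintainer scripts from starting any daemon (including nslcd)
  RUN printf '#!/bin/sh\nexit 101\n' > /usr/sbin/policy-rc.d && \
      chmod +x /usr/sbin/policy-rc.d

  # libnss-ldapd pulls in nslcd; it stays installed but never runs in the container
  RUN apt-get update && \
      DEBIAN_FRONTEND=noninteractive apt-get install -y libnss-ldapd && \
      rm -rf /var/lib/apt/lists/*

  # Point user/group lookups at LDAP in addition to the local files
  RUN sed -i 's/^\(passwd\|group\|shadow\):.*/& ldap/' /etc/nsswitch.conf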

Change 288464 had a related patch set uploaded (by Yuvipanda):
Add toollabs base container

https://gerrit.wikimedia.org/r/288464

Change 288464 merged by Yuvipanda:
Add toollabs base container

https://gerrit.wikimedia.org/r/288464

Change 288761 had a related patch set uploaded (by Yuvipanda):
tools: Enable host automounts

https://gerrit.wikimedia.org/r/288761

Change 288762 had a related patch set uploaded (by Yuvipanda):
k8s: Actually enable host automounter

https://gerrit.wikimedia.org/r/288762

Change 288762 abandoned by Yuvipanda:
k8s: Actually enable host automounter

https://gerrit.wikimedia.org/r/288762

We have a fairly decent solution for this now. We've set up libnss-ldapd, and nslcd won't start by default because we've suppressed autostart of packages anyway. We've written a k8s admission controller that lets us automatically mount specific paths from the host into all containers, and configured it to automount /var/run/nslcd/socket. This now works for all containers built off of docker-registry.tools.wmflabs.org/jessie-toollabs.
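
For the record, the effect of the automount on a pod is roughly equivalent to the following being injected into its spec (a sketch of the injected fields with a placeholder container name, not the admission controller's literal output):

  # Approximate effect of the automount admission controller on every pod:
  # the host's nslcd socket is bind-mounted in, so libnss-ldapd inside the
  # container can talk to the nslcd running on the worker node.
  spec:
    containers:
    - name: tool                         # placeholder
      volumeMounts:
      - name: nslcd-socket
        mountPath: /var/run/nslcd/socket
    volumes:
    - name: nslcd-socket
      hostPath:
        path: /var/run/nslcd/socket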

Need to figure out if we need nscd.

We do need nscd, otherwise it is too slow :(

Change 288761 merged by Yuvipanda:
tools: Enable host automounts

https://gerrit.wikimedia.org/r/288761

It seems fast enough without nscd caching. We can reopen if we get actual complaints about it being slow.