
Provision CI:admins ssh public key in Nodepool instances
Closed, Declined (Public)

Description

@mobrovac was wondering how one can SSH into a Nodepool instance for debugging purposes. Since the instances are not hooked up to LDAP, the only way right now is to:

  • head to labnodepool1001.eqiad.wmnet
  • become-nodepool
  • ssh jenkins@<instance ip>

In T128175#2066977, @JanZerebecki proposed copying the keys from LDAP into place at instance creation or image build time.
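As a rough illustration of that proposal, here is a minimal Python sketch that turns a set of public keys into `authorized_keys` content that could be written at image build time. It assumes the keys have already been fetched from LDAP by some other means; the function name, the input mapping, and the list of accepted key types are all hypothetical and not taken from the Nodepool setup.

```python
# Key types accepted in this sketch; an assumption, not a policy from this task.
ALLOWED_TYPES = {"ssh-ed25519", "ssh-rsa", "ecdsa-sha2-nistp256"}


def render_authorized_keys(keys_by_user):
    """Return authorized_keys content for a mapping of user -> list of public keys.

    The mapping is assumed to come from an LDAP lookup done elsewhere.
    """
    lines = []
    seen = set()
    for user in sorted(keys_by_user):
        for key in keys_by_user[user]:
            parts = key.split()
            # Skip malformed entries and key types we do not expect.
            if len(parts) < 2 or parts[0] not in ALLOWED_TYPES:
                continue
            entry = f"{parts[0]} {parts[1]}"
            # Skip keys that were already emitted for an earlier entry.
            if entry in seen:
                continue
            seen.add(entry)
            # Use the user name as the key comment so entries stay attributable.
            lines.append(f"{entry} {user}")
    return "\n".join(lines) + ("\n" if lines else "")
```

The result would be written to the `authorized_keys` file of whichever account is meant for debugging. Which users' keys to include is exactly the access question raised in the note below.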

Note: In principle these nodes can be unprivileged. In practice when they run gate-and-submit (pre merge) or any post merge action they might have equivalent access to directly push to the repo, make and publish releases and other trusted build artifacts. So it would be nice to be able to differentiate these actions or their simulation when considering which keys get access.

Event Timeline

When I originally provisioned the images, I tried to reuse the Puppet class that installs the LDAP-backed authentication layer. But at some point that causes all authentication to be refused, since it tries to reach the labs LDAP (which is obviously not reachable when building the image outside of labs).

I think the LDAP access also needs a password, and I am not that comfortable with having it in the CI instances. But maybe anonymous auth is possible.

Currently, the image uses cloud-init to inject an SSH key pair for the debian user, which has sudo privileges. That lets Nodepool interact with the image to complete it and snapshot it as a base image to boot instances from.

Later, when an instance boots, Nodepool uses the jenkins user, which is provisioned when the image is created with an authorized_keys entry matching the SSH private key of nodepool@labnodepool1001.eqiad.wmnet.


> Note: In principle these nodes can be unprivileged. In practice when they run gate-and-submit (pre merge) or any post merge action they might have equivalent access to directly push to the repo, make and publish releases and other trusted build artifacts. So it would be nice to be able to differentiate these actions or their simulation when considering which keys get access.

The credentials are held in the Jenkins credential store and are injected for jobs that require them. Sometimes we even get an ssh-agent automatically started and holding the credentials.

> I think the LDAP access also needs a password, and I am not that comfortable with having it in the CI instances. But maybe anonymous auth is possible.

The LDAP bindpw for the proxyagent user? It's public and shared with all labs instances, so...

> The LDAP bindpw for the proxyagent user? It's public and shared with all labs instances, so...

My understanding was that it is more or less private to labs. If we were to inject the LDAP password into the Nodepool image, we would need to make it publicly available.

There is a very annoying way to get root access on an instance. It requires generating an SSH key pair and spawning an instance with that key pair attached to it. cloud-init then injects the credential on boot, which grants root access.
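A sketch of what those steps might look like, assuming the unified `openstack` command-line client is usable against the Nodepool tenant (the task does not say which client is actually used). The key name, key path, image, flavor and server name are hypothetical placeholders. The helper only builds the command lines, it does not run them.

```python
def root_access_commands(key_name, key_path, image, flavor, server_name):
    """Return argv lists for: generate a key pair, register it, boot with it.

    All arguments are placeholders supplied by the caller; nothing here is
    taken from the real Nodepool configuration.
    """
    return [
        # 1. Generate a throwaway key pair without a passphrase.
        ["ssh-keygen", "-t", "rsa", "-b", "4096", "-N", "", "-f", key_path],
        # 2. Register the public half with OpenStack under key_name.
        ["openstack", "keypair", "create",
         "--public-key", f"{key_path}.pub", key_name],
        # 3. Boot an instance with that key pair attached, so that
        #    cloud-init injects it on first boot.
        ["openstack", "server", "create",
         "--image", image, "--flavor", flavor,
         "--key-name", key_name, server_name],
    ]


if __name__ == "__main__":
    for argv in root_access_commands(
        "debug-key", "/tmp/debug-key", "some-image", "some-flavor", "debug-1"
    ):
        print(" ".join(argv))
```

Once the instance is up, one would connect with `ssh -i` and the generated private key. The wikitech page linked below remains the reference for the actual procedure.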

I have added the documentation at https://wikitech.wikimedia.org/wiki/Nodepool#Connect_to_an_instance. That still does not solve this task though.

> The LDAP bindpw for the proxyagent user? It's public and shared with all labs instances, so...

> My understanding was that it is more or less private to labs. If we were to inject the LDAP password into the Nodepool image, we would need to make it publicly available.

Shared with all labs instances = public
https://github.com/wikimedia/labs-private/blob/HEAD/hieradata/eqiad.yaml#L12

We could have configured the labs LDAP in Nodepool instances. However, Nodepool is now legacy and we are moving to Docker containers, which make it trivial to reproduce a build on a local machine.