Decide how we're going to handle certificates for the puppetmaster migration
Closed, Resolved (Public)

Description

I imagine we'd need to issue every instance being moved a new puppet cert, as we presumably wouldn't want to hand the current labs puppetmaster CA over to the new instance? That might be fairly easy due to autosigning though.

I think we can just have a fresh puppet CA and issue new certs everywhere, possibly using a bit of cumin magic, unless anyone has any objections, but I would like to check that people are fine with this.

Event Timeline

A new CA and certs seems fine to me. The only place that I remember us having puppet cert related issues in the past is Toolforge which I think uses the puppet certs for some non-puppet authn/z controls. Toolforge instances are attached to their own puppetmaster, so that won't be an issue here.

I don't *think* these are reused in other tools, off the top of my head, the way the ones for the tools puppetmaster are. etcd around here uses puppet certs and freaks out when you change them, for instance. Where I see the problem is that the puppetmaster in tools is connected to labs-puppetmaster. Therefore, its own certs become invalid if you replace the CA.

Yeah hopefully anything doing that is using a project puppetmaster.

The tools "standalone" puppetmaster is not actually standalone. It is itself puppetized from the upstream master, while all other clients in tools connect to it instead.

Where I see the problem is that the puppetmaster in tools is connected to labs-puppetmaster. Therefore, its own certs become invalid if you replace the CA.

Wait, what? Why is it not set up to serve itself? Still, I'm actually not sure changing the central CA would affect tools' own CA?

toolsbeta and tools do this. They depend on some secrets on labs-puppetmaster, possibly. I'm not sure exactly why.

I'd prefer copying over the certs if possible to save cleanup work and similar. Kubernetes also uses puppet certs. Since there's a "general-k8s" project, they may also use them (I haven't checked).

toolsbeta and tools do this. They depend on some secrets on labs-puppetmaster, possibly. I'm not sure exactly why.

I'm not sure any secrets on labs-puppetmaster would be actually secret, labs central puppetmaster autosigns certificates. Please send me the details of anything cherry-picked (or otherwise added) on top of the usual repos on labs-puppetmaster privately.

I don't see any local commits on a quick check. As I said, I'm not sure exactly why they are built that way. However, they are.

I'd prefer copying over the certs if possible to save cleanup work and similar. Kubernetes also uses puppet certs. Since there's a "general-k8s" project, they may also use them (I haven't checked).

Okay. Someone would have to sign off on exporting the CA private key (which is presumably treated like a prod secret right now) into labs.
general-k8s project appears to have 0 instances.

general-k8s project appears to have 0 instances.

That helps :) I can't guarantee nobody else is doing something with puppet-as-CA, but it is good to hear that the one non-tools example isn't being used.

I support a new CA if we can somehow preserve the CA of the client puppetmasters, to be clear. I just don't immediately know how doable that is.
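
To see which CA a given certificate actually depends on, one option is to verify it against each candidate CA certificate. This is only a minimal sketch: it assumes the `openssl` CLI is installed and that both files are PEM encoded. The function name `chains_to_ca` and the file names in the usage note are illustrative and do not come from this task.

```shell
# Succeed if CERT_FILE verifies against the single CA certificate in CA_FILE.
# Usage: chains_to_ca CA_FILE CERT_FILE
# Assumption: the openssl CLI is on PATH and both files are PEM encoded.
chains_to_ca() {
    # "openssl verify" prints "<file>: OK" on success; match on that output
    # instead of relying only on the exit status.
    openssl verify -CAfile "$1" "$2" 2>/dev/null | grep -q ': OK$'
}
```

For example, `chains_to_ca old-ca.pem client.pem && echo "signed by the old CA"` (both file names hypothetical). If a project puppetmaster runs its own independent CA, its clients' certs would be expected to fail this check against the central CA, in which case replacing the central CA should not matter to them.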

It's a good point, I think I'm going to have to test that. I could put in testlabs:

  • Central puppetmaster A, probably just set up as a project puppetmaster for the purposes of this test. Puppetmaster: self
  • Project puppetmaster B. Puppetmaster: Default central puppetmaster
  • Normal instance C. Puppetmaster: B

Then, when everything is set up, set B to use A instead of the current default central puppetmaster and see what happens?

This might be easier to draw on a diagram or something.

So, initial setup:

image.png (387×905 px, 37 KB)

After:

image.png (382×901 px, 37 KB)

And we need to check specifically that the C->B interaction still works as one would expect.

Now that I've set up 'New central puppetmaster A', I should add that it has a separate puppetmaster that provides it with certain secrets, e.g. encapi DB credentials and some private keys for the 'puppet' and 'puppetmaster.cloudinfra.wmflabs.org' names. So I should not have drawn the loop above that entry on the diagram.
This is equivalent to how the existing central puppetmaster works, as that gets those secrets from the prod puppetmaster, also not pictured.

Anyway, I've run the test using krenair-t219424-b.testlabs.eqiad.wmflabs and krenair-t219424-c.testlabs.eqiad.wmflabs (using the new central puppetmaster I've set up for the parent ticket) and C seems entirely unaffected by B's own puppetmaster changing. They're both running puppet quite happily.

root@krenair-t219424-b:~#  cd /var/lib/puppet; mv ssl ssl.$(date '+%Y-%m-%dT%H:%M'); curl https://phab.wmfusercontent.org/file/data/sp3m7a6mjr53xfwlidz7/PHID-FILE-s4vhserqjh34z764hk6s/raw.txt -o /usr/local/share/ca-certificates/Puppet_Internal_CA.crt -s; update-ca-certificates --fresh; puppet agent -tv 
Clearing symlinks in /etc/ssl/certs...
done.
Updating certificates in /etc/ssl/certs...
157 added, 0 removed; done.
Running hooks in /etc/ca-certificates/update.d...
done.
Info: Creating a new SSL key for krenair-t219424-b.testlabs.eqiad.wmflabs
Info: Caching certificate for ca
Info: csr_attributes file loading from /etc/puppet/csr_attributes.yaml
Info: Creating a new SSL certificate request for krenair-t219424-b.testlabs.eqiad.wmflabs
Info: Certificate Request fingerprint (SHA256): BF:89:F3:9B:AA:51:A4:5E:02:1A:D5:B7:87:35:3B:A9:2E:FF:06:57:BC:49:61:05:FE:17:FE:1A:53:09:E3:63
Info: Caching certificate for krenair-t219424-b.testlabs.eqiad.wmflabs
Info: Caching certificate_revocation_list for ca
Info: Caching certificate for ca
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Info: Caching catalog for krenair-t219424-b.testlabs.eqiad.wmflabs
Notice: /Stage[main]/Base::Environment/Tidy[/var/tmp/core]: Tidying 0 files
Info: Applying configuration version '1554662935'
Info: Computing checksum on file /etc/ssh/userkeys/gitpuppet
Info: /Stage[main]/Ssh::Server/File[/etc/ssh/userkeys/gitpuppet]: Filebucketed /etc/ssh/userkeys/gitpuppet to puppet with sum 53b4cfbe73412a1283b34276c60c2fea
Notice: /Stage[main]/Ssh::Server/File[/etc/ssh/userkeys/gitpuppet]/ensure: removed
Info: Computing checksum on file /usr/local/lib/nagios/plugins/check_puppet-needs-merge
Info: /Stage[main]/Nrpe/File[/usr/local/lib/nagios/plugins/check_puppet-needs-merge]: Filebucketed /usr/local/lib/nagios/plugins/check_puppet-needs-merge to puppet with sum 979955f544ce49c3f4a830f967ac991d
Notice: /Stage[main]/Nrpe/File[/usr/local/lib/nagios/plugins/check_puppet-needs-merge]/ensure: removed
Info: Computing checksum on file /etc/diamond/collectors/CherryPickCounterCollector.conf
Info: /Stage[main]/Diamond/File[/etc/diamond/collectors/CherryPickCounterCollector.conf]: Filebucketed /etc/diamond/collectors/CherryPickCounterCollector.conf to puppet with sum 3f34c7d1c551057a7362e91835e75808
Notice: /Stage[main]/Diamond/File[/etc/diamond/collectors/CherryPickCounterCollector.conf]/ensure: removed
Notice: openstack::clientpackages::mitaka::stretch: no special configuration yet
Notice: /Stage[main]/Openstack::Clientpackages::Mitaka::Stretch/Notify[openstack::clientpackages::mitaka::stretch: no special configuration yet]/message: defined 'message' as 'openstack::clientpackages::mitaka::stretch: no special configuration yet'
Notice: The LDAP client stack for this host is: classic
Notice: /Stage[main]/Profile::Ldap::Client::Labs/Notify[LDAP client stack]/message: defined 'message' as 'The LDAP client stack for this host is: classic'
Notice: Applied catalog in 5.01 seconds
root@krenair-t219424-b:/var/lib/puppet# grep server /etc/puppet/puppet.conf
server = puppetmaster.cloudinfra.wmflabs.org
ssldir = /var/lib/puppet/server/ssl/
hostcert = /var/lib/puppet/server/ssl/certs/krenair-t219424-b.testlabs.eqiad.wmflabs.pem
hostprivkey = /var/lib/puppet/server/ssl/private_keys/krenair-t219424-b.testlabs.eqiad.wmflabs.pem
root@krenair-t219424-b:/var/lib/puppet#
krenair@krenair-t219424-c:~$ sudo puppet agent -tv
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Info: Caching catalog for krenair-t219424-c.testlabs.eqiad.wmflabs
Notice: /Stage[main]/Base::Environment/Tidy[/var/tmp/core]: Tidying 0 files
Info: Applying configuration version '1554662942'
Notice: openstack::clientpackages::mitaka::stretch: no special configuration yet
Notice: /Stage[main]/Openstack::Clientpackages::Mitaka::Stretch/Notify[openstack::clientpackages::mitaka::stretch: no special configuration yet]/message: defined 'message' as 'openstack::clientpackages::mitaka::stretch: no special configuration yet'
Notice: The LDAP client stack for this host is: classic
Notice: /Stage[main]/Profile::Ldap::Client::Labs/Notify[LDAP client stack]/message: defined 'message' as 'The LDAP client stack for this host is: classic'
Notice: Applied catalog in 3.77 seconds
krenair@krenair-t219424-c:~$ grep server /etc/puppet/puppet.conf
server = krenair-t219424-b.testlabs.eqiad.wmflabs
krenair@krenair-t219424-c:~$
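
The `grep server` output on B above also matches the `ssldir`, `hostcert` and `hostprivkey` lines, because their paths contain `server`. A stricter lookup could be sketched as follows, assuming a puppet.conf made of simple `key = value` lines. `puppet_server_of` is a hypothetical helper name, not an existing tool.

```shell
# Print the value of the first "server" setting in a puppet.conf-style file.
# Usage: puppet_server_of CONF_FILE
# Note: this ignores section headers and returns the first match in the file.
puppet_server_of() {
    awk -F'=' '
        /^[ \t]*server[ \t]*=/ {
            value = $2
            # Trim leading and trailing blanks around the value.
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            print value
            exit
        }
    ' "$1"
}
```

Running `puppet_server_of /etc/puppet/puppet.conf` on B and on C would then show only the puppetmaster each one talks to.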

Don't think I understand the gitpuppet/CherryPickCounterCollector stuff on B yet, though that's not quite this ticket.
Edit: Yep, got it. B no longer has the puppetmaster role because the new central puppetmaster it talks to is using its own ENC, which is set up but has not had the data imported from the live one, so everything under it is missing the roles/hiera configured in horizon. Because this role is missing, certain files are removed as they are no longer in the catalogue.

root@cloud-puppetmaster-01:~# cat /etc/puppet-enc.yaml
host: puppetmaster.cloudinfra.wmflabs.org

root@cloud-puppetmaster-01:~# echo 'host: labs-puppetmaster.wikimedia.org' > /etc/puppet-enc.yaml
root@cloud-puppetmaster-01:~# /usr/local/bin/puppet-enc krenair-t219424-b.testlabs.eqiad.wmflabs
classes: ['role::puppetmaster::standalone']
parameters: {}
root@cloud-puppetmaster-01:~# echo 'host: puppetmaster.cloudinfra.wmflabs.org' > /etc/puppet-enc.yaml
root@cloud-puppetmaster-01:~# /usr/local/bin/puppet-enc krenair-t219424-b.testlabs.eqiad.wmflabs
classes: []
parameters: {}
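
The before/after comparison above can be turned into a quick check. This sketch assumes the ENC prints its `classes:` list on a single line, in the flow style shown in the output above. `enc_has_class` is a hypothetical helper name.

```shell
# Succeed if the ENC output read from stdin lists CLASS on its "classes:" line.
# Usage: <enc command> | enc_has_class CLASS
# Assumption: classes are printed on one line as ['a', 'b'], as in the output above.
enc_has_class() {
    grep '^classes:' | grep -qF "'$1'"
}
```

For example: `/usr/local/bin/puppet-enc krenair-t219424-b.testlabs.eqiad.wmflabs | enc_has_class role::puppetmaster::standalone || echo "role missing"`.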
Krenair claimed this task.