At the ops offsite we discussed how to secure internal communication, in
particular how to handle TLS certificates. Currently the puppet certificates
can be "exposed" to applications that need secure communication inside the
cluster. The puppet certificate provides "host authenticity/identity" since it
is signed by the Puppet CA. However, the same certificate also grants access to
all of the host's secrets via the Puppet catalog, which is a problem if the
certificate is leaked or otherwise compromised.
For services that don't require root we can instead provide a "Service CA"
(for services that do require root this point is moot, since they can already
access the host certificate). Namely, when provisioning a new service (the
definition of "service" is loose: anything that might require certs), we also
provision a CA keypair for the service and distribute it to the service hosts.
The service hosts can each generate their own certs signed by the CA and
present those for secure communication. Hosts not running the service can
request that the service CA's public certificate be trusted (e.g. via puppet).
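As a rough illustration of the flow above, the sketch below provisions a CA keypair for a service and has a host issue itself a certificate signed by that CA. It assumes the third-party Python `cryptography` package (version 3.1 or later). The function names, the `example` service and the `host1.example.internal` hostname are hypothetical, and the key type and validity periods are arbitrary choices, not something decided at the offsite.

```python
import datetime

from cryptography import x509
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.x509.oid import NameOID


def _name(common_name):
    """Build a minimal X.509 name holding only a common name."""
    return x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, common_name)])


def make_service_ca(service, days=365):
    """Create a self-signed CA keypair for one service (hypothetical helper)."""
    key = ec.generate_private_key(ec.SECP256R1())
    now = datetime.datetime.now(datetime.timezone.utc)
    subject = _name(f"{service} service CA")
    cert = (
        x509.CertificateBuilder()
        .subject_name(subject)
        .issuer_name(subject)  # self-signed: issuer equals subject
        .public_key(key.public_key())
        .serial_number(x509.random_serial_number())
        .not_valid_before(now)
        .not_valid_after(now + datetime.timedelta(days=days))
        .add_extension(x509.BasicConstraints(ca=True, path_length=0), critical=True)
        .sign(key, hashes.SHA256())
    )
    return key, cert


def issue_host_cert(ca_key, ca_cert, fqdn, days=30):
    """Run on a service host: generate a key and a cert signed by the service CA."""
    key = ec.generate_private_key(ec.SECP256R1())
    now = datetime.datetime.now(datetime.timezone.utc)
    cert = (
        x509.CertificateBuilder()
        .subject_name(_name(fqdn))
        .issuer_name(ca_cert.subject)
        .public_key(key.public_key())
        .serial_number(x509.random_serial_number())
        .not_valid_before(now)
        .not_valid_after(now + datetime.timedelta(days=days))
        .add_extension(x509.SubjectAlternativeName([x509.DNSName(fqdn)]), critical=False)
        .add_extension(x509.BasicConstraints(ca=False, path_length=None), critical=True)
        .sign(ca_key, hashes.SHA256())
    )
    return key, cert


def signed_by(cert, ca_cert):
    """Check only the signature; validity dates and extensions are not examined."""
    try:
        ca_cert.public_key().verify(
            cert.signature,
            cert.tbs_certificate_bytes,
            ec.ECDSA(cert.signature_hash_algorithm),
        )
    except InvalidSignature:
        return False
    return True


if __name__ == "__main__":
    ca_key, ca_cert = make_service_ca("example")
    host_key, host_cert = issue_host_cert(ca_key, ca_cert, "host1.example.internal")
    print(host_cert.subject.rfc4514_string(), "issued by", host_cert.issuer.rfc4514_string())
    print("signed by service CA:", signed_by(host_cert, ca_cert))
```

Because every service host holds the CA private key, any of them can mint a certificate for any name under that CA, which is the trade-off this scheme accepts.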
Note that this is a form of "insecure PKI" and arguably not a PKI at all,
though it chains off the existing trust for the puppet CA and self-bootstraps
since there's no manual signing required.
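For hosts that only consume a service, trusting the service CA could look like the sketch below. It uses Python's standard `ssl` module to build a client context whose trust store holds nothing but the service CA, plus a server context that presents the host's service-signed cert. The function names and the file path in the usage comment are hypothetical; in practice puppet would presumably distribute the CA file.

```python
import ssl


def service_client_context(service_ca_file=None):
    """TLS client context that trusts only the given service CA.

    PROTOCOL_TLS_CLIENT turns on certificate and hostname verification,
    and a bare SSLContext loads no system CAs. With service_ca_file=None
    the trust store stays empty, so every handshake would fail.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if service_ca_file is not None:
        ctx.load_verify_locations(cafile=service_ca_file)
    return ctx


def service_server_context(cert_file, key_file):
    """TLS server context presenting the host's service-signed certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


# Hypothetical usage on a client host:
#   ctx = service_client_context("/etc/ssl/service-ca/example.pem")
#   with socket.create_connection((host, port)) as sock:
#       with ctx.wrap_socket(sock, server_hostname=host) as tls:
#           ...
```

Keeping the system CA bundle out of this context means a certificate from any other CA is rejected, so trust stays scoped to the one service.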
For practical work on the Puppet and Service CAs, see the related subtasks.
Other topics touched on during the "PKI" discussion at the offsite (not all of
them are feasible or scheduled to be done):
- "Cloudflare keyless ssl"
- Could we use Boulder (letsencrypt CA software) internally instead?
- How to bootstrap trust via the hardware's TPM?
Raw etherpad dump
* puppet PKI/CA
** is there a smarter way for CA rollover
** details needed on how puppet manages the CA
** renewal of CA needed for 2k host certs
** puppet management of the CA, how much is it embedded?
** can puppet generate/add SANs
** key rollover, how to do it?
* new "service" == new "service CA"
** can be used for each service that doesn't need root access, in which case it can use the puppet host keys
** service gets the service CA private key, so it can generate certs signed from the CA
** "insecure PKI" aka "don't do PKI"
** and puppetization for this
** advantage is that chains off the existing puppet trust and self-bootstraps (each host can generate its cert)
* cloudflare keyless ssl (wishlist, long term, no resources)
* can we use letsencrypt internally? possibly moot given the "service CA" above
* TPM, tangentially related, how to bootstrap trust for puppet
** related to reimaging
** in theory wmf-reimage has a race condition
* slightly related to PKI: tftp initrd/linux are loaded from the bastions!
** EFI additional modules loaded via PXE?
** PKI for hardware? just a password, never rotate, no certs