Modernize etcd tlsproxy certificate management
Open, In Progress, Medium, Public

Description

profile::etcd::tlsproxy does not use cergen certs directly, but both codfw and eqiad have been rotated using cergen-generated certs manually placed in profile/files/ssl/, with the keys placed in the private ssl dir.

Is setting use_cergen => true all that needs to be done?
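
For illustration, the change being asked about might look roughly like the sketch below. It is not taken from profile::etcd::tlsproxy: it assumes sslcert::certificate accepts a boolean use_cergen parameter (as the question implies), and the $cert_name variable and its value are hypothetical placeholders.

```puppet
# Hypothetical sketch, not the actual contents of profile::etcd::tlsproxy.
# Assumption: sslcert::certificate takes a boolean use_cergen parameter.
# $cert_name is an illustrative variable; the real profile may name it differently.
$cert_name = 'etcd-v3.eqiad.wmnet'

sslcert::certificate { $cert_name:
    # Assumed effect: take the cert and key from the cergen-generated
    # material rather than from files manually copied into profile/files/ssl/.
    use_cergen => true,
}
```

Whether this alone is enough depends on the cergen-generated cert and key already being in the locations that sslcert::certificate expects.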

Event Timeline

We have to differentiate between:

profile::etcd::tlsproxy

and

profile::etcd::v3

Both have an sslcert::certificate resource, but there is this comment:

# TLS certs *for etcd use* in peer-to-peer communications.
# Tlsproxy will use other certificates.

Both have cergen-generated certs, but puppet is not configured to use them in-place. It may be worthwhile to transition both if cergen is the expected way of provisioning these certs.

Things done in reaction to the page on the weekend:

add new certificate for etcd-v3.eqiad.wmnet - https://gerrit.wikimedia.org/r/c/operations/puppet/+/787884
hiera: tlsproxy: use new etcd-v3 certificate - https://gerrit.wikimedia.org/r/c/operations/puppet/+/787885

plus private repo changes to add the YAML for cert generation and move the key into place

Change 788437 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] etcd::tlsproxy: set use_cergen to true

https://gerrit.wikimedia.org/r/788437

Change 788439 had a related patch set uploaded (by Dzahn; author: Dzahn):

[labs/private@master] add fake certificates for etcd-v3.eqiad and etcd-v3.codfw

https://gerrit.wikimedia.org/r/788439

Change 788439 merged by Dzahn:

[labs/private@master] add fake certificates and keys for etcd-v3.eqiad and etcd-v3.codfw

https://gerrit.wikimedia.org/r/788439

Change 790657 had a related patch set uploaded (by Dzahn; author: jbond):

[operations/puppet@production] P:etcd::tlsproxy: move to cfssl pki

https://gerrit.wikimedia.org/r/790657

Dzahn changed the task status from Open to In Progress. May 13 2022, 8:05 PM

Change 791671 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] delete expired certs etcd.eqiad.wmnet.crt and etcd.codfw.wmnet.crt

https://gerrit.wikimedia.org/r/791671

Change 788437 merged by Dzahn:

[operations/puppet@production] etcd::tlsproxy: set use_cergen to true

https://gerrit.wikimedia.org/r/788437

Marostegui triaged this task as Medium priority. May 17 2022, 8:20 AM

Change 791671 merged by Dzahn:

[operations/puppet@production] delete expired certs etcd.eqiad.wmnet.crt and etcd.codfw.wmnet.crt

https://gerrit.wikimedia.org/r/791671

@Dzahn anything left to do here? Would it be a good sprint week thing?

I think there is a larger topic of moving etcd to use the new PKI certs. There has been some work in that direction but I think that work will take some time. Should we just go on and do the minimal fix this week?

It might need refreshing/updating, but there is this patch: https://gerrit.wikimedia.org/r/c/operations/puppet/+/790657