kafka_mirror_maker TLS cert about to expire - 2023
Closed, Resolved, Public

Description

Kafka MirrorMaker is used for multi-DC Kafka replication, as well as for aggregating all Kafka topics into the Kafka jumbo-eqiad cluster.

The Kafka clients are all configured to use a cergen-created, Puppet CA-signed TLS certificate. This certificate is about to expire on Tuesday, June 13.
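
For tracking how much time is left before the deadline, here is a minimal sketch using only the Python standard library. It assumes the notAfter value was obtained separately (for example from `openssl x509 -noout -enddate`). The exact timestamp below is hypothetical, since the task only gives the expiry date, and `days_until_expiry` is an illustrative helper name, not existing tooling.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Return the number of days between `now` and a cert's notAfter time.

    `not_after` uses the format printed by `openssl x509 -noout -enddate`
    once the "notAfter=" prefix is stripped, e.g. "Jun 13 00:00:00 2023 GMT".
    A negative result means the certificate has already expired.
    """
    expiry_epoch = ssl.cert_time_to_seconds(not_after)
    return (expiry_epoch - now.timestamp()) / 86400


# Hypothetical timestamp: the task only states "Tuesday, June 13" (2023).
NOT_AFTER = "Jun 13 00:00:00 2023 GMT"

now = datetime(2023, 5, 24, tzinfo=timezone.utc)
remaining = days_until_expiry(NOT_AFTER, now)
print(f"{remaining:.0f} days left before the certificate expires")
```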

We should either:

A. Renew the certificate: https://wikitech.wikimedia.org/wiki/Puppet#Renew_agent_certificate

or

B. Create a new certificate using our newer PKI system.

Event Timeline

Some info - the kafka mirror maker's cergen TLS cert has kafka_mirror_maker as CN, that is used as "username" in Kafka ACLs:

elukey@kafka-main1001:~$ kafka acls --list
kafka-acls --authorizer-properties zookeeper.connect=conf1007.eqiad.wmnet,conf1008.eqiad.wmnet,conf1009.eqiad.wmnet/kafka/main-eqiad --list

Current ACLs for resource `Group:*`: 
	User:CN=kafka_mirror_maker has Allow permission for operations: Read from hosts: * 

Current ACLs for resource `Topic:*`: 
	User:CN=kafka_mirror_maker has Allow permission for operations: Write from hosts: *
	User:CN=kafka_mirror_maker has Allow permission for operations: Describe from hosts: *
	User:CN=kafka_mirror_maker has Allow permission for operations: Read from hosts: *

Current ACLs for resource `Cluster:kafka-cluster`: 
	User:CN=kafka_mirror_maker has Allow permission for operations: Create from hosts: *
	User:CN=varnishkafka has Allow permission for operations: Create from hosts: *



elukey@kafka-jumbo1001:~$ kafka acls --list
kafka-acls --authorizer-properties zookeeper.connect=conf1007.eqiad.wmnet,conf1008.eqiad.wmnet,conf1009.eqiad.wmnet/kafka/jumbo-eqiad --list

Current ACLs for resource `Topic:*`: 
	User:CN=kafka_fundraising_client has Allow permission for operations: Describe from hosts: *
	User:CN=kafka_mirror_maker has Allow permission for operations: Write from hosts: *
	User:CN=kafka_mirror_maker has Allow permission for operations: Describe from hosts: *
	User:CN=kafka_mirror_maker has Allow permission for operations: Read from hosts: *

Current ACLs for resource `Cluster:kafka-cluster`: 
	User:CN=kafka_mirror_maker has Allow permission for operations: Create from hosts: *

Current ACLs for resource `Group:*`: 
	User:CN=kafka_fundraising_client has Allow permission for operations: Read from hosts: *
	User:CN=kafka_mirror_maker has Allow permission for operations: Read from hosts: *
	User:CN=kafka_mirror_maker has Allow permission for operations: Describe from hosts: *

If we want to use PKI, we should figure out what a PKI client cert would look like (maybe @jbond can chime in with some suggestions).
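
To see at a glance which permissions each principal holds (and so what a replacement cert must keep matching), the ACL listings above can be grouped per principal. This is a rough sketch that assumes the output format shown in this task. `parse_acls` is an illustrative helper, not part of any Kafka tooling, and it only looks at Allow entries.

```python
import re
from collections import defaultdict

# Matches the resource header lines printed by `kafka-acls --list`.
RESOURCE_RE = re.compile(r"^Current ACLs for resource `(?P<resource>[^`]+)`:")
# Matches one Allow entry; Deny entries are ignored in this sketch.
ACL_RE = re.compile(
    r"^\s*(?P<principal>User:\S+) has Allow permission "
    r"for operations: (?P<operation>\w+) from hosts: \S+"
)


def parse_acls(listing: str) -> dict:
    """Map each principal to a sorted list of (resource, operation) pairs."""
    acls = defaultdict(set)
    resource = None
    for line in listing.splitlines():
        header = RESOURCE_RE.match(line)
        if header:
            resource = header.group("resource")
            continue
        entry = ACL_RE.match(line)
        if entry and resource is not None:
            acls[entry.group("principal")].add((resource, entry.group("operation")))
    return {principal: sorted(pairs) for principal, pairs in acls.items()}


# Excerpt of the main-eqiad listing from this task.
SAMPLE = """\
Current ACLs for resource `Group:*`:
\tUser:CN=kafka_mirror_maker has Allow permission for operations: Read from hosts: *

Current ACLs for resource `Cluster:kafka-cluster`:
\tUser:CN=kafka_mirror_maker has Allow permission for operations: Create from hosts: *
\tUser:CN=varnishkafka has Allow permission for operations: Create from hosts: *
"""

for principal, pairs in parse_acls(SAMPLE).items():
    print(principal, pairs)
```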

maybe @jbond can chime in with some suggestions

Happy to, but I may need some more context :) Specifically, what is the endpoint that these certs authenticate to (and is that already managed by PKI)? Also, where in Puppet are the cergen certs currently used? Also happy to jump on a quick call if it's easier.

specifically what is the endpoint that these certs authenticate to

Kafka brokers

is that already managed by PKI

yup! Thanks to Luca!

where in Puppet are the cergen certs currently used

profile::kafka::mirror has decent code docs.

@jbond thanks for checking! I think the main open question is what a client cert for Kafka MirrorMaker (and potentially also varnishkafka) would look like. In both cases the client code runs on multiple nodes that all share the same cergen certificate. Kafka then uses the CN of that certificate as the "username" to match the corresponding ACLs. We could add ACLs based on hostnames instead, but that would be brittle and error-prone (say, if we change a node and forget to update the ACLs). So ideally we'd have a single generic, hostname-free CN to match in Kafka ACLs, but I'm not sure what is doable with the current PKI infra. Let us know, when you have a moment, what the possibilities are :)
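
To illustrate the trade-off, here is a toy model of the matching. It assumes Kafka's default behaviour for TLS clients, where the principal is "User:" followed by the certificate's subject DN and no custom principal mapping is configured. The hostnames are made up, and `principal_for` / `is_allowed` are illustrative names rather than Kafka APIs.

```python
def principal_for(subject_dn: str) -> str:
    """Build the principal Kafka would derive from a client cert subject DN."""
    return f"User:{subject_dn}"


def is_allowed(subject_dn: str, allowed_principals: set) -> bool:
    """Return True if an ACL exists for the principal derived from the cert."""
    return principal_for(subject_dn) in allowed_principals


# Shared-CN model: every MirrorMaker node presents the same subject,
# so a single ACL principal covers all of them, including new nodes.
shared_acl = {"User:CN=kafka_mirror_maker"}
nodes_shared = ["CN=kafka_mirror_maker"] * 3

# Hostname-based model: one ACL principal per node (hypothetical hostnames).
host_acl = {"User:CN=mirror1001.example.org", "User:CN=mirror1002.example.org"}
new_node = "CN=mirror1003.example.org"  # node added without an ACL update

print(all(is_allowed(dn, shared_acl) for dn in nodes_shared))
print(is_allowed(new_node, host_acl))
```

In the hostname-based model the new node is rejected until someone remembers to add its ACL, which is the brittleness described above.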

We can create multiple certs with the same CN on different machines (or even on the same machine). That's already used in the Kubernetes context for client certs.

Change 922795 had a related patch set uploaded (by Elukey; author: Elukey):

[operations/puppet@production] profile::kafka::mirror: add support for PKI certificate

https://gerrit.wikimedia.org/r/922795

Exactly, thanks @JMeybohm. If it doesn't cause issues I'd add the hostname as a SAN, but it's not a big deal.

Change 922795 merged by Elukey:

[operations/puppet@production] profile::kafka::mirror: add support for PKI certificate

https://gerrit.wikimedia.org/r/922795

Mentioned in SAL (#wikimedia-analytics) [2023-05-24T15:48:44Z] <elukey> run kafka acls --add --allow-principal User:CN=kafka_mirror_maker --producer --topic '*' on kafka test - T337248

Mentioned in SAL (#wikimedia-operations) [2023-05-24T15:56:28Z] <elukey> move kafka mirror on kafka jumbo brokers to PKI - T337248

Mentioned in SAL (#wikimedia-analytics) [2023-05-24T15:56:32Z] <elukey> move kafka mirror on kafka jumbo brokers to PKI - T337248

Mentioned in SAL (#wikimedia-operations) [2023-05-24T16:05:37Z] <elukey> move kafka mirror on kafka main brokers to PKI - T337248

Mentioned in SAL (#wikimedia-analytics) [2023-05-24T16:05:41Z] <elukey> move kafka mirror on kafka main brokers to PKI - T337248

Rolled out the new keystores to all clusters!

Next steps:

Mentioned in SAL (#wikimedia-operations) [2023-05-25T08:32:01Z] <elukey> revoke kafka_mirror_maker TLS cert (cergen based), remove old cergen certs from puppet private - T337248

Change 923259 had a related patch set uploaded (by Elukey; author: Elukey):

[operations/puppet@production] profile::kafka::{broker,mirror}: simplify dependencies

https://gerrit.wikimedia.org/r/923259

Change 923259 merged by Elukey:

[operations/puppet@production] profile::kafka::{broker,mirror}: simplify dependencies

https://gerrit.wikimedia.org/r/923259

elukey claimed this task.