
Encrypt Kafka traffic, and restrict access via ACLs
Closed, Resolved · Public · 0 Estimated Story Points

Description

For now we're securing the caches' Kafka broker traffic with IPsec (T92602), but longer-term we need to move this to TLS.

We'd like to encrypt all Kafka traffic, both inter-broker and between clients and brokers.

rough idea

  • Make the cassandra-ca-manager script generic, and also make it output other useful certificate formats (.pem?) needed by non-Java clients.
  • Use the ca-manager script to generate the CA certs, broker keys and client keys needed to configure Kafka TLS.
  • Client keys will be generated per logical client, not per client instance. E.g. all varnishkafka-webrequest instances will share the same client keys, distributed by Puppet.
  • The CA certs will be distributed to all hosts that might run Kafka clients (e.g. Hadoop nodes), to support unauthenticated (but still encrypted) use cases.
  • Logical client keys will be distributed via Puppet only to hosts that will run those specific clients.
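
To illustrate the difference between the unauthenticated and authenticated client cases in the list above, here is a minimal sketch that builds the TLS settings for a librdkafka-based client such as varnishkafka. The property names are librdkafka configuration properties; the function name and all file paths are hypothetical and only serve the example.

```python
from typing import Dict, Optional


def client_tls_config(
    ca_path: str,
    cert_path: Optional[str] = None,
    key_path: Optional[str] = None,
) -> Dict[str, str]:
    """Return librdkafka-style TLS properties for one logical client.

    With only the CA cert, the client verifies the broker and encrypts
    traffic but does not authenticate itself. With a cert and key, the
    client also presents its (shared, per-logical-client) identity.
    """
    config = {
        "security.protocol": "ssl",
        "ssl.ca.location": ca_path,
    }
    if cert_path and key_path:
        config["ssl.certificate.location"] = cert_path
        config["ssl.key.location"] = key_path
    elif cert_path or key_path:
        # A certificate without its key (or the reverse) is unusable.
        raise ValueError("cert_path and key_path must be given together")
    return config


# Hypothetical paths: an encrypted-only client, e.g. on a Hadoop node.
anonymous = client_tls_config("/etc/ssl/kafka/ca.crt.pem")

# Hypothetical paths: a logical client with its own key pair.
authenticated = client_tls_config(
    "/etc/ssl/kafka/ca.crt.pem",
    cert_path="/etc/varnishkafka/ssl/varnishkafka.crt.pem",
    key_path="/etc/varnishkafka/ssl/varnishkafka.key.pem",
)
```

Hosts that only receive the CA cert would end up with the first form; hosts that also receive a logical client's key pair would use the second.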

Note that this plan doesn't yet consider encryption of traffic between Kafka and Zookeeper. Should we?

Event Timeline

BBlack created this task. Dec 15 2015, 7:00 PM
BBlack raised the priority of this task to Needs Triage.
BBlack updated the task description.
BBlack added projects: Traffic, Analytics.
BBlack added a subscriber: BBlack.
Restricted Application added a project: Operations. Dec 15 2015, 7:00 PM
Restricted Application added subscribers: StudiesWorld, Aklapper.
Ottomata claimed this task. Dec 15 2015, 7:01 PM
Ottomata set Security to None.
Ottomata renamed this task from "Upgrade kafka for native TLS and secure the kafka traffic with it" to "Enable Kafka native TLS in 0.9 and secure the kafka traffic with it". Dec 15 2015, 7:03 PM
Milimetric moved this task from Incoming to temporary on the Analytics board. Jan 12 2016, 7:28 PM
Milimetric moved this task from temporary to Incoming on the Analytics board. Jan 12 2016, 7:35 PM
Milimetric moved this task from Incoming to temporary on the Analytics board. Jan 12 2016, 7:38 PM
Milimetric moved this task from temporary to Incoming on the Analytics board. Jan 12 2016, 7:43 PM
Milimetric triaged this task as Medium priority. Mar 7 2016, 5:15 PM
Milimetric moved this task from Incoming to Event Platform on the Analytics board.
Ottomata removed Ottomata as the assignee of this task. Aug 8 2016, 7:50 PM
Ottomata added a subscriber: Ottomata.
BBlack moved this task from Triage to Watching on the Traffic board. Oct 4 2016, 12:47 PM
Nuria moved this task from Event Platform to Wikistats on the Analytics board. Oct 31 2016, 3:54 PM
Nuria moved this task from Wikistats to Dashiki on the Analytics board.
Ottomata renamed this task from "Enable Kafka native TLS in 0.9 and secure the kafka traffic with it" to "Enable Kafka TLS and secure the kafka traffic with it". May 17 2017, 12:57 PM
Ottomata updated the task description.
Ottomata updated the task description.

It looks like a lot of the hard work for this has been done for Cassandra over in T108953 and T111113. Documentation for this is at https://wikitech.wikimedia.org/wiki/Cassandra/Tools/cassandra-ca-manager and https://wikitech.wikimedia.org/wiki/Cassandra#Installing_and_generating_certificates.

We can use that tool (perhaps after giving it a more generic name :) ) to generate signed certs, a truststore, and a keystore for each broker, add them to puppet private, distribute them using puppet secret(), and then point Kafka at the truststore and keystore.
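
As a rough sketch of what "pointing Kafka at the truststore and keystore" could look like, the snippet below renders the relevant broker settings for server.properties. The property names are standard Kafka broker settings; the function name, port, paths, passwords, and the choice of ssl.client.auth=requested (so that clients without a certificate can still connect encrypted) are assumptions for illustration, not what the Puppet change in this task necessarily does.

```python
def broker_ssl_properties(
    keystore_path: str,
    keystore_password: str,
    truststore_path: str,
    truststore_password: str,
    port: int = 9093,
    client_auth: str = "requested",
) -> str:
    """Render the TLS-related part of a Kafka broker's server.properties."""
    if client_auth not in ("required", "requested", "none"):
        raise ValueError("client_auth must be required, requested or none")
    props = {
        # Accept TLS connections from clients and from other brokers.
        "listeners": f"SSL://:{port}",
        "security.inter.broker.protocol": "SSL",
        # The broker's own key and signed certificate.
        "ssl.keystore.location": keystore_path,
        "ssl.keystore.password": keystore_password,
        # Assumption: the private key uses the same password as the keystore.
        "ssl.key.password": keystore_password,
        # The CA certificate used to verify peers.
        "ssl.truststore.location": truststore_path,
        "ssl.truststore.password": truststore_password,
        "ssl.client.auth": client_auth,
    }
    return "".join(f"{key}={value}\n" for key, value in props.items())


# Hypothetical paths and placeholder passwords.
rendered = broker_ssl_properties(
    keystore_path="/etc/kafka/ssl/broker.keystore.jks",
    keystore_password="CHANGEME",
    truststore_path="/etc/kafka/ssl/truststore.jks",
    truststore_password="CHANGEME",
)
print(rendered, end="")
```

In practice the passwords would come from the private repository rather than being written inline.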

Ottomata merged a task: Restricted Task. May 18 2017, 10:55 AM
Ottomata added subscribers: faidon, Nuria, elukey and 3 others.
Ottomata renamed this task from "Enable Kafka TLS and secure the kafka traffic with it" to "Encrypt Kafka traffic, and restrict access via ACLs". May 18 2017, 11:01 AM
Ottomata updated the task description.

We should do some work to understand how ACLs work and what ACLs for what topics we should set in production.
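
For orientation only, here is a small sketch of how a single read ACL might be expressed with the upstream kafka-acls.sh tool, which stores ACLs in Zookeeper via --authorizer-properties. The flags are those of the upstream CLI; the helper name, Zookeeper address, principal, topic, and group are all hypothetical, and the exact principal string depends on how the broker maps client certificates to users.

```python
from typing import List, Optional


def kafka_acls_add_read(
    zookeeper: str,
    principal: str,
    topic: str,
    group: Optional[str] = None,
) -> List[str]:
    """Build the argv for granting Read on a topic (and optionally a group)."""
    argv = [
        "kafka-acls.sh",
        "--authorizer-properties", f"zookeeper.connect={zookeeper}",
        "--add",
        "--allow-principal", principal,
        "--operation", "Read",
        "--topic", topic,
    ]
    if group is not None:
        # Consumers that use group management also need Read on the group.
        argv += ["--group", group]
    return argv


# Hypothetical values; nothing is executed here, the command is only built.
command = kafka_acls_add_read(
    zookeeper="zookeeper.example.org:2181",
    principal="User:CN=example-consumer",
    topic="example-topic",
    group="example-group",
)
print(" ".join(command))
```

Building the command as a list keeps it easy to review before anything is run against a real cluster.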

elukey moved this task from In Progress to Analytics Backlog on the User-Elukey board.
mforns edited projects, added Analytics-Kanban; removed Analytics. Jul 31 2017, 3:41 PM
mforns set the point value for this task to 0.
mforns moved this task from Next Up to Parent Tasks on the Analytics-Kanban board.
elukey added a comment. Aug 4 2017, 9:50 AM

We should do some work to understand how ACLs work and what ACLs for what topics we should set in production.

This was done in https://phabricator.wikimedia.org/T167304

elukey added a comment. Aug 4 2017, 9:53 AM

Note that this plan doesn't yet consider encryption of traffic between Kafka and Zookeeper. Should we?

We've set up strict firewall rules for our Zookeeper clusters so that only a few unix groups can communicate with Zookeeper. This has been considered enough for our purposes, but we have to keep in mind that Kafka ACLs are stored in Zookeeper, so any future access granted to those groups will need to be carefully reviewed :)

@Ottomata should we keep this task open given that we already have https://phabricator.wikimedia.org/T166167 ?

Let's keep it open and use this task to track actually enabling TLS / ACLs for different clients.

Change 394144 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Puppetize SSL for Kafka broker

https://gerrit.wikimedia.org/r/394144

Change 394144 merged by Ottomata:
[operations/puppet@production] Puppetize SSL for Kafka broker

https://gerrit.wikimedia.org/r/394144

Change 394368 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] chgrp server.properties to 'kafka' so daemon can read it

https://gerrit.wikimedia.org/r/394368

Change 394368 merged by Ottomata:
[operations/puppet@production] chgrp server.properties to 'kafka' so daemon can read it

https://gerrit.wikimedia.org/r/394368

Change 394383 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Disable ssl for kafka-jumbo

https://gerrit.wikimedia.org/r/394383

Change 394383 merged by Ottomata:
[operations/puppet@production] Disable ssl for kafka-jumbo

https://gerrit.wikimedia.org/r/394383

Change 404698 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Parameterize varnishkafka certificate name for easier setup in Cloud VPS.

https://gerrit.wikimedia.org/r/404698

Change 404698 merged by Ottomata:
[operations/puppet@production] Parameterize varnishkafka certificate name for easier setup in Cloud VPS.

https://gerrit.wikimedia.org/r/404698

Change 404706 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[labs/private@master] Update secrets/certificates with deployment-prep certs for TLS Kafka

https://gerrit.wikimedia.org/r/404706

Change 404706 merged by Ottomata:
[labs/private@master] Update secrets/certificates with deployment-prep certs for TLS Kafka

https://gerrit.wikimedia.org/r/404706

Milimetric moved this task from Dashiki to Incoming on the Analytics board. Apr 2 2018, 3:33 PM
Nuria moved this task from Incoming to Kafka Work on the Analytics board. Apr 5 2018, 5:02 PM
mforns closed this task as Resolved. Apr 16 2018, 4:14 PM
mforns claimed this task.
Aklapper removed a project: Analytics. Jul 4 2020, 7:59 AM