DRY kafka broker declaration in helmfiles
Open, Medium, Public

Description

Puppet holds the authoritative list of Kafka brokers. Helmfiles that use Kafka hardcode that list, so when SRE changes Kafka brokers (as in T279342), the helmfiles must be updated too. This is error-prone and can cause problems if SRE is not aware of which services depend on Kafka.

We attempted to use Puppet to render Kafka broker info into the general.yaml values file, but this doesn't quite work: the Helm charts would have to know how to use it, and the values vary per DC.

We should see if there is a DNS- or LVS-based solution for this, so we don't have to hardcode lists of Kafka brokers.

Event Timeline

Milimetric moved this task from Incoming to Operational Excellence on the Analytics board.

Change 656253 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Render kafka cluster connection info in helmfile-defaults/general-*.yaml

https://gerrit.wikimedia.org/r/656253

Change 656253 merged by Ottomata:
[operations/puppet@production] Render kafka cluster connection info in helmfile-defaults/general-*.yaml

https://gerrit.wikimedia.org/r/656253

Ottomata renamed this task from DRY kafka broker declaration into helmfiles from puppet to DRY kafka broker declaration in helmfiles. Apr 16 2021, 3:55 PM
Ottomata raised the priority of this task from Low to Medium.
Ottomata updated the task description.
Ottomata added projects: SRE, serviceops.
Ottomata added subscribers: herron, fgiunchedi, colewhite.

Actually, I'm not sure that LVS alone would help here. The helmfiles' networkpolicy explicitly lists the IP addresses that the service can talk to. The broker IPs would still have to be manually updated in the networkpolicy section of the values.yaml file.

@akosiaris @JMeybohm any ideas?
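
To illustrate the duplication in question, here is a minimal Python sketch of how one egress rule could be derived from a single broker list instead of being maintained by hand. Everything in it is an assumption made for illustration: the broker names and addresses are hypothetical (documentation ranges), `broker_cidrs` and `egress_rule` are made-up helper names, and port 9092 is only Kafka's default plaintext port. The output mirrors the general shape of a Kubernetes NetworkPolicy egress entry (`to` / `ipBlock` / `cidr` plus `ports`); it is not the format our charts actually consume.

```python
import ipaddress
import json

# Hypothetical broker list; names and addresses are placeholders from
# documentation ranges, not real brokers.
KAFKA_BROKERS = {
    "kafka-example1001": ["192.0.2.10", "2001:db8::10"],
    "kafka-example1002": ["192.0.2.11", "2001:db8::11"],
}


def broker_cidrs(brokers):
    """Return one single-host CIDR per broker address, sorted for stable output."""
    cidrs = []
    for addresses in brokers.values():
        for addr in addresses:
            ip = ipaddress.ip_address(addr)
            # A single host is /32 for IPv4 and /128 for IPv6.
            prefix = 32 if ip.version == 4 else 128
            cidrs.append(f"{ip}/{prefix}")
    return sorted(cidrs)


def egress_rule(brokers, ports=(9092,)):
    """Build a dict shaped like one NetworkPolicy egress entry."""
    return {
        "to": [{"ipBlock": {"cidr": cidr}} for cidr in broker_cidrs(brokers)],
        "ports": [{"protocol": "TCP", "port": port} for port in ports],
    }


rule = egress_rule(KAFKA_BROKERS)
print(json.dumps(rule, indent=2))
```

With something along these lines, changing a broker would mean editing only the source list; the per-service IP entries would be regenerated rather than hand-edited.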

Hi!

Adopting the new functionality in networkpolicy resources has indeed created some tech debt. It is debt we took on deliberately while devoting resources to finalizing the migration away from the old way of maintaining those networkpolicies. Now that the old way is gone, I want to revisit this and deduplicate as much as possible.

I have a couple of approaches in mind for that; I'll try to upload a couple of changes this week.

Change 682971 had a related patch set uploaded (by Giuseppe Lavagetto; author: Giuseppe Lavagetto):

[operations/puppet@production] kubernetes::deployment_server: also add kafka broker, pass CIDRs

https://gerrit.wikimedia.org/r/682971

Change 683379 had a related patch set uploaded (by Giuseppe Lavagetto; author: Giuseppe Lavagetto):

[operations/deployment-charts@master] networkpolicy: add autogenerated egress rules

https://gerrit.wikimedia.org/r/683379

Change 682971 merged by Giuseppe Lavagetto:

[operations/puppet@production] kubernetes::deployment_server: also add kafka broker, pass CIDRs

https://gerrit.wikimedia.org/r/682971

Change 683379 merged by jenkins-bot:

[operations/deployment-charts@master] networkpolicy: add autogenerated egress rules

https://gerrit.wikimedia.org/r/683379

Change 684855 had a related patch set uploaded (by Giuseppe Lavagetto; author: Giuseppe Lavagetto):

[operations/deployment-charts@master] eventgate: add kafka egress policy stanza

https://gerrit.wikimedia.org/r/684855