Switch k8s logs to their own kafka topics
Open, Needs Triage, Public

Assigned To

None

Authored By

	fgiunchedi
	Wed, Jun 5, 3:01 PM

Description

Logs for all k8s clusters currently flow into the rsyslog-* kafka topics (split by severity). While this is simple and works under normal circumstances, even a single spammy producer causes lag that affects all other producers.

Similarly to what we do with prometheus, we should instead switch to a model where kafka-logging topics are isolated/split at least by k8s cluster, and possibly further (e.g. cluster + namespace).
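
To make the proposed split concrete, here is a minimal sketch of how a per-cluster (and optionally per-namespace) topic name could be derived. The `k8s-logs` prefix, the dot separator, the function name and the example cluster/namespace names are all assumptions made for illustration, not the naming used by the patches on this task. The character set and 249-character limit are Kafka's own constraints on topic names.

```python
import re
from typing import Optional

# Kafka topic names may only contain ASCII alphanumerics, '.', '_' and '-'.
_ILLEGAL = re.compile(r"[^a-zA-Z0-9._-]")
_MAX_TOPIC_LEN = 249  # Kafka's limit on topic name length


def k8s_log_topic(cluster: str,
                  namespace: Optional[str] = None,
                  prefix: str = "k8s-logs") -> str:
    """Build a topic name isolated by cluster, and by namespace if given.

    The prefix and layout are hypothetical; only the isolation idea
    (one topic per cluster, or per cluster + namespace) comes from the task.
    """
    parts = [prefix, cluster]
    if namespace:
        parts.append(namespace)
    # Replace any character Kafka would reject with an underscore.
    topic = ".".join(_ILLEGAL.sub("_", part) for part in parts)
    if len(topic) > _MAX_TOPIC_LEN:
        raise ValueError(f"topic name too long ({len(topic)} chars): {topic}")
    return topic


# Hypothetical cluster and namespace names, for illustration only.
print(k8s_log_topic("cluster-a"))
print(k8s_log_topic("cluster-a", "some/namespace"))
```

With a scheme like this, a spammy producer in one cluster (or namespace) only builds up lag on its own topic.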

As a bonus side effect, moving to this model will also effectively increase the logstash ingestion capacity, since we will be able to consume from more topics concurrently as opposed to a single funnel/topic. At the moment we normally have 6 partitions and 6 logstash consumers, so each consumer effectively reads from a given topic with a single thread.
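
The capacity argument can be illustrated with some simple arithmetic. The sketch below spreads partitions evenly over consumers in round-robin fashion; this is a simplification made for illustration, not Kafka's actual partition assignor, and the function name and the three-topic example are assumptions.

```python
from collections import Counter
from typing import List


def partitions_per_consumer(topics: int,
                            partitions_per_topic: int,
                            consumers: int) -> List[int]:
    """Count how many partitions each consumer reads, assuming an even
    round-robin spread of all partitions across all consumers."""
    counts = Counter()
    slot = 0
    for _ in range(topics):
        for _ in range(partitions_per_topic):
            counts[slot % consumers] += 1
            slot += 1
    return [counts[c] for c in range(consumers)]


# Today: a single topic with 6 partitions and 6 consumers.
print(partitions_per_consumer(topics=1, partitions_per_topic=6, consumers=6))
# Hypothetical: three per-cluster topics with 6 partitions each.
print(partitions_per_consumer(topics=3, partitions_per_topic=6, consumers=6))
```

In the single-topic case every consumer ends up with one partition, so one stream per consumer. With more topics, each consumer has several partitions it can read from concurrently, provided it is configured with enough consumer threads.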

Details

Subject	Repo	Branch	Lines +/-
logstash: consume k8s logs topics	operations/puppet	production	+33 -0
logstash: add auto_offset_reset to kafka input	operations/puppet	production	+7 -0
k8s: send logs to per-cluster kafka topics	operations/puppet	production	+41 -2