Now that we have moved all traffic to HAProxy, we should replace the current pipeline
- Varnish logs -> Varnishkafka -> Kafka
to
- HAProxy -> RSyslog -> Kafka
Both the format and the content of the messages sent to Kafka should remain the same, to avoid breaking existing pipelines and tools.
Roughly, the actions needed are:
- Investigate whether the log format can be transformed to match the VarnishKafka one (check out the VarnishKafka configuration file for more information)
- Configure RSyslog to initially send the data to different Kafka topic(s)
- Compare the content of the existing topic(s) with the new one(s), to spot any differences
- Finalize configuring RSyslog to send data to the existing Kafka topic(s) and remove VarnishKafka
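The comparison step could be sketched roughly as below. This is only an illustration, assuming the messages on both topics are JSON objects and that a sample from each topic has already been dumped to a list of strings; the function names and the field names in the sample data (`hostname`, `http_status`, and so on) are hypothetical, not taken from the actual VarnishKafka configuration.

```python
import json


def schema(messages):
    """Map each top-level JSON field to the set of value type names seen."""
    fields = {}
    for raw in messages:
        for key, value in json.loads(raw).items():
            fields.setdefault(key, set()).add(type(value).__name__)
    return fields


def compare_samples(old_messages, new_messages):
    """Report field-level differences between two message samples."""
    old, new = schema(old_messages), schema(new_messages)
    return {
        # Fields produced by the old pipeline but absent from the new one.
        "missing_in_new": sorted(old.keys() - new.keys()),
        # Fields that only the new pipeline produces.
        "extra_in_new": sorted(new.keys() - old.keys()),
        # Fields present in both, but with different value types.
        "type_mismatches": sorted(
            key for key in old.keys() & new.keys() if old[key] != new[key]
        ),
    }


# Hypothetical sample data, for illustration only.
old_sample = ['{"hostname": "cp1", "http_status": "200", "response_size": 512}']
new_sample = ['{"hostname": "cp1", "http_status": 200, "time_firstbyte": 0.01}']
report = compare_samples(old_sample, new_sample)
```

A check like this only catches structural drift (missing fields, extra fields, changed types); comparing actual values would additionally need a way to match the same request across both topics.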
To test feasibility, we could use the deployment-prep environment, as there is already a Kafka cluster there and varnishkafka is configured on the cache hosts (NB: investigate why the deployment-prep jumbo cluster isn't receiving any messages from varnishkafka).
Another viable option (thanks to @brouberol) could be to use the kafka-test cluster in production: configure one production cp host to send from RSyslog to this cluster on a disposable topic, and compare the result with the actual messages sent by VarnishKafka.