
Investigate Kafka main cluster usage for logging pipeline
Closed, Resolved (Public)

Description

While discussing the logging design document, @Joe expressed doubts about sharing the Kafka main cluster for logging purposes. The scope of this task is to discuss whether using the Kafka main cluster for logging is appropriate. If not, we would have to spin up a dedicated Kafka cluster for logging, possibly on new hardware.

Event Timeline

At yesterday's monitoring/logging meeting we discussed this and concluded that, for good hygiene and decoupling, it makes sense to spin up a new Kafka cluster for logging purposes. What's left is to decide which hardware we're going to run Kafka on, which in turn boils down to a budget question; see also T203169: Logstash hardware expansion.

herron triaged this task as High priority. Oct 2 2018, 5:25 PM

While the optimal approach would be to dedicate new hardware to Kafka in both eqiad and codfw, after a few conversations within the infra team it sounds like a realistic way to support Q2 goal T205849 is to proceed in phases. In terms of hardware, these would look like:

Phase 1 - Deploy logging Kafka onto the same hardware as the logstash elasticsearch in eqiad and codfw (3 hosts per site).
Phase 2 - Procure dedicated hardware (3 additional hosts per site) for logging Kafka, and migrate the Kafka service from the elasticsearch hosts to that dedicated hardware.

fgiunchedi claimed this task.

Looks like we have a way forward! Resolving in favor of T206454: Setup Kafka cluster, producers and consumers for logging pipeline to track the actual Kafka setup work.