Description
While discussing the logging design document, @Joe expressed doubts about sharing the Kafka main cluster for logging purposes. The scope of this task is to decide whether using the Kafka main cluster for logging is appropriate. If it is not, we'd have to spin up a dedicated Kafka cluster for logging, possibly on new hardware.
Status | Subtype | Assigned | Task
---|---|---|---
Resolved | | fgiunchedi | T205849 Begin the implementation of Q1's Logging Infrastructure design (2018-19 Q2 Goal)
Resolved | | fgiunchedi | T205873 Investigate Kafka main cluster usage for logging pipeline
Resolved | | fgiunchedi | T203169 Logstash hardware expansion
Resolved | | herron | T211065 rack/setup/install codfw logstash elasticsearch storage servers
Resolved | | herron | T211859 cronspam from elasticsearch-curator on stretch
Event Timeline
At yesterday's monitoring/logging meeting we discussed this and concluded that, for good hygiene and decoupling, it makes sense to spin up a new Kafka cluster for logging purposes. What's left to decide is which hardware we're going to run Kafka on, which in turn boils down to a budget question; see also T203169: Logstash hardware expansion.
While the optimal approach would be to dedicate new hardware to Kafka in both eqiad and codfw, after a few conversations within the infra team it sounds like a realistic way to support the Q2 goal T205849 is to do this in phases. In terms of hardware, these would look like:
Phase 1 - Deploy logging Kafka onto the same hardware as the logstash elasticsearch in eqiad and codfw (3 hosts per site).
Phase 2 - Procure dedicated hardware (3 additional hosts per site) for logging Kafka, and migrate the Kafka service from the elasticsearch hosts to that dedicated hardware.
Looks like we have a way forward! Resolving in favor of T206454: Setup Kafka cluster, producers and consumers for logging pipeline to track the actual Kafka setup work.