
Consider increasing kafka logging topic partitions
Closed, Resolved · Public

Description

At the moment the kafka logging topics all have one partition and three replicas. This configuration works, but a single partition has the potential to overwhelm a broker, especially as we onboard more producers.

We should consider increasing the partition count to spread produce/consume load among the available brokers.

Event Timeline

Restricted Application added a subscriber: Aklapper. · Jan 7 2019, 2:29 PM
herron added a subscriber: herron. · Jan 11 2019, 3:53 PM

Change 484226 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] hieradata: increase default kafka partitions for logging cluster

https://gerrit.wikimedia.org/r/484226

To change existing topics, the following commands must be run per topic (one per cluster):

kafka-topics --alter --zookeeper conf1004.eqiad.wmnet:2181/kafka/logging-eqiad --topic TOPIC --partitions 6
kafka-topics --alter --zookeeper conf2001.codfw.wmnet:2181/kafka/logging-codfw --topic TOPIC --partitions 6
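
For example, the per-topic commands could be wrapped in a small shell loop (a sketch; the topic names below are hypothetical placeholders and would need to be replaced with the cluster's actual topic list):

# Sketch: bump partitions for each topic in both logging clusters.
# Topic names here are placeholders, not the actual cluster topics.
for topic in udp_localhost-info udp_localhost-warning udp_localhost-err; do
  kafka-topics --alter --zookeeper conf1004.eqiad.wmnet:2181/kafka/logging-eqiad --topic "$topic" --partitions 6
  kafka-topics --alter --zookeeper conf2001.codfw.wmnet:2181/kafka/logging-codfw --topic "$topic" --partitions 6
done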

Hm. In Kafka main clusters, we handle bursts of around 1000 msgs/sec per single-partition topic.

If you add more partitions, you'll lose some ordering guarantees. Q: Do you currently use a partition key for these messages? I wonder if adding a key by hostname would be helpful here. It'd allow any given consumer to always receive messages from the same host in order, which might help with ordering and debugging stuff downstream?

Or, maybe it doesn't matter! :)
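
For illustration, producing a hostname-keyed message with kafkacat might look like the sketch below (BROKER and TOPIC are placeholders; the actual producer in this pipeline is rsyslog):

# Sketch: -K sets the key/value delimiter, so the part before ':' becomes
# the message key; librdkafka's consistent_random partitioner hashes the
# key, keeping all of one host's messages on a single partition.
printf '%s:test message\n' "$(hostname -f)" | kafkacat -P -b BROKER:9092 -t TOPIC -K :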

fgiunchedi updated the task description. · Jan 15 2019, 9:23 AM

Hm. In Kafka main clusters, we handle bursts of around 1000 msgs/sec per single-partition topic.

Thanks, that's useful to know! Once mediawiki logging is fully switched over I'm expecting a baseline of ~750 msg/s, with bursts well above that under exceptional conditions (e.g. db unavailable), hence the idea to bump partitions.

If you add more partitions, you'll lose some ordering guarantees. Q: Do you currently use a partition key for these messages? I wonder if adding a key by hostname would be helpful here. It'd allow any given consumer to always receive messages from the same host in order, which might help with ordering and debugging stuff downstream?

Good point! We're using no key, and librdkafka defaults to the consistent_random partitioner, so partition assignment is effectively random. For the current consumer use case (i.e. three logstash consumers reading from kafka and writing to elasticsearch) I don't think per-host ordering is needed right away, though we should perhaps consider it when adding new consumers (e.g. kafkatee).
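
One way to verify how the NULL-key randomness spreads messages (a sketch with kafkacat; BROKER and TOPIC are placeholders) is to consume a recent window and tally partitions:

# Sketch: read the last 1000 messages of each partition, print only the
# partition number (%p), and count how many came from each.
kafkacat -C -b BROKER:9092 -t TOPIC -o -1000 -e -f '%p\n' | sort | uniq -c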

Change 484226 merged by Filippo Giunchedi:
[operations/puppet@production] hieradata: increase default kafka partitions for logging cluster

https://gerrit.wikimedia.org/r/484226

Change 484440 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] profile: introduce kafka::broker num_partitions

https://gerrit.wikimedia.org/r/484440

Change 484440 merged by Filippo Giunchedi:
[operations/puppet@production] profile: introduce kafka::broker num_partitions

https://gerrit.wikimedia.org/r/484440

Mentioned in SAL (#wikimedia-operations) [2019-01-15T16:50:35Z] <godog> roll-restart kafka-logging in eqiad to apply new topic defaults - T213081

Mentioned in SAL (#wikimedia-operations) [2019-01-15T17:13:36Z] <godog> set partitions to 3 for existing kafka-logging topics - T213081

Mentioned in SAL (#wikimedia-operations) [2019-01-15T17:21:21Z] <godog> depool logstash1007 before restarting logstash - T213081

Mentioned in SAL (#wikimedia-operations) [2019-01-15T17:26:24Z] <godog> roll-restart logstash in eqiad - T213081

fgiunchedi closed this task as Resolved. · Jan 16 2019, 9:07 AM
fgiunchedi claimed this task.

All topics have been switched to three partitions now. Load is still imbalanced, with logstash1004 getting the biggest chunk of traffic, presumably due to how the randomness on the NULL key works out. We should also consider specifying the key in the rsyslog configuration in the future.
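
A quick way to quantify the imbalance (a sketch; BROKER is a placeholder for one of the kafka-logging brokers) is to compare end offsets across partitions with the stock Kafka tooling:

# Sketch: print topic:partition:latest-offset for each partition; a partition
# with a much larger offset is receiving the bulk of the messages.
kafka-run-class kafka.tools.GetOffsetShell --broker-list BROKER:9092 --topic TOPIC --time -1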