Repartition [eqiad|codfw].cirrussearch.update_pipeline.update.v1 topics in kafka-main@[eqiad|codfw]
Closed, Resolved · Public

Description

We are going to move away from the *.cirrussearch.update_pipeline.update.rc0 topics that were useful as part of the development of the Search Update Pipeline.
Schema, stream and topic names should all refer to a stable version now that the pipeline is running in production for all wikis.
For this we need the two new v1 topics in each kafka-main cluster to be properly partitioned:

  • eqiad.cirrussearch.update_pipeline.update.v1: 5 partitions
  • codfw.cirrussearch.update_pipeline.update.v1: 5 partitions

Open question: we ask for 5 partitions because this is what we had for the rc0 topics. It was most likely set this way to spread the data across all 5 Kafka nodes, but on the consumer side we found 5 to be less than ideal: spreading the workload evenly requires at least 5 consumers (a Flink parallelism of 5, which is far more than we actually need). Would it make sense to reconsider this number, avoid a prime, and possibly use 6 (or more?) partitions to allow for more flexibility? Let's revisit this in a separate task to avoid confusion.
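
To make the prime-number concern concrete, here is a minimal sketch of how partitions would split across consumers. It assumes partitions are dealt out as evenly as possible across parallel consumers, which is an assumption about the consumer's assignment strategy and not something stated in this task; the function names are hypothetical.

```python
def partitions_per_consumer(num_partitions: int, parallelism: int) -> list[int]:
    """Number of partitions each consumer gets under an even-as-possible split."""
    base, extra = divmod(num_partitions, parallelism)
    # The first `extra` consumers each take one additional partition.
    return [base + 1 if i < extra else base for i in range(parallelism)]


def balanced_parallelisms(num_partitions: int) -> list[int]:
    """Parallelism values at which every consumer reads the same number of partitions."""
    return [p for p in range(1, num_partitions + 1) if num_partitions % p == 0]


for n in (5, 6):
    print(n, "partitions balance evenly at parallelism", balanced_parallelisms(n))

print(partitions_per_consumer(5, 2))  # [3, 2]: one consumer reads 50% more partitions
print(partitions_per_consumer(6, 2))  # [3, 3]
```

Under that assumption, 5 partitions only balance at a parallelism of 1 or 5, while 6 partitions also balance at 2 and 3.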

Numbers:

  • the volume is expected to be exactly the same as for the rc0 topics: the topic in the active DC is expected to be around 120 GB (incl. replication) and to receive between 250 and 500 events/sec in normal conditions, with surges to 1k events/sec when backfilling after an outage.
  • we won't duplicate the data between rc0 and v1 topics so no additional space will be required for the transition
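
As a rough back-of-the-envelope check, the rates above can be translated into a per-partition load. This sketch assumes events are spread uniformly across the 5 partitions, which the task does not state; the labels and function name are illustrative only.

```python
PARTITIONS = 5  # partition count requested in this task


def per_partition_rate(events_per_sec: float, partitions: int = PARTITIONS) -> float:
    """Average events/sec per partition, assuming a uniform spread across partitions."""
    return events_per_sec / partitions


# Topic-wide rates quoted in the task description.
for label, rate in (("normal (low)", 250), ("normal (high)", 500), ("backfill surge", 1000)):
    print(f"{label}: {per_partition_rate(rate):.0f} events/sec per partition")
```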

AC:

  • Determine if 5 partitions is still what we want
    • yes we might want to keep 5 for now and reconsider this as a separate task
  • [eqiad|codfw].cirrussearch.update_pipeline.update.v1 topics are properly partitioned in kafka-main[eqiad|codfw]

Event Timeline

Tagging serviceops for visibility and in case they have opinions on the partitioning scheme.

In the interest of time, I'd suggest keeping 5 partitions for now.

I can take care of this if ServiceOps agree with the partitioning (as this is kafka-main).

AIUI you will be moving events from rc0 to v1 topics, so there is no extra space used and there is no increase in messages (as the ones going to v1 will no longer go to rc0) right? If all that is true, I don't think we have anything to object.

@JMeybohm thanks for taking a look! Yes, this is correct: we won't duplicate any data. The consumers might consume from both topics during the transition, but the producers will only produce to one of them.

@JMeybohm would you be ok with me provisioning the topics?

Sure, go ahead!

brouberol claimed this task.
brouberol@kafka-main1006:~$ kafka topics --topic eqiad.cirrussearch.update_pipeline.update.v1 --partitions 5 --alter
kafka-topics --zookeeper conf1007.eqiad.wmnet,conf1008.eqiad.wmnet,conf1009.eqiad.wmnet/kafka/main-eqiad --topic eqiad.cirrussearch.update_pipeline.update.v1 --partitions 5 --alter
WARNING: If partitions are increased for a topic that has a key, the partition logic or ordering of the messages will be affected
Adding partitions succeeded!
brouberol@kafka-main1006:~$ kafka topics --topic codfw.cirrussearch.update_pipeline.update.v1 --partitions 5 --alter
kafka-topics --zookeeper conf1007.eqiad.wmnet,conf1008.eqiad.wmnet,conf1009.eqiad.wmnet/kafka/main-eqiad --topic codfw.cirrussearch.update_pipeline.update.v1 --partitions 5 --alter
WARNING: If partitions are increased for a topic that has a key, the partition logic or ordering of the messages will be affected
Adding partitions succeeded!
brouberol@kafka-main1006:~$ kafka topics --topic eqiad.cirrussearch.update_pipeline.update.v1 --describe
kafka-topics --zookeeper conf1007.eqiad.wmnet,conf1008.eqiad.wmnet,conf1009.eqiad.wmnet/kafka/main-eqiad --topic eqiad.cirrussearch.update_pipeline.update.v1 --describe
Topic:eqiad.cirrussearch.update_pipeline.update.v1	PartitionCount:5	ReplicationFactor:3	Configs:
	Topic: eqiad.cirrussearch.update_pipeline.update.v1	Partition: 0	Leader: 1004	Replicas: 1004,1001,1002	Isr: 1004,1001,1002
	Topic: eqiad.cirrussearch.update_pipeline.update.v1	Partition: 1	Leader: 1005	Replicas: 1005,1001,1002	Isr: 1005,1001,1002
	Topic: eqiad.cirrussearch.update_pipeline.update.v1	Partition: 2	Leader: 1001	Replicas: 1001,1002,1003	Isr: 1001,1002,1003
	Topic: eqiad.cirrussearch.update_pipeline.update.v1	Partition: 3	Leader: 1002	Replicas: 1002,1003,1004	Isr: 1002,1003,1004
	Topic: eqiad.cirrussearch.update_pipeline.update.v1	Partition: 4	Leader: 1003	Replicas: 1003,1004,1001	Isr: 1003,1004,1001
brouberol@kafka-main1006:~$ kafka topics --topic codfw.cirrussearch.update_pipeline.update.v1 --describe
kafka-topics --zookeeper conf1007.eqiad.wmnet,conf1008.eqiad.wmnet,conf1009.eqiad.wmnet/kafka/main-eqiad --topic codfw.cirrussearch.update_pipeline.update.v1 --describe
Topic:codfw.cirrussearch.update_pipeline.update.v1	PartitionCount:5	ReplicationFactor:3	Configs:
	Topic: codfw.cirrussearch.update_pipeline.update.v1	Partition: 0	Leader: 1002	Replicas: 1002,1003,1004	Isr: 1002,1003,1004
	Topic: codfw.cirrussearch.update_pipeline.update.v1	Partition: 1	Leader: 1003	Replicas: 1003,1004,1001	Isr: 1003,1004,1001
	Topic: codfw.cirrussearch.update_pipeline.update.v1	Partition: 2	Leader: 1004	Replicas: 1004,1001,1002	Isr: 1004,1001,1002
	Topic: codfw.cirrussearch.update_pipeline.update.v1	Partition: 3	Leader: 1005	Replicas: 1005,1001,1002	Isr: 1005,1001,1002
	Topic: codfw.cirrussearch.update_pipeline.update.v1	Partition: 4	Leader: 1001	Replicas: 1001,1002,1003	Isr: 1001,1002,1003

And the same in kafka-main-codfw:

brouberol@kafka-main2006:~$ kafka topics --topic eqiad.cirrussearch.update_pipeline.update.v1 --partitions 5 --alter
kafka-topics --zookeeper conf2004.codfw.wmnet,conf2005.codfw.wmnet,conf2006.codfw.wmnet/kafka/main-codfw --topic eqiad.cirrussearch.update_pipeline.update.v1 --partitions 5 --alter
WARNING: If partitions are increased for a topic that has a key, the partition logic or ordering of the messages will be affected
Adding partitions succeeded!
brouberol@kafka-main2006:~$ kafka topics --topic eqiad.cirrussearch.update_pipeline.update.v1 --describe
kafka-topics --zookeeper conf2004.codfw.wmnet,conf2005.codfw.wmnet,conf2006.codfw.wmnet/kafka/main-codfw --topic eqiad.cirrussearch.update_pipeline.update.v1 --describe
Topic:eqiad.cirrussearch.update_pipeline.update.v1	PartitionCount:5	ReplicationFactor:3	Configs:
	Topic: eqiad.cirrussearch.update_pipeline.update.v1	Partition: 0	Leader: 2001	Replicas: 2001,2002,2003	Isr: 2001,2002,2003
	Topic: eqiad.cirrussearch.update_pipeline.update.v1	Partition: 1	Leader: 2002	Replicas: 2002,2003,2004	Isr: 2002,2003,2004
	Topic: eqiad.cirrussearch.update_pipeline.update.v1	Partition: 2	Leader: 2003	Replicas: 2003,2004,2001	Isr: 2003,2004,2001
	Topic: eqiad.cirrussearch.update_pipeline.update.v1	Partition: 3	Leader: 2004	Replicas: 2004,2001,2002	Isr: 2004,2001,2002
	Topic: eqiad.cirrussearch.update_pipeline.update.v1	Partition: 4	Leader: 2005	Replicas: 2005,2001,2002	Isr: 2005,2001,2002
brouberol@kafka-main2006:~$ kafka topics --topic codfw.cirrussearch.update_pipeline.update.v1 --partitions 5 --alter
kafka-topics --zookeeper conf2004.codfw.wmnet,conf2005.codfw.wmnet,conf2006.codfw.wmnet/kafka/main-codfw --topic codfw.cirrussearch.update_pipeline.update.v1 --partitions 5 --alter
WARNING: If partitions are increased for a topic that has a key, the partition logic or ordering of the messages will be affected
Adding partitions succeeded!
brouberol@kafka-main2006:~$ kafka topics --topic codfw.cirrussearch.update_pipeline.update.v1 --describe
kafka-topics --zookeeper conf2004.codfw.wmnet,conf2005.codfw.wmnet,conf2006.codfw.wmnet/kafka/main-codfw --topic codfw.cirrussearch.update_pipeline.update.v1 --describe
Topic:codfw.cirrussearch.update_pipeline.update.v1	PartitionCount:5	ReplicationFactor:3	Configs:
	Topic: codfw.cirrussearch.update_pipeline.update.v1	Partition: 0	Leader: 2002	Replicas: 2002,2003,2004	Isr: 2002,2003,2004
	Topic: codfw.cirrussearch.update_pipeline.update.v1	Partition: 1	Leader: 2003	Replicas: 2003,2004,2001	Isr: 2003,2004,2001
	Topic: codfw.cirrussearch.update_pipeline.update.v1	Partition: 2	Leader: 2004	Replicas: 2004,2001,2002	Isr: 2004,2001,2002
	Topic: codfw.cirrussearch.update_pipeline.update.v1	Partition: 3	Leader: 2005	Replicas: 2005,2001,2002	Isr: 2005,2001,2002
	Topic: codfw.cirrussearch.update_pipeline.update.v1	Partition: 4	Leader: 2001	Replicas: 2001,2002,2003	Isr: 2001,2002,2003
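
The WARNING printed by each `--alter` matters only for keyed messages that were already produced before the change. A minimal sketch of why: with hash-based partitioning, the partition is the key's hash modulo the partition count, so changing the count changes where a given key lands. Whether these topics are keyed is not stated in this task; `zlib.crc32` is only a stand-in hash (Kafka's Java default partitioner uses murmur2), and the keys are hypothetical.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition as hash(key) mod N (CRC32 used as a stand-in hash)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


keys = [f"page-{i}" for i in range(1000)]  # hypothetical message keys

# Count keys whose partition changes when a topic grows from 1 to 5 partitions.
moved = sum(1 for k in keys if partition_for(k, 1) != partition_for(k, 5))
print(f"{moved} of {len(keys)} keys map to a different partition after 1 -> 5")
```

Messages with the same key written before and after the change can therefore sit in different partitions, which is where the per-key ordering guarantee is lost.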

@brouberol thanks! One thing I forgot to mention, and I'm unsure whether it matters: I think these topics are replicated to kafka-jumbo. Do you think we should repartition them there too?

Hmm, as we're using MirrorMaker v1, I'm not sure partition increases are replicated. Let me check, and adjust if required.

Good call! They were not.

brouberol@kafka-jumbo1014:~$ kafka topics --topic eqiad.cirrussearch.update_pipeline.update.v1 --describe
kafka-topics --zookeeper conf1007.eqiad.wmnet,conf1008.eqiad.wmnet,conf1009.eqiad.wmnet/kafka/jumbo-eqiad --topic eqiad.cirrussearch.update_pipeline.update.v1 --describe
Topic:eqiad.cirrussearch.update_pipeline.update.v1	PartitionCount:1	ReplicationFactor:3	Configs:
	Topic: eqiad.cirrussearch.update_pipeline.update.v1	Partition: 0	Leader: 1011	Replicas: 1011,1008,1013	Isr: 1011,1008,1013
brouberol@kafka-jumbo1014:~$ kafka topics --topic codfw.cirrussearch.update_pipeline.update.v1 --describe
kafka-topics --zookeeper conf1007.eqiad.wmnet,conf1008.eqiad.wmnet,conf1009.eqiad.wmnet/kafka/jumbo-eqiad --topic codfw.cirrussearch.update_pipeline.update.v1 --describe
Topic:codfw.cirrussearch.update_pipeline.update.v1	PartitionCount:1	ReplicationFactor:3	Configs:
	Topic: codfw.cirrussearch.update_pipeline.update.v1	Partition: 0	Leader: 1009	Replicas: 1009,1011,1014	Isr: 1009,1011,1014
brouberol@kafka-jumbo1014:~$ kafka topics --topic codfw.cirrussearch.update_pipeline.update.v1 --partitions 5 -- alter
brouberol@kafka-jumbo1014:~$ kafka topics --topic codfw.cirrussearch.update_pipeline.update.v1 --partitions 5 --alter
kafka-topics --zookeeper conf1007.eqiad.wmnet,conf1008.eqiad.wmnet,conf1009.eqiad.wmnet/kafka/jumbo-eqiad --topic codfw.cirrussearch.update_pipeline.update.v1 --partitions 5 --alter
WARNING: If partitions are increased for a topic that has a key, the partition logic or ordering of the messages will be affected
Adding partitions succeeded!
brouberol@kafka-jumbo1014:~$ kafka topics --topic eqiad.cirrussearch.update_pipeline.update.v1 --partitions 5 --alter
kafka-topics --zookeeper conf1007.eqiad.wmnet,conf1008.eqiad.wmnet,conf1009.eqiad.wmnet/kafka/jumbo-eqiad --topic eqiad.cirrussearch.update_pipeline.update.v1 --partitions 5 --alter
WARNING: If partitions are increased for a topic that has a key, the partition logic or ordering of the messages will be affected
Adding partitions succeeded!
brouberol@kafka-jumbo1014:~$
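
A mismatch like the one found here (5 partitions in kafka-main, 1 in kafka-jumbo before the fix) could be caught by comparing the `PartitionCount` field of the `--describe` summary line from each cluster. The sketch below only parses text already captured from the transcripts above; it does not call any Kafka tooling, and the function name and dictionary keys are hypothetical.

```python
import re


def partition_count(describe_output: str) -> int:
    """Extract PartitionCount from the summary line of `--describe` output."""
    match = re.search(r"PartitionCount:\s*(\d+)", describe_output)
    if match is None:
        raise ValueError("no PartitionCount field found")
    return int(match.group(1))


# Summary lines as they appear in the transcripts (kafka-main after the alter,
# kafka-jumbo before it).
main_eqiad = "Topic:eqiad.cirrussearch.update_pipeline.update.v1\tPartitionCount:5\tReplicationFactor:3\tConfigs:"
jumbo_before = "Topic:eqiad.cirrussearch.update_pipeline.update.v1\tPartitionCount:1\tReplicationFactor:3\tConfigs:"

counts = {
    "kafka-main": partition_count(main_eqiad),
    "kafka-jumbo": partition_count(jumbo_before),
}
if len(set(counts.values())) > 1:
    print("partition count mismatch:", counts)
```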