⚓ T255973 Balance Kafka topic partitions on Kafka Jumbo to take advantage of the new brokers

elukey created this task.Jun 22 2020, 8:49 AM

Restricted Application added a subscriber: Aklapper.Jun 22 2020, 8:49 AM

elukey moved this task from Backlog to Q1 2021/2022 on the Analytics-Clusters board.Jun 22 2020, 8:49 AM

Milimetric moved this task from Incoming to Operational Excellence on the Analytics board.Jun 22 2020, 4:15 PM

Aklapper removed a project: Analytics.Jul 4 2020, 7:59 AM

Ottomata assigned this task to • razzi.Oct 13 2020, 7:35 PM

We should probably get T236327: replace onboard NIC in kafka-jumbo100[1-6] done before we do this.

Ottomata mentioned this in T267561: Beta needs to be upgraded to Varnish 6.Nov 10 2020, 6:42 PM

OO, @razzi this might be useful: https://github.com/DataDog/kafka-kit/tree/master/cmd/topicmappr

herron subscribed.Nov 16 2020, 4:31 PM

I tried the tool you mentioned, @Ottomata, by copying the self-contained binary onto kafka-jumbo. Passing --brokers -2 applies the rebuild to all brokers in the cluster.

razzi@kafka-jumbo1002:~$ ./topicmappr rebuild --force-rebuild --brokers -2 --topics webrequest_text --zk-addr conf1004.eqiad.wmnet --zk-prefix kafka/jumbo-eqiad

Topics:
  webrequest_text

Broker change summary:
  New broker 1007
  New broker 1008
  New broker 1009
  -
  Replacing 0, added 3, missing 0, total count changed by 3

Action:
  Expanding/rebalancing topic with 3 additional broker(s) (this is a no-op unless --force-rebuild is specified)
  Force rebuilding map

Partition map changes:
  webrequest_text p0: [1006 1005 1001] -> [1002 1004 1009] replaced broker
  webrequest_text p1: [1002 1003 1004] -> [1007 1008 1001] replaced broker
  webrequest_text p2: [1005 1003 1006] -> [1001 1006 1005] replaced broker
  webrequest_text p3: [1001 1004 1006] -> [1009 1004 1003] replaced broker
  webrequest_text p4: [1003 1006 1002] -> [1005 1009 1002] replaced broker
  webrequest_text p5: [1004 1002 1003] -> [1003 1001 1008] replaced broker
  webrequest_text p6: [1006 1002 1005] -> [1004 1006 1003] replaced broker
  webrequest_text p7: [1002 1005 1003] -> [1006 1003 1007] replaced broker
  webrequest_text p8: [1005 1001 1003] -> [1008 1007 1001] replaced broker
  webrequest_text p9: [1001 1003 1004] -> [1002 1008 1004] replaced broker
  webrequest_text p10: [1003 1004 1006] -> [1007 1002 1006] replaced broker
  webrequest_text p11: [1004 1006 1002] -> [1001 1005 1008] replaced broker
  webrequest_text p12: [1006 1004 1002] -> [1009 1004 1002] replaced broker
  webrequest_text p13: [1002 1006 1005] -> [1005 1002 1009] replaced broker
  webrequest_text p14: [1005 1002 1003] -> [1003 1006 1005] replaced broker
  webrequest_text p15: [1001 1005 1003] -> [1004 1001 1008] replaced broker
  webrequest_text p16: [1003 1001 1004] -> [1006 1003 1005] replaced broker
  webrequest_text p17: [1004 1003 1006] -> [1008 1007 1001] replaced broker
  webrequest_text p18: [1006 1003 1004] -> [1002 1005 1009] replaced broker
  webrequest_text p19: [1002 1004 1006] -> [1007 1009 1003] replaced broker
  webrequest_text p20: [1005 1006 1002] -> [1001 1008 1004] replaced broker
  webrequest_text p21: [1001 1005 1003] -> [1009 1004 1002] replaced broker
  webrequest_text p22: [1003 1005 1001] -> [1005 1006 1001] replaced broker
  webrequest_text p23: [1004 1001 1003] -> [1003 1007 1006] replaced broker

Broker distribution:
  degree [min/max/avg]: 4/5/4.33 -> 5/7/6.00
  -
  Broker 1001 - leader: 3, follower: 6, total: 9
  Broker 1002 - leader: 3, follower: 5, total: 8
  Broker 1003 - leader: 3, follower: 5, total: 8
  Broker 1004 - leader: 2, follower: 6, total: 8
  Broker 1005 - leader: 3, follower: 5, total: 8
  Broker 1006 - leader: 2, follower: 6, total: 8
  Broker 1007 - leader: 3, follower: 4, total: 7
  Broker 1008 - leader: 2, follower: 6, total: 8
  Broker 1009 - leader: 3, follower: 5, total: 8

WARN:
  [none]

New partition maps:
  webrequest_text.json

So it looks like it comes up with a workable redistribution, and it's nice to have it enumerate all the changes. However, some reassignments don't share anything with the previous state:

webrequest_text p1: [1002 1003 1004] -> [1007 1008 1001] replaced broker

so it'd have to copy all the data to the new nodes, potentially even interrupting processing for any clients connected to the old nodes (I'm not sure if this is how clients work).
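A quick way to list those full-copy moves is to parse the "Partition map changes" lines and flag any partition whose old and new replica sets have no broker in common. This is only a sketch: the regex assumes the line format pasted above, and disjoint_moves is a made-up helper name, not part of topicmappr.

```python
import re

# Matches topicmappr "Partition map changes" lines such as:
#   webrequest_text p1: [1002 1003 1004] -> [1007 1008 1001] replaced broker
# (format assumed from the output pasted in this task)
LINE_RE = re.compile(r"^\s*(\S+) p(\d+): \[([\d ]+)\] -> \[([\d ]+)\]")


def disjoint_moves(output):
    """Return (topic, partition) pairs whose old and new replicas share no broker."""
    moves = []
    for line in output.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skip headers, blank lines, and anything else
        topic, partition, old, new = match.groups()
        if not set(old.split()) & set(new.split()):
            moves.append((topic, int(partition)))
    return moves


sample = """\
  webrequest_text p0: [1006 1005 1001] -> [1002 1004 1009] replaced broker
  webrequest_text p1: [1002 1003 1004] -> [1007 1008 1001] replaced broker
  webrequest_text p2: [1005 1003 1006] -> [1001 1006 1005] replaced broker
"""
print(disjoint_moves(sample))
```

In this sample p0 and p1 are flagged, while p2 keeps 1005 and 1006 and so only needs one new copy.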

Wow that is very cool!

some reassignments don't share anything with the previous state:

Hm, I guess that's ok, data will be blasting around all over the place anyway.

potentially even interrupting processing for any clients connected to the old nodes

Naw, it should be fine. The replica mover just temporarily adds new replicas, so for a short time period this partition would have 6 replicas instead of 3. Once all the replicas are in sync, it will drop the original replicas, causing only the 3 new replicas to be in sync. Since the preferred leader replica will have changed, this should trigger a leader election and cause connected clients to reconnect to the new leader, in this case 1007.
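To make the phases concrete, here is a toy model of the replica list for p1 as described above. It is not how Kafka implements reassignment, just the before/during/after states; reassignment_phases is a hypothetical name, and the only Kafka fact it relies on is that the preferred leader is the first broker in the replica list.

```python
def reassignment_phases(old, new):
    """Model the replica list of one partition during a reassignment."""
    # While data is copied, the new replicas exist alongside the old ones.
    during = list(old) + [broker for broker in new if broker not in old]
    # Once the new replicas are in sync, only the target list remains.
    after = list(new)
    return {
        "before": list(old),
        "during": during,
        "after": after,
        # The preferred leader is the first replica in the list.
        "preferred_leader": after[0],
    }


phases = reassignment_phases([1002, 1003, 1004], [1007, 1008, 1001])
print(phases["during"])            # six replicas while the copy is in flight
print(phases["preferred_leader"])  # 1007
```

When the old and new lists overlap, the "during" list is shorter than six, since shared brokers are not added twice.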

• razzi mentioned this in T268074: Create kafka test cluster.Nov 17 2020, 8:43 PM

Ottomata mentioned this in T268202: Eq: 5 VM request for kafka-test-eqiad cluster.Nov 19 2020, 2:19 PM

Ottomata moved this task from Q1 2021/2022 to Q3 2020/2021 on the Analytics-Clusters board.Dec 14 2020, 4:45 PM

@Ottomata when I tested topicmappr before, I uploaded the binary directly onto the host; when we do this in production, will it make sense to debianize https://github.com/DataDog/kafka-kit/?

Hm, topicmappr just helps in generating the reassignment.json file, right? I think we can use it as a one off tool to generate the reassignment.json file without debianizing it.

Migration plan for partition rebalancing

Goal: all topics with > ~10 messages / second to have their partitions redistributed to include kafka-jumbo1007-1009.

This will be done in 3 parts: low traffic (< 100 m / s), medium traffic (100 - 1000 m / s), and high traffic (> 1000 m / s). This comment has only low traffic; whether medium and high traffic follow the same pattern will depend on how that goes.
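The three buckets can be written down as a small function. This is a sketch only: traffic_tier is a hypothetical helper, and treating exactly 100 and exactly 1000 m / s as "medium" is my reading of the ranges above.

```python
def traffic_tier(messages_per_second):
    """Bucket a topic by its message rate, using the thresholds from the plan."""
    if messages_per_second < 100:
        return "low"
    if messages_per_second <= 1000:
        return "medium"
    return "high"


# eqiad.mediawiki.revision-create at roughly 16.7 messages / second
print(traffic_tier(16.686670667466828))  # low
```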

To view kafka-jumbo topics sorted by traffic:

https://thanos.wikimedia.org/graph?g0.range_input=1h&g0.max_source_resolution=0s&g0.expr=sort_desc(topk(300%2C%20sum(irate(kafka_server_BrokerTopicMetrics_MessagesIn_total%7Bkafka_cluster%3D%22jumbo-eqiad%22%2C%20topic%3D~%22.%2B%22%7D%5B5m%5D))%20by%20(topic)))&g0.tab=1

Metrics to watch:

Is the producer buffer filling up? Maxes out at 720k, after which messages will be dropped.

https://grafana.wikimedia.org/d/000000253/varnishkafka?viewPanel=28&orgId=1

Is IO slowed down?

https://grafana.wikimedia.org/d/000000027/kafka?viewPanel=22&orgId=1

Part I: low-traffic topics

Low-traffic topic with 1 partition

eqiad.mediawiki.revision-create 16.686670667466828 messages / second

Contents of eqiad.mediawiki.revision-create.json

{"version":1,"partitions":[{"topic":"eqiad.mediawiki.revision-create","partition":0,"replicas":[1002,1007,1009]}]}

Apply with:

kafka-reassign-partitions --zookeeper conf1004.eqiad.wmnet,conf1005.eqiad.wmnet,conf1006.eqiad.wmnet/kafka/jumbo-eqiad --reassignment-json-file eqiad.mediawiki.revision-create.json --execute --throttle 10000000

Low traffic topic with 3 partitions

eqiad.resource_change 37.93333333333334 messages / second

Original json which migrates all partitions:

{"version":1,"partitions":[{"topic":"eqiad.resource_change","partition":0,"replicas":[1002,1009,1004]},{"topic":"eqiad.resource_change","partition":1,"replicas":[1007,1006,1002]},{"topic":"eqiad.resource_change","partition":2,"replicas":[1001,1003,1005]}]}

First, only migrating partition 0:

Contents of eqiad.resource_change.part1.json

{"version":1,"partitions":[{"topic":"eqiad.resource_change","partition":0,"replicas":[1002,1009,1004]}]}

kafka-reassign-partitions --zookeeper conf1004.eqiad.wmnet,conf1005.eqiad.wmnet,conf1006.eqiad.wmnet/kafka/jumbo-eqiad --reassignment-json-file eqiad.resource_change.part1.json --execute --throttle 10000000

Then migrating other 2:

Contents of eqiad.resource_change.part2.json

{"version":1,"partitions":[{"topic":"eqiad.resource_change","partition":1,"replicas":[1007,1006,1002]},{"topic":"eqiad.resource_change","partition":2,"replicas":[1001,1003,1005]}]}

kafka-reassign-partitions --zookeeper conf1004.eqiad.wmnet,conf1005.eqiad.wmnet,conf1006.eqiad.wmnet/kafka/jumbo-eqiad --reassignment-json-file eqiad.resource_change.part2.json --execute --throttle 10000000

Other low-traffic topics

Continue as above for the following low-traffic topics:

Name	messages / second
eqiad.wdqs-internal.sparql-query	90.78333333333333
eqiad.wdqs-external.sparql-query	82.43333333333334
eqiad.mediawiki.job.wikibase-addUsagesForPage	65.6
eqiad.mediawiki.job.htmlCacheUpdate	56.733333333333334
eqiad.mediawiki.job.cdnPurge	56.15
statsv	42.93333333333333
codfw.mediawiki.client.session_tick	31.333333333333332
codfw.mediawiki.api-request	28.91754350870174
eventlogging_PaintTiming	28.516666666666666
eventlogging_DesktopWebUIActionsTracking	28.416666666666668
eqiad.mediawiki.job.LocalGlobalUserPageCacheUpdateJob	26.03854104154164
eqiad.mediawiki.recentchange	25.466666666666665
eqiad.mediawiki.job.recentChangesUpdate	22.604520904180838
eventlogging_NavigationTiming	20.437420817496832
eventlogging_LayoutShift	19.783333333333335
eqiad.mediawiki.page-links-change	16.133333333333333
eventlogging_ResourceTiming	14.35
eqiad.mediawiki.revision-score	14.1
eventlogging_CpuBenchmark	13.45
eventlogging_InukaPageView	13.416666666666666
eventlogging_RUMSpeedIndex	13.4
eventlogging_MobileWikiAppDailyStats	12.816666666666666
eqiad.mediawiki.job.ORESFetchScoreJob	12.016666666666667
eventlogging_QuickSurveyInitiation	12
eventlogging_EditAttemptStep	10.816666666666666
eventlogging_CodeMirrorUsage	9.95
wdqs_streaming_updater_test	9.733333333333333
codfw.wdqs-external.sparql-query	8.866666666666667

Json files for each have been written to my home directory on kafka-jumbo1002 via the following command:

./topicmappr rebuild --force-rebuild --brokers -2 --topics '^eqiad\.wdqs-internal\.sparql-query$|^eqiad\.wdqs-external\.sparql-query$|^eqiad\.mediawiki\.job\.wikibase-addUsagesForPage$|^eqiad\.mediawiki\.job\.htmlCacheUpdate$|^eqiad\.mediawiki\.job\.cdnPurge$|^statsv$|^codfw\.mediawiki\.client\.session_tick$|^codfw\.mediawiki\.api-request$|^eventlogging_PaintTiming$|^eventlogging_DesktopWebUIActionsTracking$|^eqiad\.mediawiki\.job\.LocalGlobalUserPageCacheUpdateJob$|^eqiad\.mediawiki\.recentchange$|^eqiad\.mediawiki\.job\.recentChangesUpdate$|^eventlogging_NavigationTiming$|^eventlogging_LayoutShift$|^eqiad\.mediawiki\.page-links-change$|^eventlogging_ResourceTiming$|^eqiad\.mediawiki\.revision-score$|^eventlogging_CpuBenchmark$|^eventlogging_InukaPageView$|^eventlogging_RUMSpeedIndex$|^eventlogging_MobileWikiAppDailyStats$|^eqiad\.mediawiki\.job\.ORESFetchScoreJob$|^eventlogging_QuickSurveyInitiation$|^eventlogging_EditAttemptStep$|^eventlogging_CodeMirrorUsage$|^wdqs_streaming_updater_test$|^codfw\.wdqs-external\.sparql-query$' --zk-addr conf1004.eqiad.wmnet --zk-prefix kafka/jumbo-eqiad

Nice plan, I like the amount of details! Going to add a few suggestions/questions for you:

In the metrics to review I'd also take into consideration clients other than varnishkafka, since the low-traffic topics mentioned mostly come from MirrorMaker (so they are mirrored from the Kafka main cluster) and some of them come directly from Eventlogging (on eventlog1002). In my opinion it is very important to know (some of) the producers/consumers of each topic that we'll change, so we'll have a quick way to judge if things go sideways for any reason.

The kafka-reassign-partitions command can be rewritten as kafka reassign-partitions --reassignment-json-file eqiad.mediawiki.revision-create.json --execute --throttle 10000000. Andrew wrote a nice wrapper script that removes the need for the zookeeper boilerplate config (you can also use it for other kafka-related commands; type kafka on any kafka node to see the list of available commands). I am also curious about the --throttle 10000000: is there a specific reason for such a high value?

What is the plan if something (say a consumer) starts to error out while the topic partition moves are executing? Is there a rollback plan that we can quickly use? I am asking since, if possible, it would be nice to have it tested in kafka-test before starting the procedure :)

As first step, I'd execute the procedure on one/two topics and leave them running for a couple of hours with the new config, watching metrics in the meantime, just to make sure that nothing horrible happens on clients/producers after the maintenance (some bugs might pop up after some time, rather than immediately).

Nice stuff!

In the metrics to review I'd take into consideration clients that are not only varnishkafka

There are 4 main producers of data we should watch: varnishkafka, eventgate-analytics-external and eventgate-analytics, and eventlogging-processor on eventlog1002.

We can watch eventlogging-processor logs on eventlog1002 like:

sudo journalctl -f   -u eventlogging-processor@client-side-*.service

And eventgate-* app logs can be watched in logstash (this URL might change...today as they upgrade logstash...)

Aside from that I think we'll just have to watch the dashboards and especially pay attention to the Kafka broker iowait.

Json files for each have been written to my home directory on kafka-jumbo1002 via the following command:

OH! Cool topicmappr is nice. You've supplied all the topics but it generates individual reassignment files for each. I also just verified that the output for a given topic is the same whether or not you give it a list of topics or just one, so I assume the assignments it generates are deterministic every time.

To view kafka-jumbo topics sorted by traffic:

@razzi, if we balance an eqiad. prefixed topic, we should also balance its corresponding codfw. prefixed one. The codfw topic might not have any traffic in it now, but in the case of a datacenter switchover, it will then have the same throughput as the eqiad one.

Oh, and MirrorMaker too.
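For building the list of topics to balance, the eqiad./codfw. pairing is easy to derive from the topic name. A minimal sketch, assuming the datacenter prefix is always the leading "eqiad." or "codfw." as in the topic list above; datacenter_counterpart is a made-up helper name.

```python
def datacenter_counterpart(topic):
    """Return the same topic in the other datacenter, or None if it has no prefix."""
    for prefix, other in (("eqiad.", "codfw."), ("codfw.", "eqiad.")):
        if topic.startswith(prefix):
            return other + topic[len(prefix):]
    return None  # e.g. statsv or eventlogging_* topics


print(datacenter_counterpart("eqiad.mediawiki.revision-create"))
# codfw.mediawiki.revision-create
```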

Mentioned in SAL (#wikimedia-operations) [2021-01-19T21:46:20Z] <ottomata> wiping kafka-test cluster data and starting from scratch - T255973

With @Ottomata we came up with a way to roll back a partition migration.
When applying a migration, it prints the current state, which can be used to migrate the partitions back.
However, while a migration is running, trying to start another gives the error "There is an existing assignment running."

The way to cancel an in-progress migration is to execute the following commands:

zookeeper-shell zookeeper-test1002.eqiad.wmnet/kafka/test-eqiad
rmr /admin/reassign_partitions
rmr /kafka/test-eqiad/controller

At this point the rollback state can be applied, and since the migrations work by adding new replicas to the set of replicas, it will apply instantaneously if nodes it attempts to add are still in the In-Sync Replica (ISR) set.

Attempting to change the throttle gives the same error when there is an existing assignment running; the throttle can be removed using the following commands for each node:

kafka configs --alter --entity-type brokers --entity-name 1006 --delete-config leader.replication.throttled.rate
kafka configs --alter --entity-type brokers --entity-name 1006 --delete-config follower.replication.throttled.rate

FYI, the controller bounce idea we got from https://users.kafka.apache.narkive.com/epBsWAPC/stuck-re-balance

Migrated the following topics on kafka-jumbo:

codfw.mediawiki.revision-create
eqiad.mediawiki.revision-create

The migrations still to be run are on kafka-jumbo1002 in /home/razzi/rebalance-json.

Procedure:

To start a migration (this uses a throttle of 10 MB/s):

kafka reassign-partitions --reassignment-json-file <topic>.json --execute --throttle 10000000

This will print the current partition assignment, which can be used to roll back a migration.
Copy and paste this into ~/rebalance-json/reverts/REVERT-<topic>.json.

To check the progress, compare the topic size on disk on a node that was previously a replica with a node that is being added to be a replica. For example:

razzi@kafka-jumbo1002:~/rebalance-json$ du -sh /srv/kafka/data/eqiad.mediawiki.revision-create-0/
21G     /srv/kafka/data/eqiad.mediawiki.revision-create-0/

razzi@kafka-jumbo1009:~/rebalance-json$ du -sh /srv/kafka/data/eqiad.mediawiki.revision-create-0/
13G     /srv/kafka/data/eqiad.mediawiki.revision-create-0/
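The two du numbers give a rough progress figure and a lower bound on the remaining time. This is a back-of-the-envelope sketch with assumptions: du -h sizes are treated as GiB, the throttle is treated as the actual copy rate (it is only a cap), and the topic is treated as not growing or being trimmed by retention in the meantime. copy_progress is a hypothetical helper.

```python
GIB = 1024 ** 3  # du -h reports sizes in powers of 1024


def copy_progress(source_bytes, dest_bytes, throttle_bytes_per_s):
    """Return (fraction copied, seconds remaining at the throttle rate)."""
    fraction = min(dest_bytes / source_bytes, 1.0)
    remaining_bytes = max(source_bytes - dest_bytes, 0)
    return fraction, remaining_bytes / throttle_bytes_per_s


# 21G on the existing replica, 13G on the new one, throttle of 10000000 bytes/s
fraction, eta_seconds = copy_progress(21 * GIB, 13 * GIB, 10_000_000)
print(f"{fraction:.0%} copied, at least {eta_seconds / 60:.0f} more minutes")
```

With the sizes above this comes out at about 62% copied and roughly a quarter of an hour to go.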

To check if a migration has finished (also removes the throttle if it has):

kafka reassign-partitions --reassignment-json-file eqiad.mediawiki.revision-create.json --verify

Once a topic is done, move it to ~/rebalance-json/done/. In this way the directory ~/rebalance-json/ serves as a sort of to-do list.

One more useful command: to change the throttle rate, run the following for both the node the data is coming from and the node the data is going to. For example, if data is being copied from kafka-jumbo1003 to kafka-jumbo1009:

$ kafka configs --alter --entity-type brokers --entity-name 1003 --add-config leader.replication.throttled.rate=10000000,follower.replication.throttled.rate=10000000

$ kafka configs --alter --entity-type brokers --entity-name 1009 --add-config leader.replication.throttled.rate=10000000,follower.replication.throttled.rate=10000000
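When several brokers are involved, the command lines can be generated rather than typed by hand. A small sketch that only builds the strings shown above (it does not run anything); throttle_commands is a made-up helper name.

```python
def throttle_commands(broker_ids, rate_bytes_per_s):
    """Build one 'kafka configs' command per broker to set both throttle rates."""
    config = (
        f"leader.replication.throttled.rate={rate_bytes_per_s},"
        f"follower.replication.throttled.rate={rate_bytes_per_s}"
    )
    return [
        "kafka configs --alter --entity-type brokers "
        f"--entity-name {broker_id} --add-config {config}"
        for broker_id in broker_ids
    ]


for command in throttle_commands([1003, 1009], 10000000):
    print(command)
```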

• razzi updated the task description. (Show Details)Jan 21 2021, 10:18 PM

• razzi updated the task description. (Show Details)Jan 25 2021, 6:56 PM

• razzi updated the task description. (Show Details)Jan 25 2021, 8:39 PM

• razzi updated the task description. (Show Details)Jan 25 2021, 8:44 PM

• razzi updated the task description. (Show Details)Jan 25 2021, 9:35 PM

• razzi updated the task description. (Show Details)Jan 26 2021, 7:25 PM

• razzi updated the task description. (Show Details)Jan 26 2021, 9:32 PM

• razzi updated the task description. (Show Details)Jan 28 2021, 5:04 PM

• razzi updated the task description. (Show Details)Jan 28 2021, 5:50 PM

• razzi updated the task description. (Show Details)Jan 28 2021, 6:42 PM

• razzi updated the task description. (Show Details)Jan 28 2021, 6:59 PM

• razzi updated the task description. (Show Details)Jan 28 2021, 7:14 PM

• razzi updated the task description. (Show Details)Jan 28 2021, 9:02 PM

• razzi updated the task description. (Show Details)Jan 28 2021, 10:01 PM

• razzi updated the task description. (Show Details)Jan 29 2021, 7:40 PM

• razzi updated the task description. (Show Details)Jan 29 2021, 10:53 PM

As we get into the higher-volume topics, we are seeing some alerts about replica max lag and under-replicated partitions. As I continue to run migrations, those alerts should be disabled for a few hours at a time and the metrics should be observed manually in Grafana.

• razzi updated the task description. (Show Details)Feb 1 2021, 7:04 PM

• razzi updated the task description. (Show Details)Feb 1 2021, 10:21 PM

elukey mentioned this in T225005: Replace and expand kafka main hosts (kafka[12]00[123]) with kafka-main[12]00[12345].Feb 2 2021, 8:19 AM

• razzi updated the task description. (Show Details)Feb 2 2021, 10:13 PM

• razzi updated the task description. (Show Details)Feb 3 2021, 8:04 PM

• razzi updated the task description. (Show Details)Feb 4 2021, 7:33 PM

• razzi updated the task description. (Show Details)Feb 5 2021, 8:34 PM

• razzi updated the task description. (Show Details)Feb 8 2021, 4:24 PM

• razzi updated the task description. (Show Details)Feb 9 2021, 3:32 PM

• razzi updated the task description. (Show Details)Feb 10 2021, 5:55 PM

• razzi updated the task description. (Show Details)Feb 10 2021, 11:04 PM

• razzi updated the task description. (Show Details)Feb 12 2021, 6:31 PM

• razzi updated the task description. (Show Details)Feb 12 2021, 6:41 PM

• razzi updated the task description. (Show Details)Feb 17 2021, 5:36 PM

Ok! Now that we're on to the final and highest traffic topics, webrequest_upload and webrequest_text, we're switching to migrating one partition at a time. Here are the full migration plans, in case they get modified in the process.

webrequest_upload.json:

{"version":1,"partitions":[{"topic":"webrequest_upload","partition":0,"replicas":[1003,1004,1008]},{"topic":"webrequest_upload","partition":1,"replicas":[1004,1003,1008]},{"topic":"webrequest_upload","partition":2,"replicas":[1006,1007,1002]},{"topic":"webrequest_upload","partition":3,"replicas":[1008,1007,1001]},{"topic":"webrequest_upload","partition":4,"replicas":[1002,1008,1004]},{"topic":"webrequest_upload","partition":5,"replicas":[1007,1006,1003]},{"topic":"webrequest_upload","partition":6,"replicas":[1001,1005,1006]},{"topic":"webrequest_upload","partition":7,"replicas":[1009,1001,1007]},{"topic":"webrequest_upload","partition":8,"replicas":[1005,1009,1003]},{"topic":"webrequest_upload","partition":9,"replicas":[1003,1002,1009]},{"topic":"webrequest_upload","partition":10,"replicas":[1004,1001,1009]},{"topic":"webrequest_upload","partition":11,"replicas":[1006,1004,1001]},{"topic":"webrequest_upload","partition":12,"replicas":[1008,1007,1002]},{"topic":"webrequest_upload","partition":13,"replicas":[1002,1005,1006]},{"topic":"webrequest_upload","partition":14,"replicas":[1007,1008,1002]},{"topic":"webrequest_upload","partition":15,"replicas":[1001,1006,1005]},{"topic":"webrequest_upload","partition":16,"replicas":[1009,1004,1001]},{"topic":"webrequest_upload","partition":17,"replicas":[1005,1003,1008]},{"topic":"webrequest_upload","partition":18,"replicas":[1003,1002,1007]},{"topic":"webrequest_upload","partition":19,"replicas":[1004,1009,1003]},{"topic":"webrequest_upload","partition":20,"replicas":[1006,1001,1004]},{"topic":"webrequest_upload","partition":21,"replicas":[1008,1003,1005]},{"topic":"webrequest_upload","partition":22,"replicas":[1002,1007,1006]},{"topic":"webrequest_upload","partition":23,"replicas":[1007,1002,1009]}]}

webrequest_text.json:

{"version":1,"partitions":[{"topic":"webrequest_text","partition":0,"replicas":[1008,1001,1004]},{"topic":"webrequest_text","partition":1,"replicas":[1002,1005,1008]},{"topic":"webrequest_text","partition":2,"replicas":[1007,1009,1002]},{"topic":"webrequest_text","partition":3,"replicas":[1001,1006,1005]},{"topic":"webrequest_text","partition":4,"replicas":[1009,1002,1007]},{"topic":"webrequest_text","partition":5,"replicas":[1005,1008,1001]},{"topic":"webrequest_text","partition":6,"replicas":[1003,1007,1009]},{"topic":"webrequest_text","partition":7,"replicas":[1004,1003,1006]},{"topic":"webrequest_text","partition":8,"replicas":[1006,1001,1007]},{"topic":"webrequest_text","partition":9,"replicas":[1008,1004,1003]},{"topic":"webrequest_text","partition":10,"replicas":[1002,1008,1004]},{"topic":"webrequest_text","partition":11,"replicas":[1007,1002,1009]},{"topic":"webrequest_text","partition":12,"replicas":[1001,1007,1008]},{"topic":"webrequest_text","partition":13,"replicas":[1009,1003,1005]},{"topic":"webrequest_text","partition":14,"replicas":[1005,1009,1002]},{"topic":"webrequest_text","partition":15,"replicas":[1003,1005,1006]},{"topic":"webrequest_text","partition":16,"replicas":[1004,1006,1003]},{"topic":"webrequest_text","partition":17,"replicas":[1006,1002,1007]},{"topic":"webrequest_text","partition":18,"replicas":[1008,1003,1004]},{"topic":"webrequest_text","partition":19,"replicas":[1002,1005,1009]},{"topic":"webrequest_text","partition":20,"replicas":[1007,1008,1001]},{"topic":"webrequest_text","partition":21,"replicas":[1001,1009,1005]},{"topic":"webrequest_text","partition":22,"replicas":[1009,1001,1005]},{"topic":"webrequest_text","partition":23,"replicas":[1005,1006,1002]}]}
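A plan like the ones above can be sanity-checked by counting how many partitions each broker leads and follows, similar to the "Broker distribution" section topicmappr prints. This is a sketch under the assumption that "leader" means the first broker in each replica list (the preferred leader); broker_distribution is a hypothetical helper, and the two-partition plan is just the first two entries of webrequest_upload.json.

```python
from collections import Counter


def broker_distribution(plan):
    """Count leader and follower roles per broker in a reassignment plan."""
    leaders, followers = Counter(), Counter()
    for entry in plan["partitions"]:
        first, *rest = entry["replicas"]
        leaders[first] += 1     # first replica is the preferred leader
        followers.update(rest)  # remaining replicas are followers
    brokers = sorted(set(leaders) | set(followers))
    return {
        broker: {
            "leader": leaders[broker],
            "follower": followers[broker],
            "total": leaders[broker] + followers[broker],
        }
        for broker in brokers
    }


plan = {
    "version": 1,
    "partitions": [
        {"topic": "webrequest_upload", "partition": 0, "replicas": [1003, 1004, 1008]},
        {"topic": "webrequest_upload", "partition": 1, "replicas": [1004, 1003, 1008]},
    ],
}
for broker, counts in broker_distribution(plan).items():
    print(broker, counts)
```

Loading the full JSON with json.load and passing it in gives the per-broker totals for the whole topic.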

Each partition can be split into a migration such as:

webrequest_upload_0.json

{"version":1,"partitions":[{"topic":"webrequest_upload","partition":0,"replicas":[1003,1004,1008]}]}
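Splitting a full plan into one file per partition can be scripted. A minimal sketch: split_plan is a hypothetical helper that returns file names and contents instead of writing to disk, and the `<topic>_<partition>.json` naming follows the webrequest_upload_0.json example above.

```python
import json


def split_plan(plan):
    """Map '<topic>_<partition>.json' to a single-partition plan as JSON text."""
    files = {}
    for entry in plan["partitions"]:
        name = f"{entry['topic']}_{entry['partition']}.json"
        single = {"version": plan["version"], "partitions": [entry]}
        # Compact separators match the formatting of the plans in this task.
        files[name] = json.dumps(single, separators=(",", ":"))
    return files


full_plan = {
    "version": 1,
    "partitions": [
        {"topic": "webrequest_upload", "partition": 0, "replicas": [1003, 1004, 1008]},
        {"topic": "webrequest_upload", "partition": 1, "replicas": [1004, 1003, 1008]},
    ],
}
for name, content in split_plan(full_plan).items():
    print(name, content)
```

Each resulting file can then be passed to kafka reassign-partitions with --reassignment-json-file in turn.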

Nice.

I'm thinking of writing up the steps for rebalancing partitions in a wiki article such as https://wikitech.wikimedia.org/wiki/Kafka/Administration, and I'm reminded of how I scp'd the topicmappr executable to kafka-jumbo1002 and how that's hacky. Should we make a plan to properly package topicmappr?

If it isn't hard it couldn't hurt!

Although, it is a go library, which I'm not sure we have much tooling around dealing with. I think maybe @ema has made some Go based .debs before?

Ottomata mentioned this in T273702: Modify Kafka max replica lag alert to only alert if increasing.Feb 18 2021, 9:21 PM

• razzi updated the task description. (Show Details)Feb 19 2021, 6:08 PM

• razzi added a project: Analytics-Kanban.Mar 2 2021, 6:30 PM

• razzi moved this task from Next Up to In Progress on the Analytics-Kanban board.

• razzi updated the task description. (Show Details)Mar 3 2021, 5:09 PM

In T255973#6840918, @Ottomata wrote:

Although, it is a go library, which I'm not sure we have much tooling around dealing with. I think maybe @ema has made some Go based .debs before?

Hey! Yes we do have some Golang programs debianized, see for example atskafka.

Thanks for your comment @ema. I'll see if I can do the same with kafka-kit.

• razzi updated the task description. (Show Details)Mar 24 2021, 6:18 PM

• razzi updated the task description. (Show Details)Mar 30 2021, 5:22 PM

kafka-kit depends on other packages such as https://tracker.debian.org/pkg/golang-github-zorkian-go-datadog-api, which is not available in Buster but is available in Bullseye. We could backport all these packages but it would be nontrivial and potentially duplicative of effort.

In T255973#6959367, @razzi wrote:

kafka-kit depends on other packages such as https://tracker.debian.org/pkg/golang-github-zorkian-go-datadog-api, which is not available in Buster but is available in Bullseye. We could backport all these packages but it would be nontrivial and potentially duplicative of effort.

For go packages we tend to follow this rule: use the Debian deps as much as possible, and include what's missing in the package itself. I agree that importing/packaging the dependencies is too much, so it is fine to just add them to the deb :)

• razzi moved this task from In Progress to Done on the Analytics-Kanban board.Apr 19 2021, 3:01 PM

Prometheus doesn't seem to like long range queries, so I can't show more than 30 days back, but we can see the topic data difference converging across all jumbo brokers:
https://grafana.wikimedia.org/d/000000027/kafka?viewPanel=38&orgId=1&from=now-30d&to=now

Screen Shot 2021-04-19 at 11.22.14.png (2×4 px, 850 KB)

Very cool!

Ottomata moved this task from Q3 2020/2021 to Done on the Analytics-Clusters board.Apr 19 2021, 3:54 PM

• fdans closed this task as Resolved.Apr 29 2021, 2:56 PM
