Upgrade kafka-logging to version 3.7
Closed, Resolved (Public)

Description

High level steps (summarized from docs in T416669: Upgrade Kafka to version 3.x)

Kafka logging has two clusters, one per site: eqiad and codfw.

  • Kafka-logging eqiad:
    • Pin the inter-broker protocol version on the brokers in hieradata/role/common/kafka/logging.yaml: profile::kafka::broker::inter_broker_protocol_version: 1.1.0
    • Perform a rolling upgrade of the brokers, which will restart with the pinned protocol version and the new Kafka version, using host-by-host patches and a restart of the Kafka broker service on each host, e.g. https://gerrit.wikimedia.org/r/c/operations/puppet/+/1273863
      • kafka-logging1001
      • kafka-logging1002
      • kafka-logging1003
      • kafka-logging1004
      • kafka-logging1005
    • Change the inter broker protocol version to match the new kafka version
      • Set hieradata/role/common/kafka/logging.yaml:profile::kafka::broker::inter_broker_protocol_version: 3.7
    • Perform a final rolling restart of the brokers
  • Kafka-logging codfw:
    • Pin the inter-broker protocol version on the brokers in hieradata/role/common/kafka/logging.yaml: profile::kafka::broker::inter_broker_protocol_version: 1.1.0
    • Perform a rolling upgrade of the brokers, which will restart with the pinned protocol version and the new Kafka version, using host-by-host patches and a restart of the Kafka broker service on each host, e.g. https://gerrit.wikimedia.org/r/c/operations/puppet/+/1273863
      • kafka-logging2001
      • kafka-logging2002
      • kafka-logging2003
      • kafka-logging2004
      • kafka-logging2005
    • Change the inter broker protocol version to match the new kafka version
      • Set hieradata/role/common/kafka/logging.yaml:profile::kafka::broker::inter_broker_protocol_version: 3.7
    • Perform a final rolling restart of the brokers
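
As an illustration only, the two-phase ordering above (pin the old protocol, upgrade every broker, raise the protocol, restart every broker) can be sketched in Python. The `Step` class and `upgrade_plan` function are hypothetical names introduced for this sketch; they are not part of the puppet repository or any cookbook, and the sketch only prints a plan rather than touching any host.

```python
from dataclasses import dataclass

# Hiera key taken from the task description.
HIERA_KEY = "profile::kafka::broker::inter_broker_protocol_version"


@dataclass(frozen=True)
class Step:
    phase: str   # "pin", "upgrade", "bump" or "restart" (labels are illustrative)
    action: str  # human-readable description of what to do


def upgrade_plan(brokers, old_protocol="1.1.0", new_protocol="3.7"):
    """Return the ordered steps for one cluster (hypothetical helper)."""
    steps = [Step("pin", f"set {HIERA_KEY}: {old_protocol}")]
    # Phase 1: new Kafka version on every broker, protocol still pinned.
    steps += [Step("upgrade", f"upgrade and restart {b}") for b in brokers]
    # Phase 2: only after all brokers run the new version, raise the protocol.
    steps.append(Step("bump", f"set {HIERA_KEY}: {new_protocol}"))
    steps += [Step("restart", f"restart {b}") for b in brokers]
    return steps


codfw = [f"kafka-logging200{i}" for i in range(1, 6)]
plan = upgrade_plan(codfw)
for n, step in enumerate(plan, 1):
    print(f"{n:2d}. [{step.phase}] {step.action}")
```

The point of the ordering is that the protocol bump comes strictly after the last broker upgrade, so brokers on the old and new versions can keep talking to each other during the rolling upgrade.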

Post-upgrade cleanup:

  • Clean up per-host hiera configs, move to role level

Event Timeline

herron triaged this task as High priority.
herron updated the task description.

Change #1273863 had a related patch set uploaded (by Herron; author: Herron):

[operations/puppet@production] kafka-logging: update kafka-logging2001 confluent distro to 77

https://gerrit.wikimedia.org/r/1273863

cc @elukey as he was the one working on the puppet side of things. It does look good to me though!

@herron I would change one thing: I think it is sufficient to upgrade a single host (like https://gerrit.wikimedia.org/r/c/operations/puppet/+/1273863), leave it running for a couple of days to catch any issues, and then upgrade the rest in one go with the cookbook (it doesn't support single hosts or subsets at the moment). It should be very safe, with no real difference from the host-by-host approach, which may be more tedious and error-prone. But I'll leave the choice to you; the plan looks OK as well!

Let us know when you do it so we'll be able to assist/help if needed.

Sounds good! I'll go ahead with kafka-logging2001 today as the first single host and let that settle in for a few days.
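
For illustration, the canary approach agreed above (one host first, a soak period, then the remaining brokers in one go) could be sketched as follows. The `canary_schedule` function, the two-day default for `soak_days`, and the start date are assumptions made for this sketch; the thread only says "a couple of days" and "a few days", and nothing here calls the actual cookbook.

```python
from datetime import date, timedelta


def canary_schedule(brokers, canary, start, soak_days=2):
    """Split a cluster into a canary wave and a follow-up wave (hypothetical helper)."""
    if canary not in brokers:
        raise ValueError(f"{canary} is not in the broker list")
    rest = [b for b in brokers if b != canary]
    # Wave 1: the single canary host; wave 2: everything else after the soak.
    return [(start, [canary]), (start + timedelta(days=soak_days), rest)]


codfw = [f"kafka-logging200{i}" for i in range(1, 6)]
# The start date is a placeholder, not the date the work was actually done.
schedule = canary_schedule(codfw, "kafka-logging2001", date(2026, 1, 5))
for when, hosts in schedule:
    print(when.isoformat(), ", ".join(hosts))
```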

Change #1273863 merged by Herron:

[operations/puppet@production] kafka-logging: update kafka-logging2001 confluent distro to 77

https://gerrit.wikimedia.org/r/1273863

Change #1275932 had a related patch set uploaded (by Herron; author: Herron):

[operations/puppet@production] kafka-logging: set all codfw brokers to confluent_distribution 77

https://gerrit.wikimedia.org/r/1275932

Change #1275932 merged by Herron:

[operations/puppet@production] kafka-logging: set all codfw brokers to confluent_distribution 77

https://gerrit.wikimedia.org/r/1275932

Cookbook worked well!

END (PASS) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=0) Change Confluent distribution for Kafka A:kafka-logging-codfw cluster: Change Confluent distribution.

Change #1276745 had a related patch set uploaded (by Herron; author: Herron):

[operations/puppet@production] kafka-logging: set codfw brokers inter-broker protocol to 3.7

https://gerrit.wikimedia.org/r/1276745

Change #1276745 merged by Herron:

[operations/puppet@production] kafka-logging: set codfw brokers inter-broker protocol to 3.7

https://gerrit.wikimedia.org/r/1276745

herron renamed this task from "Upgrade kafka-logging to version 3.x" to "Upgrade kafka-logging to version 3.7". Apr 27 2026, 3:17 PM
herron updated the task description.
herron updated the task description.

Change #1277581 had a related patch set uploaded (by Herron; author: Herron):

[operations/puppet@production] kafka-logging: set eqiad (and all) brokers to confluent distro 77

https://gerrit.wikimedia.org/r/1277581

Change #1277581 merged by Herron:

[operations/puppet@production] kafka-logging: set eqiad (and all) brokers to confluent distro 77

https://gerrit.wikimedia.org/r/1277581

Change #1278489 had a related patch set uploaded (by Herron; author: Herron):

[operations/puppet@production] kafka-logging: set eqiad (and all) brokers to protocol 3.7

https://gerrit.wikimedia.org/r/1278489

Change #1278489 merged by Herron:

[operations/puppet@production] kafka-logging: set eqiad (and all) brokers to protocol 3.7

https://gerrit.wikimedia.org/r/1278489

All kafka-logging brokers have been upgraded to 3.7.