
[Event Platform] mediawiki.page_content_change.v1 topic should be partitioned.
Closed, ResolvedPublic

Description

This follows from T338231: [Event Platform] mw-page-content-change-enrich should (re)produce kafka keys

We should partition the topic and spread the load across multiple brokers. See https://phabricator.wikimedia.org/T345657#9146529. This will require coordination with SRE to alter the topic config: https://wikitech.wikimedia.org/wiki/Kafka/Administration#Alter_topic_partitions_number

Success criteria

  • the mediawiki.page_content_change.v1 topic is partitioned

Dependencies

Details

Other Assignee
brouberol
Title | Reference | Author | Source Branch | Dest Branch
requirements: version bump eventutilities. | repos/data-engineering/mediawiki-event-enrichment!77 | gmodena | version-bump-eventutilities | main
make: version bump mediawiki-event-utilities. | repos/data-engineering/eventutilities-python!84 | gmodena | mediawiki-event-utilities-version-bump | main

Event Timeline

gmodena renamed this task from "[NEEDS GROOMING] mediawiki.page_content_change.v1 topic should be partinioned." to "[NEEDS GROOMING] mediawiki.page_content_change.v1 topic should be partitioned." (Sep 7 2023, 8:40 AM)
gmodena created this task.
gmodena renamed this task from "[NEEDS GROOMING] mediawiki.page_content_change.v1 topic should be partitioned." to "mediawiki.page_content_change.v1 topic should be partitioned." (Sep 8 2023, 10:49 AM)
gmodena removed gmodena as the assignee of this task.
gmodena updated the task description.

Would it be worth partitioning mediawiki.page_change.v1 too? Just so we can run multiple consumers to process for backfills if/when we need (cough cough T347676).

Probably? It needs SRE coordination, because topic administration requires CLI access to the Kafka brokers. That stream is easier to manage because messages are already produced with a partition key.
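
For context on why a partition key matters here, the following is a minimal sketch of key-based partition assignment. It is not the producer's actual code: Kafka's Java default partitioner hashes the key bytes with murmur2, whereas this sketch swaps in `zlib.crc32` as a stand-in to stay self-contained. The key strings and the `assign_partition` helper are hypothetical and chosen only for illustration.

```python
import zlib


def assign_partition(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition index.

    Stand-in hash (CRC32), NOT Kafka's murmur2; the point is only that the
    mapping is a deterministic function of the key and the partition count.
    """
    return zlib.crc32(key) % num_partitions


# Hypothetical keys; the real key format is not stated in this task.
keys = [b"enwiki:1", b"enwiki:2", b"dewiki:1", b"frwiki:42"]

for key in keys:
    # With one partition every key lands on partition 0; with three, keys
    # spread out, but a given key always maps to the same partition.
    print(key.decode(), assign_partition(key, 1), assign_partition(key, 3))
```

Because the mapping depends on the partition count, a given key may land on a different partition after the count is increased, so per-key ordering is only guaranteed among messages produced after the change.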

Picking this up. I'll do T338231 first, since it is a requirement. Moving T338231 into this sprint.

Ahoelzl renamed this task from mediawiki.page_content_change.v1 topic should be partitioned. to [Event Platform] mediawiki.page_content_change.v1 topic should be partitioned..Oct 20 2023, 4:47 PM

@gmodena let's sync when you want to increase the partition count, and I'll happily oblige!

Picking this up now that mediawiki-event-utilities 1.3.3 has been released.

I'll start by bumping dependency versions in the Python wrapper and in the downstream enrichment app's Docker image.

Change 980359 had a related patch set uploaded (by Gmodena; author: Gmodena):

[operations/deployment-charts@master] mw-page-content-enrich: version bump.

https://gerrit.wikimedia.org/r/980359

Change 980359 merged by jenkins-bot:

[operations/deployment-charts@master] mw-page-content-enrich: version bump.

https://gerrit.wikimedia.org/r/980359

To apply these changes:

  • Suspend the Flink app: helmfile -e <env> apply --set app.job.state='suspended'
  • Alter the Kafka topic partitioning: kafka topics --alter --topic <topic> --partitions 3
  • Restart the Flink app: helmfile -e <env> apply
  • staging & kafka test eqiad. topic
  • eqiad & kafka jumbo eqiad. topic
  • codfw + kafka jumbo codfw. topic
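
The per-environment sequence above could be sketched as a dry run that only prints the commands in order, which may help when repeating the procedure for each environment. This is an illustrative sketch, not the tooling that was actually used: `rollout_commands` is a hypothetical helper, and the environment and topic values passed to it are placeholders.

```python
import shlex


def rollout_commands(env: str, topic: str, partitions: int = 3) -> list[list[str]]:
    """Return the three commands from the steps above, in order, as argv lists."""
    return [
        # 1. Suspend the Flink app so nothing is produced during the change.
        ["helmfile", "-e", env, "apply", "--set", "app.job.state=suspended"],
        # 2. Increase the partition count of the topic.
        ["kafka", "topics", "--alter", "--topic", topic, "--partitions", str(partitions)],
        # 3. Restart the Flink app with its regular configuration.
        ["helmfile", "-e", env, "apply"],
    ]


# Placeholder values; substitute the real environment and topic name.
for cmd in rollout_commands("<env>", "<topic>"):
    # Dry run: print each command instead of executing it.
    print(shlex.join(cmd))
```

Printing rather than executing keeps the sketch safe to run anywhere; altering the topic still requires CLI access to the brokers and coordination with SRE, as noted earlier in the task.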

Done!