Support per-db-shard concurrency in ChangeProp
Closed, ResolvedPublic

Assigned To

Authored By

	• Pchelolo
	Mar 14 2018, 8:31 PM

Description

The parent task is about spiky connections to MySQL. The current theory is that ChangeProp only supports global concurrency limits, so when a big batch of jobs for a wiki on a particular shard arrives, all of the global concurrency is allocated to that one DB shard, which is too much for a single shard. As a result, the 'smoothing' that concurrency limiting provides globally doesn't help at the database level.

To fix that, we need per-db-shard concurrency. To get it, we probably need to partition the relevant topics by DB shard, so we need to solve T157822 first.

Another issue is that we need to create a custom partitioner that is aware of the mediawiki-config dbname-to-shard naming, preferably without copy-pasting the shard mapping into the Event-Platform repo.
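To make the partitioner idea concrete, here is a minimal sketch of a dbname-to-partition lookup. Everything in it is an assumption for illustration: the sample mapping, the catch-all default shard, the shard ordering and the function name are hypothetical and are not taken from mediawiki-config or from any ChangeProp code. A real implementation would load the mapping from the config source rather than hard-coding it.

```python
from typing import Dict, List

# Hypothetical sample of a dbname -> shard mapping (illustrative values only;
# a real partitioner would read this from the mediawiki-config source).
SAMPLE_SHARD_OF_WIKI: Dict[str, str] = {
    "enwiki": "s1",
    "commonswiki": "s4",
    "dewiki": "s5",
}

# Assumption: wikis that are not listed explicitly share one catch-all shard.
DEFAULT_SHARD = "s3"

# Assumption: a fixed shard order, so each shard gets a stable partition number.
SHARD_ORDER: List[str] = ["s1", "s2", "s3", "s4", "s5", "s6", "s7", "s8"]
PARTITION_OF_SHARD: Dict[str, int] = {
    shard: index for index, shard in enumerate(SHARD_ORDER)
}


def partition_for_wiki(
    dbname: str,
    shard_of_wiki: Dict[str, str] = SAMPLE_SHARD_OF_WIKI,
) -> int:
    """Return the numeric partition for a wiki, based on its DB shard."""
    shard = shard_of_wiki.get(dbname, DEFAULT_SHARD)
    return PARTITION_OF_SHARD[shard]
```

Passing the mapping in as a parameter keeps the lookup independent of where the shard data lives, which is one way to avoid copying the mapping into a second repo.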

Lastly, ChangeProp should support per-partition concurrency limits and, since partition names are just numbers in Kafka, we also need to integrate the db-partition mapper into change-prop somehow.
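The difference between a global limit and per-partition limits can be sketched with one semaphore per partition. This is not ChangeProp code (ChangeProp is a separate service with its own implementation); the class name, the limits and the demo job are hypothetical, and asyncio is used only to illustrate the technique.

```python
import asyncio
from typing import Awaitable, Callable, Dict


class PartitionLimiter:
    """Caps in-flight jobs per partition instead of globally (hypothetical)."""

    def __init__(self, limits: Dict[int, int], default_limit: int = 1) -> None:
        self._limits = limits
        self._default = default_limit
        self._semaphores: Dict[int, asyncio.Semaphore] = {}
        self.in_flight: Dict[int, int] = {}
        self.peak: Dict[int, int] = {}  # highest concurrency seen per partition

    def _semaphore(self, partition: int) -> asyncio.Semaphore:
        # Created lazily so each semaphore is made inside the running loop.
        if partition not in self._semaphores:
            limit = self._limits.get(partition, self._default)
            self._semaphores[partition] = asyncio.Semaphore(limit)
        return self._semaphores[partition]

    async def run(self, partition: int, job: Callable[[], Awaitable[None]]) -> None:
        # Jobs for one partition wait on that partition's semaphore only,
        # so a burst on one shard cannot use up another shard's slots.
        async with self._semaphore(partition):
            self.in_flight[partition] = self.in_flight.get(partition, 0) + 1
            self.peak[partition] = max(
                self.peak.get(partition, 0), self.in_flight[partition]
            )
            try:
                await job()
            finally:
                self.in_flight[partition] -= 1


async def demo() -> Dict[int, int]:
    # Assumed limits: partition 0 may run 2 jobs at once, partition 1 only 1.
    limiter = PartitionLimiter({0: 2, 1: 1})

    async def job() -> None:
        await asyncio.sleep(0.01)  # stands in for a job that hits the database

    partitions = [0] * 5 + [1] * 3
    await asyncio.gather(*(limiter.run(p, job) for p in partitions))
    return limiter.peak
```

Running `asyncio.run(demo())` returns the peak concurrency observed per partition, which stays at or below each partition's configured limit even when one partition receives most of the jobs.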

Details

	Subject	Repo	Branch	Lines +/-
	Make a special rule for refreshLinks partitioned execution.	mediawiki/services/change-propagation/jobqueue-deploy	master	+75 -4


Related Objects

Status	Subtype	Assigned	Task
Resolved		• Pchelolo	T157088 [EPIC] Develop a JobQueue backend based on EventBus
Resolved		• Pchelolo	T183744 FY17/18 Q3 Program 8 Services Goal: Migrate two high-traffic jobs over to EventBus
Resolved		• Pchelolo	T185052 Migrate RefreshLinks job to kafka
Resolved	PRODUCTION ERROR	• Pchelolo	T189204 High (2-3x) write and connection load on enwiki databases
Resolved		• Pchelolo	T189738 Support per-db-shard concurrency in ChangeProp
Declined		Ottomata	T157822 Support multiple partitions per topic in EventBus
Resolved		• Pchelolo	T181221 Prepare and test ChangeProp with multi-partition topics
Resolved		Ottomata	T190196 Create <dc>.change-prop.partitioned.mediawiki.job.refreshLinks topic

Event Timeline

• Pchelolo triaged this task as High priority.Mar 14 2018, 8:31 PM

• Pchelolo created this task.

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptMar 14 2018, 8:31 PM

Ottomata moved this task from Incoming to Blocked on the Analytics board.Mar 15 2018, 4:27 PM

Ottomata moved this task from Blocked to Radar on the Analytics board.

• mobrovac added a subtask: T157822: Support multiple partitions per topic in EventBus.Mar 16 2018, 2:34 PM

Change 420841 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[mediawiki/services/change-propagation/jobqueue-deploy@master] Make a special rule for refreshLinks partitioned execution.

https://gerrit.wikimedia.org/r/420841

gerritbot added a project: Patch-For-Review.Mar 20 2018, 8:28 PM

• Pchelolo closed subtask T190196: Create <dc>.change-prop.partitioned.mediawiki.job.refreshLinks topic as Resolved.Mar 21 2018, 3:11 PM

Change 420841 merged by Mobrovac:
[mediawiki/services/change-propagation/jobqueue-deploy@master] Make a special rule for refreshLinks partitioned execution.

https://gerrit.wikimedia.org/r/420841

• mobrovac mentioned this in rMSCP2c930cac3913: Support partitioned topic for refreshLinks..Mar 21 2018, 3:46 PM

Mentioned in SAL (#wikimedia-operations) [2018-03-21T15:48:54Z] <ppchelko@tin> Started deploy [cpjobqueue/deploy@0dcdc82]: Partition the refreshLinks topic by DB shard T189738

Mentioned in SAL (#wikimedia-operations) [2018-03-21T15:51:56Z] <ppchelko@tin> Finished deploy [cpjobqueue/deploy@0dcdc82]: Partition the refreshLinks topic by DB shard T189738 (duration: 03m 03s)

Mentioned in SAL (#wikimedia-operations) [2018-03-21T15:53:16Z] <ppchelko@tin> Started deploy [cpjobqueue/deploy@b291728]: Partition the refreshLinks topic by DB shard T189738 take 2

Mentioned in SAL (#wikimedia-operations) [2018-03-21T15:53:56Z] <ppchelko@tin> Finished deploy [cpjobqueue/deploy@b291728]: Partition the refreshLinks topic by DB shard T189738 take 2 (duration: 00m 40s)

• Pchelolo closed subtask T157822: Support multiple partitions per topic in EventBus as Declined.Mar 21 2018, 6:22 PM

Deployed. Seems to be working fine, resolving.

• mobrovac added a parent task: T185052: Migrate RefreshLinks job to kafka.Mar 21 2018, 6:32 PM

Aklapper edited projects, added Analytics-Radar; removed Analytics.Jun 10 2020, 6:44 AM
