Maniphest T143056

Address abnormally wide partitions
Closed, Duplicate
Public

Assigned To

Authored By

	Eevans
	Aug 15 2016, 10:08 PM

Tags

Referenced Files

	F4385666: partitions_between_5g_and_10g.txt
	Aug 22 2016, 7:36 PM

	F4385597: large_partitions_sorted_scrubbed.txt
	Aug 22 2016, 7:36 PM

	F4385667: partitions_larger_than_10g.txt
	Aug 22 2016, 7:36 PM

	F4385595: create_cql
	Aug 22 2016, 7:36 PM

	F4385582: delete_1-to-5.cql
	Aug 22 2016, 7:36 PM

	F4385668: partitions_between_1g_and_5g.txt
	Aug 22 2016, 7:36 PM

	F4385581: delete_5-to-10.cql
	Aug 22 2016, 7:36 PM

	F4385598: large_partitions.log
	Aug 22 2016, 7:36 PM

Subscribers

Description

We have, and have had for some time, abnormally wide partitions in Cassandra. They are the source of a number of problems, not least of which are fatally large heap allocations that result in OOMs when the partitions are read.

We should a) find those that currently exist and clean them up, and b) put in place the means to proactively identify them moving forward.
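As a rough illustration of step b), the sketch below scans log lines for large-partition warnings and reports every partition at or above a size threshold. The line shape matched by `LARGE_PARTITION_RE` ("... large partition keyspace/table:key (N bytes)") is an assumption and should be adjusted to whatever the raw log entries actually look like; the function name and the sample data are hypothetical.

```python
import re

# Assumed warning shape: "... large partition <keyspace>/<table>:<key> (<N> bytes)".
# This pattern is a guess at the log format, not a confirmed one.
LARGE_PARTITION_RE = re.compile(
    r"large partition (?P<keyspace>[^/\s]+)/(?P<table>[^:\s]+):(?P<key>.+) "
    r"\((?P<bytes>\d+) bytes\)"
)


def find_wide_partitions(lines, threshold_bytes):
    """Return (keyspace, table, key, size) tuples, largest first.

    A partition that is logged several times is reported once, with the
    largest size seen for it.
    """
    largest = {}
    for line in lines:
        match = LARGE_PARTITION_RE.search(line)
        if match is None:
            continue  # not a large-partition warning
        size = int(match.group("bytes"))
        if size < threshold_bytes:
            continue
        ident = (match.group("keyspace"), match.group("table"), match.group("key"))
        largest[ident] = max(size, largest.get(ident, 0))
    rows = [(ks, table, key, size) for (ks, table, key), size in largest.items()]
    return sorted(rows, key=lambda row: row[3], reverse=True)
```

Run periodically against fresh logs with a threshold of, say, `1024 ** 3` bytes, something along these lines would surface new offenders before they grow to the sizes listed in this task.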

First pass

File	Size	Count	Description
delete_big.cql	8 KB	18	Partitions larger than 10G in size
delete_5-to-10.cql	13 KB	30	> 5G and <= 10G in size
delete_1-to-5.cql	301 KB	653	> 1G and <= 5G in size
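
The three size bands in the table can be expressed as a small classifier, which makes the boundary handling explicit (exactly 10G falls in the middle band, exactly 5G in the lowest). This is only a sketch: it assumes "G" means GiB, and the function names are made up for illustration.

```python
from collections import Counter

GIB = 1024 ** 3  # assumption: "G" in the table means GiB, not 10**9 bytes


def size_band(size_bytes):
    """Map a partition size to the band labels used in the table, or None."""
    if size_bytes > 10 * GIB:
        return "> 10G"
    if size_bytes > 5 * GIB:
        return "> 5G and <= 10G"
    if size_bytes > 1 * GIB:
        return "> 1G and <= 5G"
    return None  # at or below 1G: not counted in this pass


def count_by_band(sizes):
    """Count how many partition sizes fall into each band."""
    counts = Counter()
    for size in sizes:
        band = size_band(size)
        if band is not None:
            counts[band] += 1
    return dict(counts)
```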

Working files

create_cql (4 KB)

large_partitions_sorted_scrubbed.txt (38 MB)

large_partitions.log (100 MB, raw log entries)

partitions_larger_than_10g.txt (1 KB)

partitions_between_5g_and_10g.txt (2 KB)

partitions_between_1g_and_5g.txt (63 KB)

Related Objects

Mentioned In: T133091: Highest SSTables / read thresholds
Mentioned Here: P3853 Masterwork From Distant Lands
P3848 delete_5-to-10.cql
P3845 delete_gte_10g.cql
P3843 Partitions between 5G and 10G in size
P3844 Partitions larger than 10G in size

Event Timeline

Eevans created this task. Aug 15 2016, 10:08 PM

Restricted Application added a subscriber: Aklapper. Aug 15 2016, 10:08 PM

Peachey88 added projects: Cassandra, SRE. Aug 15 2016, 10:12 PM

Eevans moved this task from Backlog to In-Progress on the Cassandra board. Aug 17 2016, 6:51 PM

Eevans updated the task description. Aug 17 2016, 8:51 PM

Script to delete the partitions >= 10G.

delete_big.cql (8 KB)

(applied with cqlsh -f delete_big.cql)

Script to delete the partitions >= 5G and < 10G.

delete_5-to-10.cql (13 KB)

(applied with cqlsh -f delete_5-to-10.cql)
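
The contents of the delete scripts are not shown here. Purely as an illustration, a file suitable for `cqlsh -f` could be generated from a list of oversized partitions along the following lines. The keyspace, table, and column names are hypothetical, and the sketch assumes every partition key column is a text column.

```python
def cql_text_literal(value):
    """Quote a string as a CQL text literal (single quotes are doubled)."""
    return "'" + value.replace("'", "''") + "'"


def cql_identifier(name):
    """Quote an identifier so that case and special characters are preserved."""
    return '"' + name.replace('"', '""') + '"'


def delete_statement(keyspace, table, key_columns):
    """Build a DELETE that removes one whole partition.

    key_columns maps each partition key column to its value; the column
    names used by callers here are illustrative, not the real schema.
    """
    where = " AND ".join(
        f"{cql_identifier(col)} = {cql_text_literal(val)}"
        for col, val in key_columns.items()
    )
    return f"DELETE FROM {cql_identifier(keyspace)}.{cql_identifier(table)} WHERE {where};"


def write_delete_script(path, partitions):
    """Write one DELETE per (keyspace, table, key_columns) entry to a file."""
    with open(path, "w", encoding="utf-8") as handle:
        for keyspace, table, key_columns in partitions:
            handle.write(delete_statement(keyspace, table, key_columns) + "\n")
```

Writing the statements to a file first, rather than executing them directly, leaves room to review the list before anything is deleted.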

Eevans updated the task description. Aug 22 2016, 7:36 PM

Eevans updated the task description. Aug 23 2016, 9:28 PM

Eevans triaged this task as Medium priority. Aug 23 2016, 9:31 PM

Eevans merged a task: T94121: Understand and solve wide row issues for frequently edited and re-rendered pages. Sep 20 2016, 8:33 PM

Eevans added subscribers: GWicke, StudiesWorld.

Eevans mentioned this in T133091: Highest SSTables / read thresholds. Sep 27 2016, 7:18 PM

Eevans closed this task as a duplicate of T94121: Understand and solve wide row issues for frequently edited and re-rendered pages. Oct 4 2016, 9:12 PM