
Collect droppable tombstone ratio metrics
Closed, Resolved (Public)

Description

The droppable tombstone ratio is shaping up to be an important metric for us. We should collect, persist, and configure dashboards for it.

Event Timeline

A couple of problems with this:

a) The metric is not included in o.a.cassandra.metrics, so as things stand we cannot get at it with cassandra-metrics-collector, and b) each query is roughly equivalent to invoking sstablemetadata against each file, so reading it every collection interval would be too expensive (it doesn't change often enough to warrant that anyway). So for the time being, I will collect some ad hoc results using something like the following:

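#!/bin/bash

# https://phabricator.wikimedia.org/P3825
#
# Collect the DroppableTombstoneRatio of the "data" table in every
# local_group_* keyspace, for every local Cassandra instance, as CSV.

set -ex

# Keep ~eevans outside the quotes so the shell expands it to that user's
# home directory; inside double quotes the tilde would stay literal.
export PATH="$PATH":~eevans/c-commands

# Print the JMX host:port of the named Cassandra instance.
# The substitution is left unquoted so surrounding whitespace is dropped.
connection_string()
{
    printf "localhost:%d" $(uyaml "/etc/cassandra-instances.d/$(hostname)-$1.yaml" /jmx_port)
}

# Print the MBean name of the "data" table in the given keyspace.
bean()
{
    printf "org.apache.cassandra.db:type=ColumnFamilies,keyspace=%s,columnfamily=data" "$1"
}

# List the local_group_* keyspaces, with double quotes stripped.
keyspaces()
{
    c-cqlsh a -e "describe keyspaces" | sed "s/\"//g" | grep -E "^local_group_"
}

# Remove the working file on exit (a no-op once it has been moved away).
finish()
{
    rm -f "$OUTPUT"
}

trap finish EXIT

# Write to the file given as $1, or to a fresh temporary file.
OUTPUT=${1:-$(mktemp)}

for i in $(keyspaces); do
    for j in $(c-ls); do
        ratio=$(sjk mx -s "$(connection_string "$j")" -b "$(bean "$i")" -mg -f DroppableTombstoneRatio | tail -n 1)
        # $ratio is left unquoted so surrounding whitespace is dropped.
        printf "%s-%s,%s,data,%.3f\n" "$(hostname)" "$j" "$i" $ratio >> "$OUTPUT"
    done
done

install -d ~/reports
mv "$OUTPUT" ~/reports/droppable_tombstones_"$(date -Idate)".csv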

A couple of problems with this:

a) The metric is not included in o.a.cassandra.metrics, so as things stand we cannot get at it with cassandra-metrics-collector, and b) each query is roughly equivalent to invoking sstablemetadata against each file, so reading it every collection interval would be too expensive (it doesn't change often enough to warrant that anyway). So for the time being, I will collect some ad hoc results using something like the following:

[ ... ]

First report attached here:

I think the approach mentioned here could work for this as well. Collect data weekly and generate a summary to email to services@.
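
A minimal sketch of what the weekly summary step might look like, assuming the CSV layout written by the collection script above (host-instance, keyspace, table, ratio). The function name `summarize` and the default of 10 rows are hypothetical, and the mailing step is deliberately left out; this only shows one way to pick out the worst offenders from a single report.

```shell
#!/bin/bash

# Sketch only: summarize one report produced by the collection script.
# Assumed row format (taken from that script's printf):
#   host-instance,keyspace,table,ratio

# summarize REPORT [N]
# Print the N rows with the highest droppable tombstone ratio (default 10),
# highest first, as aligned columns: instance, keyspace, ratio.
summarize()
{
    local report="$1" top="${2:-10}"

    # Sort numerically on the 4th comma-separated field, descending.
    # LC_ALL=C keeps "." as the decimal separator regardless of locale.
    LC_ALL=C sort -t, -k4,4gr "$report" | head -n "$top" |
        awk -F, '{ printf "%-30s %-40s %.3f\n", $1, $2, $4 }'
}
```

Something like `summarize ~/reports/droppable_tombstones_"$(date -Idate)".csv 20` would then print the 20 highest ratios from that day's report, and the output could be piped to whatever mailer is available on the host.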

GWicke edited projects, added Services (later); removed Services.

This task has been assigned to the same task owner for more than two years. Resetting task assignee due to inactivity, to decrease task cookie-licking and to get a slightly more realistic overview of plans. Please feel free to assign this task to yourself again if you still realistically work or plan to work on this task - it would be welcome!

For tips on how to manage individual work in Phabricator (noisy notifications, lists of tasks, etc.), see https://phabricator.wikimedia.org/T228575#6237124 for available options.
(For the record, two emails were sent to assignee addresses before resetting assignees. See T228575 for more info and for potential feedback. Thanks!)

Eevans lowered the priority of this task from Medium to Low. (Jun 7 2021, 7:53 PM)
Eevans claimed this task.

I think our latest dashboards cover this scenario; closing.