Consider using keyspace aggregations for RESTBase Cassandra metrics
Closed, Declined (Public)

Description

This pull request adds the o.a.c.metrics.Keyspace hierarchy to those collected by cassandra-metrics-collector (o.a.c.metrics.Keyspace is an aggregated version of o.a.c.metrics.ColumnFamily). This Gerrit change proposes filtering those metrics out before the pull request lands in production.

However, we recently filtered out most of the column family metrics for the RESTBase meta tables, leaving only those that might help detect an aberrant access pattern. For RESTBase, where we only have these two tables (data, which we are always interested in, and meta, which we are mostly not interested in), it might make sense to simply rely on the keyspace aggregations and filter out o.a.c.metrics.ColumnFamily entirely.

Pros: It would simplify metric filtering (not that it's urgently in need of simplifying), reduce the number of collected metrics somewhat (by whatever number meta currently requires), and simplify the dashboards (we almost never use the column family templating).

Cons: We'd lose the ability to separately view rate and latency by column family (duh), and it would break existing dashboards and create a disjoint history.
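
As a rough illustration of the proposal, the filtering step could look something like the sketch below. The dotted metric names, the `example_ks` keyspace, and the `keep_metric` / `filter_metrics` helpers are all assumptions made up for this example; this is not the filter format that cassandra-metrics-collector actually uses.

```python
import re

# Hypothetical dotted metric names, loosely modeled on the
# org.apache.cassandra.metrics hierarchy. Real collector output may differ.
SAMPLE_METRICS = [
    "org.apache.cassandra.metrics.ColumnFamily.example_ks.data.ReadLatency",
    "org.apache.cassandra.metrics.ColumnFamily.example_ks.meta.ReadLatency",
    "org.apache.cassandra.metrics.Keyspace.example_ks.ReadLatency",
    "org.apache.cassandra.metrics.Keyspace.example_ks.WriteLatency",
]

# Deny-list: drop every per-column-family metric, keep everything else.
DROP_PATTERNS = [
    re.compile(r"^org\.apache\.cassandra\.metrics\.ColumnFamily\."),
]


def keep_metric(name, drop_patterns=DROP_PATTERNS):
    """Return True if the metric name matches none of the drop patterns."""
    return not any(pattern.search(name) for pattern in drop_patterns)


def filter_metrics(names, drop_patterns=DROP_PATTERNS):
    """Return the metric names that survive the deny-list, in input order."""
    return [name for name in names if keep_metric(name, drop_patterns)]


if __name__ == "__main__":
    for metric in filter_metrics(SAMPLE_METRICS):
        print(metric)
```

With this deny-list only the two Keyspace entries remain, which is the "rely on the keyspace aggregations" option described above.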

Event Timeline

If I understand you right, you are proposing to aggregate metrics per keyspace, rather than columnfamily?

If so, then this would make it hard to debug secondary indexes, like those defined on the revision table. Those are modeled as columnfamilies in the same keyspace, and mixing them with metrics from the data keyspace wouldn't be very helpful.
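
To make the concern concrete, here is a small sketch of what per-keyspace aggregation does to per-columnfamily numbers. The keyspace name, the index columnfamily name, and the counts are invented for illustration and do not come from a real cluster.

```python
from collections import defaultdict

# Hypothetical read counts keyed by (keyspace, columnfamily).
# "data.by_rev_idx" stands in for a secondary index columnfamily.
READ_COUNTS = {
    ("example_ks", "data"): 9000,
    ("example_ks", "data.by_rev_idx"): 40,
    ("example_ks", "meta"): 10,
}


def aggregate_by_keyspace(counts):
    """Sum per-columnfamily counts into a single total per keyspace."""
    totals = defaultdict(int)
    for (keyspace, _columnfamily), count in counts.items():
        totals[keyspace] += count
    return dict(totals)


if __name__ == "__main__":
    print(aggregate_by_keyspace(READ_COUNTS))
```

After aggregation there is a single total per keyspace, so an unusual pattern on the index columnfamily can no longer be told apart from ordinary traffic on the data table.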

True.

@Eevans, with the latest plans we will also have several distinct tables for revision retention policies, which makes aggregation by keyspace problematic too. Should we decline this task?

GWicke edited projects, added Services (later); removed Services.
GWicke moved this task from later to watching on the Services board.
GWicke edited projects, added Services (watching); removed Services (later).