Page MenuHomePhabricator

Cassandra query tracing
Open, MediumPublic

Description

Cassandra supports query tracing to enable detailed performance analysis; You can configure a Cassandra node to probabilistically trace queries, and then interrogate the system_traces keyspace later to identify sessions (by duration) and the events associated with them. As great as this would seem to be, the query itself isn't a part of what is logged when prepared queries are used, making it impossible to match queries in the application to session traces.

Another method of producing traces, is to request that a query be traced using the driver (when performing the query). This returns a session ID that could be logged for later interrogation of the trace tables, or used to retrieve and log the entire trace. Here is an example using the nodejs driver:

1"use strict";
2
3
4const P = require('bluebird');
5const cassandra = require("cassandra-driver");
6const client = new cassandra.Client({ contactPoints: ['localhost'] });
7
8
9client.connect()
10.then(function() {
11 return client.execute(
12 "CREATE KEYSPACE IF NOT EXISTS db WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1' }");
13})
14.then(function() {
15 return client.execute('CREATE TABLE IF NOT EXISTS db.data (id text PRIMARY KEY, first text, last text)');
16})
17.then(function() {
18 return client.execute("INSERT INTO db.data (id,first,last) VALUES ('jericevans', 'Eric', 'Evans')");
19})
20.then(function() {
21 return client.execute('SELECT * FROM db.data WHERE id = ?', ['jericevans'], { traceQuery: true });
22})
23.then(function(res) {
24 console.log(' RESULT ....:', res.rows[0]);
25 return P.resolve(res.info.traceId);
26})
27.then(function(traceId) {
28 console.log('TRACE ID ....:', traceId)
29 return client.metadata.getTrace(traceId);
30})
31.then(function(trace) {
32 console.log(trace);
33 process.exit(0);
34})
35.catch(function(err) {
36 console.error('Whoops; Error:', err);
37 client.shutdown();
38});

1$ node traced.js
2 RESULT ....: Row { id: 'jericevans', first: 'Eric', last: 'Evans' }
3TRACE ID ....: Uuid: 8b037420-f86a-11e6-8ec6-6bf5aa313f86
4{ requestType: 'Execute CQL3 query',
5 coordinator: InetAddress: 127.0.0.1,
6 parameters:
7 { consistency_level: 'LOCAL_ONE',
8 page_size: '5000',
9 query: 'SELECT * FROM db.data WHERE id = ?',
10 serial_consistency_level: 'SERIAL' },
11 startedAt: Tue Feb 21 2017 13:18:36 GMT-0600 (CST),
12 duration: 13063,
13 clientAddress: InetAddress: 127.0.0.1,
14 events:
15 [ { id: TimeUuid: 8b037421-f86a-11e6-8ec6-6bf5aa313f86,
16 activity: 'Parsing SELECT * FROM db.data WHERE id = ?',
17 source: InetAddress: 127.0.0.1,
18 elapsed: 176,
19 thread: 'SharedPool-Worker-1' },
20 { id: TimeUuid: 8b037422-f86a-11e6-8ec6-6bf5aa313f86,
21 activity: 'Preparing statement',
22 source: InetAddress: 127.0.0.1,
23 elapsed: 270,
24 thread: 'SharedPool-Worker-1' },
25 { id: TimeUuid: 8b037423-f86a-11e6-8ec6-6bf5aa313f86,
26 activity: 'Executing single-partition query on data',
27 source: InetAddress: 127.0.0.1,
28 elapsed: 670,
29 thread: 'SharedPool-Worker-2' },
30 { id: TimeUuid: 8b037424-f86a-11e6-8ec6-6bf5aa313f86,
31 activity: 'Acquiring sstable references',
32 source: InetAddress: 127.0.0.1,
33 elapsed: 709,
34 thread: 'SharedPool-Worker-2' },
35 { id: TimeUuid: 8b037425-f86a-11e6-8ec6-6bf5aa313f86,
36 activity: 'Merging memtable tombstones',
37 source: InetAddress: 127.0.0.1,
38 elapsed: 751,
39 thread: 'SharedPool-Worker-2' },
40 { id: TimeUuid: 8b039b30-f86a-11e6-8ec6-6bf5aa313f86,
41 activity: 'Key cache hit for sstable 1',
42 source: InetAddress: 127.0.0.1,
43 elapsed: 882,
44 thread: 'SharedPool-Worker-2' },
45 { id: TimeUuid: 8b039b31-f86a-11e6-8ec6-6bf5aa313f86,
46 activity: 'Seeking to partition beginning in data file',
47 source: InetAddress: 127.0.0.1,
48 elapsed: 897,
49 thread: 'SharedPool-Worker-2' },
50 { id: TimeUuid: 8b039b32-f86a-11e6-8ec6-6bf5aa313f86,
51 activity: 'Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones',
52 source: InetAddress: 127.0.0.1,
53 elapsed: 1452,
54 thread: 'SharedPool-Worker-2' },
55 { id: TimeUuid: 8b039b33-f86a-11e6-8ec6-6bf5aa313f86,
56 activity: 'Merging data from memtables and 1 sstables',
57 source: InetAddress: 127.0.0.1,
58 elapsed: 1478,
59 thread: 'SharedPool-Worker-2' },
60 { id: TimeUuid: 8b039b34-f86a-11e6-8ec6-6bf5aa313f86,
61 activity: 'Read 1 live and 0 tombstone cells',
62 source: InetAddress: 127.0.0.1,
63 elapsed: 1518,
64 thread: 'SharedPool-Worker-2' } ] }
65

Considerations:

  • Tracing results in a great deal of write amplification, and so should be used sparingly
    • Probabilistically
    • On a per table basis
    • Adaptively? (think: when average query latency climbs above a threshold)
  • Logging only the session ID would be lighter-weight, but is less convenient (spreads the whole picture over two places)
  • Logging the entire trace would be more convenient (one-stop shopping)
    • ...but amplifies reads
    • ...increases logstash payloads (prohibitively?)

See also:

Event Timeline

Eevans created this task.Feb 21 2017, 7:43 PM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptFeb 21 2017, 7:43 PM
Eevans triaged this task as Medium priority.Feb 21 2017, 7:43 PM
Eevans updated the task description. (Show Details)Feb 21 2017, 8:01 PM

How abut logging the entire trace only for the queries we are interested in locally to disk or syslog?

How abut logging the entire trace only for the queries we are interested in locally to disk or syslog?

Yeah, that's possible, though logging it locally means that you still need to chase down the right machine (in this case the node that issued the query, as opposed to the Cassandra coordinator), so not the most convenient. And, logging to syslog for the purposes of having them centrally seems like it could suffer from the same problems as logstash (potentially high in volume, large messages, etc).

Perhaps if we could exploit JMX for this, we could make a custom logger that (a) logs the session ID and node to logstash; and (b) logs the full trace to disk. Too optimistic?

Perhaps if we could exploit JMX for this, we could make a custom logger that (a) logs the session ID and node to logstash; and (b) logs the full trace to disk. Too optimistic?

What would JMX be used for here?

What would JMX be used for here?

Brain fart. Disregard. I was thinking in the sense of having more control over logging than what the std Cass logger has, but I guess we can always write our own logging facility that

... (a) logs the session ID and node to logstash; and (b) logs the full trace to disk. Too optimistic?

What would JMX be used for here?

Brain fart. Disregard. I was thinking in the sense of having more control over logging than what the std Cass logger has, but I guess we can always write our own logging facility that

Oh, I think I understand where you going with this; Query tracing in Cassandra doesn't use any of the existing logging facilities, it writes out events as rows into the system_traces tables, linked by globally unique session IDs.

So the options I mentioned are a) driver initiates the trace of a query, and logs the returned session ID, or b) driver initiates a trace, and uses the returned session ID to lookup the event data, and logs that.

Option (a) requires that you take the session ID (presumably from a Kibana dashboard), and go have a look at the trace data "elsewhere" (cqlsh perhaps), and option (b) would provide all relevant information in one place. Option (a) is lightweight but less convenient, option (b) adds a CQL read (the client.metadata.getTrace()), and will inflate the size of logstash messages (possibly considerably).

I lean toward (b), providing that the worst-case scenario for trace frequency and logstash payload size don't put us at risk for problems (and that the Metadata#getTrace invocation is done in the background). That said: it might make sense to just start with the session ID (to give us something), and expand on it later.

Eevans updated the task description. (Show Details)Feb 23 2017, 8:05 PM

Ok I see. So how about having a mix: logging the trace ID and host name in logstash and having the trace either in the trace table or on disk somewhere? That wouldn't saturate logstash while still allowing us to (relatively) easily look it up?

Eevans updated the task description. (Show Details)Aug 16 2018, 3:47 PM
Aklapper removed Eevans as the assignee of this task.Jun 19 2020, 4:21 PM

This task has been assigned to the same task owner for more than two years. Resetting task assignee due to inactivity, to decrease task cookie-licking and to get a slightly more realistic overview of plans. Please feel free to assign this task to yourself again if you still realistically work or plan to work on this task - it would be welcome!

For tips how to manage individual work in Phabricator (noisy notifications, lists of task, etc.), see https://phabricator.wikimedia.org/T228575#6237124 for available options.
(For the records, two emails were sent to assignee addresses before resetting assignees. See T228575 for more info and for potential feedback. Thanks!)