Maniphest T191660

Script to collect forensic data from Cassandra hosts
Open, Medium, Public

Assigned To

None

Authored By

	Eevans
	Apr 6 2018, 7:42 PM

Description

When issues arise with a Cassandra node, it is often most expedient to simply restart it and restore normal operation. However, doing so could destroy valuable information needed to track down the root cause. Since it is not realistic to assume that everyone responding to an alert will know what to look for, we should create a script to automate collecting and archiving relevant data for later examination.

Some ideas:

Heap dumps
Stack dump (or capture)
Logs (debug)
nodetool (if possible)
- status
- gcstats
- compactionthroughput
- streamthroughput
- gossipinfo
- proxyhistograms
- toppartitions
- tpstats
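
A rough sketch of how the ideas above might fit together, offered as a starting point rather than a finished tool. Everything named here is an assumption made for illustration: the `collect` and `run_and_save` helpers, the output file names, and the optional `pid` and `log_dir` parameters are hypothetical. The throughput items are spelled `getcompactionthroughput` and `getstreamthroughput`, which is how nodetool names those subcommands. `toppartitions` is left out because, depending on the Cassandra version, it expects keyspace, table, and duration arguments. Stack and heap dumps use the JDK's `jstack` and `jmap`, and each command's failure is recorded instead of aborting, since nodetool may not respond on a sick node.

```python
import shutil
import subprocess
import tarfile
import time
from pathlib import Path

# nodetool subcommands from the list above (toppartitions omitted, see lead-in).
NODETOOL_COMMANDS = [
    ["status"],
    ["gcstats"],
    ["getcompactionthroughput"],
    ["getstreamthroughput"],
    ["gossipinfo"],
    ["proxyhistograms"],
    ["tpstats"],
]


def run_and_save(cmd, out_path, timeout=60):
    """Run cmd and write its stdout+stderr (or the failure reason) to out_path."""
    try:
        result = subprocess.run(
            cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # Missing binary, permission problem, or a hung command: note it and move on.
        out_path.write_text(f"failed to run {cmd!r}: {exc}\n")
        return False
    out_path.write_bytes(result.stdout)
    return result.returncode == 0


def collect(dest_root, pid=None, log_dir=None, nodetool="nodetool"):
    """Gather data into a timestamped directory under dest_root and archive it.

    pid and log_dir are optional (hypothetical parameters): without a pid the
    JVM dumps are skipped, and without a log_dir no logs are copied.
    """
    stamp = time.strftime("%Y%m%dT%H%M%S")
    workdir = Path(dest_root) / f"cassandra-forensics-{stamp}"
    workdir.mkdir(parents=True)

    # Stack dump first: it is cheap and reflects the state closest to the incident.
    if pid is not None:
        run_and_save(["jstack", str(pid)], workdir / "jstack.txt")

    for args in NODETOOL_COMMANDS:
        run_and_save([nodetool, *args], workdir / f"nodetool-{args[0]}.txt")

    if log_dir is not None and Path(log_dir).is_dir():
        shutil.copytree(log_dir, workdir / "logs")

    # Heap dump last: it is the slowest step and produces the largest file.
    if pid is not None:
        heap_file = workdir / "heap.hprof"
        run_and_save(
            ["jmap", f"-dump:format=b,file={heap_file}", str(pid)],
            workdir / "jmap.txt",
            timeout=600,
        )

    archive = Path(f"{workdir}.tar.gz")
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(workdir, arcname=workdir.name)
    return archive
```

A call might look like `collect("/var/tmp", pid=12345, log_dir="/var/log/cassandra")`, where the pid and both paths are placeholders. `jstack` and `jmap` generally need to run as the same user as the Cassandra JVM, and the ordering (stack dump, nodetool, logs, heap dump) is only one plausible choice.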

Event Timeline

Eevans created this task. Apr 6 2018, 7:42 PM

Restricted Application added a subscriber: Aklapper. Apr 6 2018, 7:42 PM

Eevans triaged this task as Low priority. Apr 6 2018, 7:43 PM

mobrovac awarded a token. Apr 6 2018, 8:52 PM

Eevans updated the task description. Apr 13 2018, 4:27 PM

mobrovac added a project: Platform Team Legacy (Later). Dec 20 2018, 12:17 PM

Removing task assignee due to inactivity, as this open task has been assigned to the same person for more than two years (see the emails sent to the task assignee on Oct27 and Nov23). Please assign this task to yourself again if you still realistically [plan to] work on this task - it would be welcome.
(See https://www.mediawiki.org/wiki/Bug_management/Assignee_cleanup for tips how to best manage your individual work in Phabricator.)

Eevans raised the priority of this task from Low to Medium. Jun 7 2021, 8:06 PM

Eevans removed a project: User-Eevans. Jun 9 2021, 4:44 PM
