Create a cookbook to restart the jvms on a Cassandra cluster
Closed, ResolvedPublic

Authored By

	elukey
	Aug 7 2019, 1:19 PM

Description

We have several Cassandra clusters in production, and once in a while we need to do a rolling restart of all their JVMs for security upgrades. Ideally this could be done by a cookbook rather than manually.

What I usually do for the AQS cluster (two Cassandra instances on each of the 6 nodes) is:

1) Select one host.
2) Check nodetool-a and nodetool-b: each should return a list of 12 IPs, all in UN state (no errors, and no instance bootstrapping or down).
3) Run nodetool-a drain + systemctl restart cassandra-a, then nodetool-b drain + systemctl restart cassandra-b.
4) Wait until nodetool-a and nodetool-b return 12 IPs with UN state.
5) Proceed with the next host.
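
The steps above could be sketched roughly as follows in Python. This is only an illustration and not the actual cookbook: rolling_restart, run and all_up are hypothetical names, run(host, command) stands for whatever executes a command on a remote host, and all_up(host) stands for the "12 IPs with UN state" check on both instances. The attempts and delay defaults are assumptions.

```python
import time
from typing import Callable, Sequence


def rolling_restart(
    hosts: Sequence[str],
    run: Callable[[str, str], None],      # hypothetical: runs a command on a host
    all_up: Callable[[str], bool],        # hypothetical: True if every node is UN
    instances: Sequence[str] = ("a", "b"),  # AQS runs two instances per host
    attempts: int = 30,                   # assumed default, should be configurable
    delay: float = 10.0,                  # assumed default, should be configurable
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Restart every Cassandra instance, one host at a time."""
    for host in hosts:
        # Step 2: refuse to start if the cluster is not healthy already.
        if not all_up(host):
            raise RuntimeError(f"{host}: not all nodes are UN before the restart")

        # Step 3: drain and restart each instance in turn.
        for instance in instances:
            run(host, f"nodetool-{instance} drain")
            run(host, f"systemctl restart cassandra-{instance}")

        # Step 4: poll until every node is back in UN state, or give up.
        for _ in range(attempts):
            if all_up(host):
                break
            sleep(delay)
        else:
            raise RuntimeError(f"{host}: nodes did not return to UN in time")
        # Step 5: the outer loop proceeds with the next host.
```

Passing run, all_up and sleep as parameters keeps the sketch independent of any specific remote execution library and makes it easy to dry-run with fakes.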

A couple of notes:

nodetool drain is probably not needed, but it seems a good step to add anyway.
Step 4) could in theory be simplified into something like "wait 5 minutes, run nodetool-a status | egrep '^UN' | wc -l and check that it is 12, fail otherwise". But the sleep time of course depends on the amount of data in the cluster and should be configurable (with a sane default).
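
A minimal sketch of that simplified check, in Python. The function names are hypothetical, get_status stands for whatever returns the text printed by nodetool-a status, and the code assumes that each node line of that output starts with its two-letter state (for example UN), which is what the egrep pattern above relies on as well.

```python
import time
from typing import Callable


def count_up_normal(status_output: str) -> int:
    """Count the lines that start with UN (Up/Normal), like egrep '^UN' | wc -l."""
    return sum(1 for line in status_output.splitlines() if line.startswith("UN"))


def check_after_wait(
    get_status: Callable[[], str],   # hypothetical: returns nodetool status output
    expected: int = 12,              # 6 hosts x 2 instances on AQS
    wait_seconds: float = 300.0,     # "wait 5 minutes", meant to be configurable
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Sleep, then fail unless exactly `expected` nodes are in UN state."""
    sleep(wait_seconds)
    up = count_up_normal(get_status())
    if up != expected:
        raise RuntimeError(f"expected {expected} nodes in UN state, found {up}")
    return up
```

A fixed sleep is simpler than polling, at the cost of always waiting the full interval even when the instance comes back quickly.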

Suggestions are welcome!

Related Objects

		Status	Subtype	Assigned	Task
		Open		None	T203943 Spicerack cookbooks TODO list
		Resolved		elukey	T230022 Create a cookbook to restart the jvms on a Cassandra cluster

Event Timeline

elukey created this task. Aug 7 2019, 1:19 PM

@jbond added that a few days ago in https://gerrit.wikimedia.org/r/#/c/operations/cookbooks/+/528133/ :-)

Really nice! AQS is not supported and I wasn't aware :P

elukey closed this task as Resolved. Aug 7 2019, 1:25 PM

It supports single-instance Cassandra clusters as well (for maps), so all it should take is adding "aqs" to the list of clusters.

joanna_borun added a project: Spicerack. Jun 15 2022, 10:48 AM

Restricted Application added a project: Infrastructure-Foundations. Jun 15 2022, 10:48 AM
