During the staging upgrade to 2.1.13, the packaging post-install invoked the sysv init script to start the root instance (the one based out of /var/lib/cassandra and /etc/cassandra). On several of the nodes, it actually succeeded. This could be Very Bad if it were to happen in production, particularly if it went unnoticed and the aberrant instance were to bootstrap.
Of the nodes that failed (meaning, where the aberrant instance did not start up)...
restbase-test2001.codfw.wmnet didn't start one because /etc/cassandra/cassandra.yaml was missing, so the init script exited early:
eevans@restbase-test2001:~$ bash -x /etc/init.d/cassandra status
+ DESC=Cassandra
+ NAME=cassandra
+ PIDFILE=/var/run/cassandra/cassandra.pid
+ SCRIPTNAME=/etc/init.d/cassandra
+ CONFDIR=/etc/cassandra
+ WAIT_FOR_START=10
+ CASSANDRA_HOME=/usr/share/cassandra
+ FD_LIMIT=100000
+ '[' -e /usr/share/cassandra/apache-cassandra.jar ']'
+ '[' -e /etc/cassandra/cassandra.yaml ']'
+ exit 0
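The trace suggests the init script bails out with exit 0 as soon as one of its existence checks fails. A minimal sketch of that behaviour, assuming the two checks shown in the trace are the relevant ones (guard_check is a hypothetical helper for illustration, not something the packaged script defines):

```shell
#!/bin/sh
# Sketch of the existence guards visible in the trace.
# guard_check is a hypothetical helper, not part of the packaged init script.
guard_check() {
    home="$1"      # e.g. /usr/share/cassandra
    confdir="$2"   # e.g. /etc/cassandra
    # In the trace, the script exits 0 right after the cassandra.yaml test,
    # so a missing file here means nothing gets started.
    [ -e "$home/apache-cassandra.jar" ] || return 1
    [ -e "$confdir/cassandra.yaml" ] || return 1
    return 0
}

if guard_check /usr/share/cassandra /etc/cassandra; then
    echo "guards pass: the init script would go on to handle the action"
else
    echo "guards fail: the init script would exit 0 without starting anything"
fi
```

On this node the second check is what stopped the instance, which is why the protection is accidental: it disappears the moment a cassandra.yaml shows up under /etc/cassandra.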
3 others failed only because the data under /var/lib/cassandra predates the cluster rename from "Test Cluster" to "services-test":
ERROR [main] 2016-02-18 18:04:23,351 CassandraDaemon.java:294 - Fatal exception during initialization
org.apache.cassandra.exceptions.ConfigurationException: Saved cluster name Test Cluster != configured name services-test
    at org.apache.cassandra.db.SystemKeyspace.checkHealth(SystemKeyspace.java:613) ~[apache-cassandra-2.1.13.jar:2.1.13]
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:290) [apache-cassandra-2.1.13.jar:2.1.13]
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:564) [apache-cassandra-2.1.13.jar:2.1.13]
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:653) [apache-cassandra-2.1.13.jar:2.1.13]
What is not clear to me is why this hasn't been an issue before.
And obviously, going forward we need a concrete (non-accidental) way of disabling these root instances.
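One possible approach, offered only as a sketch: Debian's invoke-rc.d consults /usr/sbin/policy-rc.d before running an init script action, and an exit status of 101 from that hook means the action is forbidden. This assumes the package's post-install goes through invoke-rc.d, which has not been verified here. The function name policy_decision is hypothetical, and option handling (such as --quiet or --list) is left out.

```shell
#!/bin/sh
# Hypothetical logic for a /usr/sbin/policy-rc.d hook.
# Assumption: the packaging post-install starts the service via invoke-rc.d,
# which calls policy-rc.d as: policy-rc.d <initscript id> <actions> [<runlevel>]
policy_decision() {
    id="$1"        # init script id, e.g. "cassandra"
    actions="$2"   # requested action(s), e.g. "start" or "restart"
    if [ "$id" = "cassandra" ]; then
        case "$actions" in
            *start*) return 101 ;;   # 101 = action forbidden by policy
        esac
    fi
    return 0                         # 0 = action allowed
}

# Example decisions (written so they also work under `set -e`)
rc=0; policy_decision cassandra start || rc=$?
echo "cassandra start -> $rc"
rc=0; policy_decision cassandra stop || rc=$?
echo "cassandra stop -> $rc"
```

An installed hook would end with something like `policy_decision "$@"; exit $?`. Whether this is preferable to masking or removing the packaged init script is a judgment call; the point is only that the block would be explicit rather than dependent on a missing file or a stale cluster name.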