04:37 < icinga-wm> RECOVERY - puppet last run on restbase2004 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
11:35 < icinga-wm> RECOVERY - cassandra-b service on restbase2004 is OK: OK - cassandra-b is active
11:41 < icinga-wm> PROBLEM - cassandra-b service on restbase2004 is CRITICAL: CRITICAL - Expecting active but unit cassandra-b is failed
14:06 < icinga-wm> RECOVERY - cassandra-b service on restbase2004 is OK: OK - cassandra-b is active
14:11 < icinga-wm> PROBLEM - cassandra-b service on restbase2004 is CRITICAL: CRITICAL - Expecting active but unit cassandra-b is failed
16:36 < icinga-wm> RECOVERY - cassandra-b service on restbase2004 is OK: OK - cassandra-b is active
16:42 < icinga-wm> PROBLEM - cassandra-b service on restbase2004 is CRITICAL: CRITICAL - Expecting active but unit cassandra-b is failed
17:06 < icinga-wm> RECOVERY - cassandra-b service on restbase2004 is OK: OK - cassandra-b is active
17:10 < icinga-wm> PROBLEM - cassandra-b service on restbase2004 is CRITICAL: CRITICAL - Expecting active but unit cassandra-b is failed
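The alerts above show a flapping pattern: each cassandra-b recovery is followed by a new failure a few minutes later. A minimal sketch of how that could be quantified from the pasted log follows. The helper names `parse_events` and `up_intervals` are hypothetical, the regular expression assumes the exact `icinga-wm` line format shown here, and the timestamps are assumed to fall within a single day.

```python
import re

# The alert lines exactly as pasted in this task.
LOG = """
04:37 < icinga-wm> RECOVERY - puppet last run on restbase2004 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
11:35 < icinga-wm> RECOVERY - cassandra-b service on restbase2004 is OK: OK - cassandra-b is active
11:41 < icinga-wm> PROBLEM - cassandra-b service on restbase2004 is CRITICAL: CRITICAL - Expecting active but unit cassandra-b is failed
14:06 < icinga-wm> RECOVERY - cassandra-b service on restbase2004 is OK: OK - cassandra-b is active
14:11 < icinga-wm> PROBLEM - cassandra-b service on restbase2004 is CRITICAL: CRITICAL - Expecting active but unit cassandra-b is failed
16:36 < icinga-wm> RECOVERY - cassandra-b service on restbase2004 is OK: OK - cassandra-b is active
16:42 < icinga-wm> PROBLEM - cassandra-b service on restbase2004 is CRITICAL: CRITICAL - Expecting active but unit cassandra-b is failed
17:06 < icinga-wm> RECOVERY - cassandra-b service on restbase2004 is OK: OK - cassandra-b is active
17:10 < icinga-wm> PROBLEM - cassandra-b service on restbase2004 is CRITICAL: CRITICAL - Expecting active but unit cassandra-b is failed
"""

# Assumed line shape: "HH:MM < icinga-wm> KIND - <check> on <host> is <state>..."
LINE_RE = re.compile(
    r"^(?P<hh>\d{2}):(?P<mm>\d{2}) < icinga-wm> "
    r"(?P<kind>PROBLEM|RECOVERY) - (?P<check>.+?) on (?P<host>\S+) is "
)


def parse_events(lines):
    """Return (minutes_since_midnight, kind, check, host) for each alert line."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            minutes = int(m["hh"]) * 60 + int(m["mm"])
            events.append((minutes, m["kind"], m["check"], m["host"]))
    return events


def up_intervals(events, check, host):
    """Minutes between each RECOVERY and the next PROBLEM for one check."""
    intervals = []
    last_recovery = None
    for minutes, kind, ev_check, ev_host in events:
        if (ev_check, ev_host) != (check, host):
            continue
        if kind == "RECOVERY":
            last_recovery = minutes
        elif last_recovery is not None:
            intervals.append(minutes - last_recovery)
            last_recovery = None
    return intervals


if __name__ == "__main__":
    events = parse_events(LOG.splitlines())
    print(up_intervals(events, "cassandra-b service", "restbase2004"))
```

On this log, every recovery is followed by a new PROBLEM within a few minutes, which suggests the unit is being started and then exiting again rather than the check merely being noisy.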
Description
Event Timeline
Comment Actions
We had these messages in the channel for many hours. The service keeps crashing and then coming back. Did nobody get pages or mails?
Comment Actions
[restbase2004:~] $ sudo -s
root@restbase2004:~# service cassandra-b status
● cassandra-b.service - distributed storage system for structured data
   Loaded: loaded (/lib/systemd/system/cassandra-b.service; static)
   Active: failed (Result: exit-code) since Tue 2016-04-19 01:31:20 UTC; 2min 16s ago
  Process: 11732 ExecStart=/usr/sbin/cassandra -p /var/run/cassandra/cassandra-b.pid (code=exited, status=3)
 Main PID: 11732 (code=exited, status=3)

Apr 19 01:31:18 restbase2004 cassandra[11544]: at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:794)
Apr 19 01:31:18 restbase2004 cassandra[11544]: at org.apache.cassandra.service.StorageService.initServer(StorageService.java:726)
Apr 19 01:31:18 restbase2004 cassandra[11544]: at org.apache.cassandra.service.StorageService.initServer(StorageService.java:617)
Apr 19 01:31:18 restbase2004 cassandra[11544]: at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:389)
Apr 19 01:31:18 restbase2004 cassandra[11544]: at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:564)
Apr 19 01:31:18 restbase2004 cassandra[11544]: at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:653)
Apr 19 01:31:18 restbase2004 cassandra[11544]: Exception encountered during startup: Other bootstrapping/leaving/moving nodes detected, cannot bootstrap w...t is true
Apr 19 01:31:18 restbase2004 cassandra[11544]: WARN 01:31:18 No local state or state is in silent shutdown, not announcing shutdown
Apr 19 01:31:20 restbase2004 systemd[1]: cassandra-b.service: main process exited, code=exited, status=3/NOTIMPLEMENTED
Apr 19 01:31:20 restbase2004 systemd[1]: Unit cassandra-b.service entered failed state.
Hint: Some lines were ellipsized, use -l to show in full.
root@restbase2004:~# systemctl cassandra-b start
Unknown operation 'cassandra-b'.
root@restbase2004:~# systemctl start cassandra-b
root@restbase2004:~# systemctl status cassandra-b
● cassandra-b.service - distributed storage system for structured data
   Loaded: loaded (/lib/systemd/system/cassandra-b.service; static)
   Active: active (running) since Tue 2016-04-19 01:34:21 UTC; 5s ago
 Main PID: 13726 (java)
   CGroup: /system.slice/cassandra-b.service
           └─13726 java -ea -javaagent:/usr/share/cassandra/lib/jamm-0.3.0.jar -XX:+CMSClassUnloadingEnabled -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -...
Comment Actions
This node should not be running; it is administratively down. I'm not sure why it started sending notifications now.
Comment Actions
I think this must have just been an Icinga snafu: the over-optimistic use of an expiring acknowledgement, or some such. The instance in question is administratively down until after new hardware/capacity is added to rack 'c' in codfw, and it seems to have a persistent acknowledgement now.
I'm resolving; feel free to reopen if I missed something.
Comment Actions
Fine with me, but I fail to see how "cassandra[11544]: Exception encountered during startup: " can be an Icinga snafu.