
Cassandra OOMs on restbase2009-c and restbase2010-a
Closed, Resolved (Public)

Description

At ~22:30 UTC on 2017-02-16, two Cassandra instances (restbase2009-c and restbase2010-a) experienced OOM exceptions, each leaving a heap dump:

$ for i in 2009 2010; do echo "$i: "; ssh restbase$i.codfw.wmnet -- "sudo find /srv/cassandra-* -maxdepth 1 -name '*.hprof' -exec ls -lh {} \;"; done
2009: 
-rw------- 1 cassandra cassandra 4.7G Feb 16 22:35 /srv/cassandra-c/java_pid7023.hprof
2010: 
-rw------- 1 cassandra cassandra 5.0G Feb 16 22:35 /srv/cassandra-a/java_pid120525.hprof
NOTE: It's reasonable to assume that this is a continuation of T144431 (RESTBase k-r-v as Cassandra anti-pattern). This ticket was opened only to document the event; a detailed analysis is probably not worth the time.
ACTION: Clean up these heap dumps before closing this issue.
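
A minimal sketch of how the cleanup could be done, reusing the same find expression as the listing above. The function name clean_heap_dumps and its list/delete modes are hypothetical and not part of any existing tooling; it assumes a find that supports -maxdepth and -delete (GNU findutils does). It only prints matches unless "delete" is passed explicitly, so the dumps can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper (not existing tooling): list or delete *.hprof heap
# dumps located directly under each given instance directory.
#   $1      mode: "list" (print only) or "delete" (print, then remove)
#   $2...   directories to search, e.g. /srv/cassandra-*
clean_heap_dumps() {
    mode="$1"
    shift
    for dir in "$@"; do
        if [ "$mode" = "delete" ]; then
            # -print before -delete so every removed file is logged
            find "$dir" -maxdepth 1 -type f -name '*.hprof' -print -delete
        else
            find "$dir" -maxdepth 1 -type f -name '*.hprof' -print
        fi
    done
}
```

On each affected host this would be run first as clean_heap_dumps list /srv/cassandra-* and then, once the output matches the files listed above, as clean_heap_dumps delete /srv/cassandra-*. The dumps are owned by the cassandra user with mode 0600, so it needs the same elevated privileges the sudo find above relies on.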

Event Timeline

Eevans added projects: Services, Cassandra.
Eevans updated the task description.