Shortly after 13:00 on 2017-01-31, two Cassandra instances OOM'd: restbase2004-b and restbase2005-b.
$ cdsh -d codfw -- "sudo find /srv/cassandra-* -maxdepth 1 -name '*.hprof'"
restbase2004.codfw.wmnet: /srv/cassandra-b/java_pid16503.hprof
restbase2005.codfw.wmnet: /srv/cassandra-b/java_pid27082.hprof
$
It's reasonable to assume this is no different from the other recent events (see T153588 and T156155), so it probably doesn't warrant further investigation (this ticket is primarily intended to document/acknowledge the occurrence). I will leave this issue open (and the heap dumps in place) for a few days in case anyone else has questions (or if some spare cycles become available).
NOTE: Icinga also registered an alert for 2001-a at ~10:00 UTC, but that was an administrative shutdown (due to ongoing OpenJDK upgrades).