Cassandra node outages; OutOfMemoryError exceptions
Closed, Resolved (Public)

Description

Earlier today, restbase1011-b.eqiad.wmnet went down with an OutOfMemoryError (down at ~18:03 UTC, back up at ~18:13 UTC after Puppet started it).

java.lang.OutOfMemoryError: Java heap space
        at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57) ~[na:1.8.0_111]
        at java.nio.ByteBuffer.allocate(ByteBuffer.java:335) ~[na:1.8.0_111]
        at org.apache.cassandra.io.compress.BufferType$1.allocate(BufferType.java:28) ~[apache-cassandra-2.2.6.jar:2.2.6]
        at org.apache.cassandra.io.compress.CompressedRandomAccessReader.allocateBuffer(CompressedRandomAccessReader.java:86) ~[apache-cassandra-2.2.6.jar:2.2.6]
        at org.apache.cassandra.io.util.RandomAccessReader.<init>(RandomAccessReader.java:62) ~[apache-cassandra-2.2.6.jar:2.2.6]
        at org.apache.cassandra.io.compress.CompressedRandomAccessReader.<init>(CompressedRandomAccessReader.java:70) ~[apache-cassandra-2.2.6.jar:2.2.6]
        at org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:49) ~[apache-cassandra-2.2.6.jar:2.2.6]
        at org.apache.cassandra.io.util.CompressedPoolingSegmentedFile.createPooledReader(CompressedPoolingSegmentedFile.java:124) ~[apache-cassandra-2.2.6.jar:2.2.6]
        at org.apache.cassandra.io.util.PoolingSegmentedFile.getSegment(PoolingSegmentedFile.java:63) ~[apache-cassandra-2.2.6.jar:2.2.6]
        at org.apache.cassandra.io.sstable.format.SSTableReader.getFileDataInput(SSTableReader.java:1757) ~[apache-cassandra-2.2.6.jar:2.2.6]
        at org.apache.cassandra.io.sstable.format.big.IndexedSliceReader$IndexedBlockFetcher.getNextBlock(IndexedSliceReader.java:381) ~[apache-cassandra-2.2.6.jar:2.2.6]
        at org.apache.cassandra.io.sstable.format.big.IndexedSliceReader$IndexedBlockFetcher.fetchMoreData(IndexedSliceReader.java:351) ~[apache-cassandra-2.2.6.jar:2.2.6]
        at org.apache.cassandra.io.sstable.format.big.IndexedSliceReader.computeNext(IndexedSliceReader.java:143) ~[apache-cassandra-2.2.6.jar:2.2.6]
        at org.apache.cassandra.io.sstable.format.big.IndexedSliceReader.computeNext(IndexedSliceReader.java:45) ~[apache-cassandra-2.2.6.jar:2.2.6]
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143) ~[guava-16.0.jar:na]
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) ~[guava-16.0.jar:na]
        at org.apache.cassandra.io.sstable.format.big.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:83) ~[apache-cassandra-2.2.6.jar:2.2.6]
        at org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:174) ~[apache-cassandra-2.2.6.jar:2.2.6]
        at org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:157) ~[apache-cassandra-2.2.6.jar:2.2.6]
        at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:146) ~[apache-cassandra-2.2.6.jar:2.2.6]
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.<init>(MergeIterator.java:89) ~[apache-cassandra-2.2.6.jar:2.2.6]
        at org.apache.cassandra.utils.MergeIterator.get(MergeIterator.java:48) ~[apache-cassandra-2.2.6.jar:2.2.6]
        at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:105) ~[apache-cassandra-2.2.6.jar:2.2.6]
        at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:82) ~[apache-cassandra-2.2.6.jar:2.2.6]
        at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:69) ~[apache-cassandra-2.2.6.jar:2.2.6]
        at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:319) ~[apache-cassandra-2.2.6.jar:2.2.6]
        at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:61) ~[apache-cassandra-2.2.6.jar:2.2.6]
        at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:2025) ~[apache-cassandra-2.2.6.jar:2.2.6]
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1829) ~[apache-cassandra-2.2.6.jar:2.2.6]
        at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:360) ~[apache-cassandra-2.2.6.jar:2.2.6]
        at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:85) ~[apache-cassandra-2.2.6.jar:2.2.6]
        at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:38) ~[apache-cassandra-2.2.6.jar:2.2.6]

Event Timeline

Eevans updated the task description.

All signs point to the ongoing issues being worked on elsewhere, and at 4.2G the heap dump is unlikely to be of use in analyzing what happened, so I have removed the file and am closing this issue.