Page MenuHomePhabricator

Cassandra exceptions related to range delete
Closed, ResolvedPublic

Description

As part of T164865: Prototype and test range delete-based current revision storage, range delete-based culling of past renders was merged to the dev environment on Thursday (2017-05-18). Afterward, a number of exceptions were observed, one on the write path, and one on the read (see pastes below). These did not occur on every write, and the frequency seemed suspiciously close to the combined probability of both a revision and render range delete occurring for the same write, son Friday (2017-05-19) revision culling was disabled, and the exceptions ceased.

https://issues.apache.org/jira/browse/CASSANDRA-13544

It may be possible in the meantime to alter the retention algorithm to either cull revisions or renders, but never both on the same write.

1ERROR [SharedPool-Worker-18] 2017-05-19 17:30:22,426 Message.java:611 - Unexpected exception during request; channel = [id: 0x793a853b, L:/10.64.0.36:9042 - R:/10.64.32.112:550
248]
3java.lang.AssertionError: null
4 at org.apache.cassandra.db.ClusteringBoundOrBoundary.<init>(ClusteringBoundOrBoundary.java:31) ~[apache-cassandra-3.7.3.jar:3.7.3]
5 at org.apache.cassandra.db.ClusteringBoundary.<init>(ClusteringBoundary.java:15) ~[apache-cassandra-3.7.3.jar:3.7.3]
6 at org.apache.cassandra.db.ClusteringBoundOrBoundary.inclusiveCloseExclusiveOpen(ClusteringBoundOrBoundary.java:78) ~[apache-cassandra-3.7.3.jar:3.7.3]
7 at org.apache.cassandra.db.rows.RangeTombstoneBoundaryMarker.inclusiveCloseExclusiveOpen(RangeTombstoneBoundaryMarker.java:54) ~[apache-cassandra-3.7.3.jar:3.7.3]
8 at org.apache.cassandra.db.rows.RangeTombstoneMarker$Merger.merge(RangeTombstoneMarker.java:139) ~[apache-cassandra-3.7.3.jar:3.7.3]
9 at org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator$MergeReducer.getReduced(UnfilteredRowIterators.java:521) ~[apache-cassandra-3.7.3.jar:3.7.3]
10 at org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator$MergeReducer.getReduced(UnfilteredRowIterators.java:478) ~[apache-cassandra-3.7.3.jar:3.7.3]
11 at org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:217) ~[apache-cassandra-3.7.3.jar:3.7.3]
12 at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:156) ~[apache-cassandra-3.7.3.jar:3.7.3]
13 at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.7.3.jar:3.7.3]
14 at org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:460) ~[apache-cassandra-3.7.3.jar:3.7.3]
15 at org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:320) ~[apache-cassandra-3.7.3.jar:3.7.3]
16 at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.7.3.jar:3.7.3]
17 at org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:113) ~[apache-cassandra-3.7.3.jar:3.7.3]
18 at org.apache.cassandra.db.transform.FilteredRows.isEmpty(FilteredRows.java:30) ~[apache-cassandra-3.7.3.jar:3.7.3]
19 at org.apache.cassandra.db.transform.Filter.closeIfEmpty(Filter.java:53) ~[apache-cassandra-3.7.3.jar:3.7.3]
20 at org.apache.cassandra.db.transform.Filter.applyToPartition(Filter.java:23) ~[apache-cassandra-3.7.3.jar:3.7.3]
21 at org.apache.cassandra.db.transform.Filter.applyToPartition(Filter.java:6) ~[apache-cassandra-3.7.3.jar:3.7.3]
22 at org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:76) ~[apache-cassandra-3.7.3.jar:3.7.3]
23 at org.apache.cassandra.cql3.statements.SelectStatement.process(SelectStatement.java:735) ~[apache-cassandra-3.7.3.jar:3.7.3]
24 at org.apache.cassandra.cql3.statements.SelectStatement.processResults(SelectStatement.java:410) ~[apache-cassandra-3.7.3.jar:3.7.3]
25 at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:363) ~[apache-cassandra-3.7.3.jar:3.7.3]
26 at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:237) ~[apache-cassandra-3.7.3.jar:3.7.3]
27 at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:78) ~[apache-cassandra-3.7.3.jar:3.7.3]
28 at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:208) ~[apache-cassandra-3.7.3.jar:3.7.3]
29 at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:486) ~[apache-cassandra-3.7.3.jar:3.7.3]
30 at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:463) ~[apache-cassandra-3.7.3.jar:3.7.3]
31 at org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:130) ~[apache-cassandra-3.7.3.jar:3.7.3]
32 at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:507) [apache-cassandra-3.7.3.jar:3.7.3]
33 at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:401) [apache-cassandra-3.7.3.jar:3.7.3]
34 at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.36.Final.jar:4.0.36.Final]
35 at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:292) [netty-all-4.0.36.Final.jar:4.0.36.Final]
36 at io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:32) [netty-all-4.0.36.Final.jar:4.0.36.Final]
37 at io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:283) [netty-all-4.0.36.Final.jar:4.0.36.Final]
38 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_131]
39 at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) [apache-cassandra-3.7.3.jar:3.7.3]
40 at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-3.7.3.jar:3.7.3]
41 at java.lang.Thread.run(Thread.java:748) [na:1.8.0_131]

1WARN [SharedPool-Worker-68] 2017-05-19 17:54:03,267 AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread Thread[SharedPool-Worker-68,10,main]: {}
2java.lang.AssertionError: null
3 at org.apache.cassandra.db.RangeTombstoneList.insertFrom(RangeTombstoneList.java:536) ~[apache-cassandra-3.7.3.jar:3.7.3]
4 at org.apache.cassandra.db.RangeTombstoneList.addAll(RangeTombstoneList.java:217) ~[apache-cassandra-3.7.3.jar:3.7.3]
5 at org.apache.cassandra.db.MutableDeletionInfo.add(MutableDeletionInfo.java:141) ~[apache-cassandra-3.7.3.jar:3.7.3]
6 at org.apache.cassandra.db.partitions.AtomicBTreePartition.addAllWithSizeDelta(AtomicBTreePartition.java:143) ~[apache-cassandra-3.7.3.jar:3.7.3]
7 at org.apache.cassandra.db.Memtable.put(Memtable.java:250) ~[apache-cassandra-3.7.3.jar:3.7.3]
8 at org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:1324) ~[apache-cassandra-3.7.3.jar:3.7.3]
9 at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:529) ~[apache-cassandra-3.7.3.jar:3.7.3]
10 at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:399) ~[apache-cassandra-3.7.3.jar:3.7.3]
11 at org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:202) ~[apache-cassandra-3.7.3.jar:3.7.3]
12 at org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:207) ~[apache-cassandra-3.7.3.jar:3.7.3]
13 at org.apache.cassandra.db.MutationVerbHandler.doVerb(MutationVerbHandler.java:69) ~[apache-cassandra-3.7.3.jar:3.7.3]
14 at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) ~[apache-cassandra-3.7.3.jar:3.7.3]
15 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_131]
16 at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) ~[apache-cassandra-3.7.3.jar:3.7.3]
17 at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136) [apache-cassandra-3.7.3.jar:3.7.3]
18 at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-3.7.3.jar:3.7.3]
19 at java.lang.Thread.run(Thread.java:748) [na:1.8.0_131]

Event Timeline

This behavior is confirmed in Cassandra 3.10 as well:

1WARN [MutationStage-5] 2017-05-25 19:30:41,605 AbstractLocalAwareExecutorService.java:167 - Uncaught exception on thread Thread[MutationStage-5,5,main]: {}
2java.lang.AssertionError: null
3 at org.apache.cassandra.db.RangeTombstoneList.insertFrom(RangeTombstoneList.java:536) ~[apache-cassandra-3.10.jar:3.10]
4 at org.apache.cassandra.db.RangeTombstoneList.addAll(RangeTombstoneList.java:217) ~[apache-cassandra-3.10.jar:3.10]
5 at org.apache.cassandra.db.MutableDeletionInfo.add(MutableDeletionInfo.java:141) ~[apache-cassandra-3.10.jar:3.10]
6 at org.apache.cassandra.db.partitions.AtomicBTreePartition.addAllWithSizeDelta(AtomicBTreePartition.java:143) ~[apache-cassandra-3.10.jar:3.10]
7 at org.apache.cassandra.db.Memtable.put(Memtable.java:284) ~[apache-cassandra-3.10.jar:3.10]
8 at org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:1316) ~[apache-cassandra-3.10.jar:3.10]
9 at org.apache.cassandra.db.Keyspace.applyInternal(Keyspace.java:618) ~[apache-cassandra-3.10.jar:3.10]
10 at org.apache.cassandra.db.Keyspace.applyFuture(Keyspace.java:425) ~[apache-cassandra-3.10.jar:3.10]
11 at org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:222) ~[apache-cassandra-3.10.jar:3.10]
12 at org.apache.cassandra.db.MutationVerbHandler.doVerb(MutationVerbHandler.java:68) ~[apache-cassandra-3.10.jar:3.10]
13 at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:66) ~[apache-cassandra-3.10.jar:3.10]
14 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_131]
15 at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) ~[apache-cassandra-3.10.jar:3.10]
16 at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134) [apache-cassandra-3.10.jar:3.10]
17 at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) [apache-cassandra-3.10.jar:3.10]
18 at java.lang.Thread.run(Thread.java:748) [na:1.8.0_131]

It may be possible in the meantime to alter the retention algorithm to either cull revisions or renders, but never both on the same write.

Perhaps we should ensure this regardless of Cassandra's behaviour? It seems wasteful to be issuing a range delete for renders if revisions are being dropped anyway.

It may be possible in the meantime to alter the retention algorithm to either cull revisions or renders, but never both on the same write.

FWIW, I don't think this was a useful suggestion. If the problem manifests under concurrency, this would make it less likely but not impossible; I'm almost certain it would still be possible for concurrent restbase requests to trigger the issue.

Perhaps we should ensure this regardless of Cassandra's behaviour? It seems wasteful to be issuing a range delete for renders if revisions are being dropped anyway.

I guess that you mean it would be wasteful to issue a delete of renders where revision = N, while also deleting N, is that correct? That wouldn't happen, I don't think; What we'd have is a delete of renders where revision = N and a delete of revisions < N.

I am not able to reproduce these exceptions using a snapshot of the 3.11 branch.

Eevans edited projects, added Services (done); removed Services (doing).

With Cassandra 3.11.0 about to be released, I think we can safely close this issue.