Judging from the first production data, our current RESTBase Cassandra cluster does not have much margin on IO bandwidth or storage capacity. With more tuning we might be able to stretch it a bit further, but it is clear that we'll need more capacity sooner rather than later.
We are leaning towards moving to a setup with multiple Cassandra instances per hardware node, aiming for a maximum compressed load of ~600G per instance. With 1T SSDs this could be configured as:
- one instance per physical SSD, JBOD
- all instances sharing a RAID-0 (current setup)
- some other RAID level
JBOD would give us a good amount of failure isolation, but would not let instances share the node's aggregate IO bandwidth. RAID levels other than RAID-0 sacrifice disk capacity for redundancy that is not strictly needed, as the data is already replicated three-way across the cluster.
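As a rough sanity check on the ~600G target, here is a back-of-the-envelope headroom calculation for the JBOD option. The 1T and ~600G figures are from above; treating the remaining ~40% of the disk as compaction/working headroom is a rule of thumb, not a measured value for this cluster.

```
# Headroom per instance in the JBOD layout: one instance per 1T SSD,
# with a target maximum compressed load of ~600G.
disk_gb = 1000      # one 1T SSD per instance
max_load_gb = 600   # target maximum compressed load per instance

headroom_gb = disk_gb - max_load_gb
print(f"headroom: {headroom_gb}G ({headroom_gb / disk_gb:.0%} of the disk)")
# -> headroom: 400G (40% of the disk)
```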
For the current cluster, the main decision we have to make is whether or not to add one more SSD to the existing nodes. (There is one drive slot left.)
See T97692 for the storage need projection for FY2015/16. The bottom line: ~35T of additional storage in the next fiscal year, for a total of 53T per cluster.
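To put that figure in instance and node counts, here is a sketch of the capacity math under the ~600G-per-instance target. The 53T and ~600G numbers are from above; the instances-per-node figure is an assumption for illustration (e.g. four 1T SSDs per node in the JBOD layout).

```
import math

total_tb = 53            # projected total per cluster (T97692)
max_load_tb = 0.6        # ~600G target maximum load per instance
instances_per_node = 4   # assumption: e.g. four 1T SSDs, one instance each

instances = math.ceil(total_tb / max_load_tb)
nodes = math.ceil(instances / instances_per_node)
print(f"{instances} instances across at least {nodes} nodes")
# -> 89 instances across at least 23 nodes
```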
From a replica placement perspective it makes sense to add nodes in increments of three, one in each of the three rows: with three-way replication and one replica per row, the rows need to grow in step to stay balanced.
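A minimal sketch of the rack-aware replication this implies, using the DataStax Python driver. The keyspace name, datacenter name, and contact point are placeholders, and mapping rows to Cassandra racks via the snitch is an assumption about the cluster's configuration, not something confirmed above.

```
from cassandra.cluster import Cluster

# Placeholder contact point; not a real host.
session = Cluster(["restbase-host.example"]).connect()

# With NetworkTopologyStrategy and RF=3, Cassandra places the three
# replicas of each token range in distinct racks (= rows here) where
# possible; keeping rows equally sized is why nodes are added in threes.
session.execute("""
    ALTER KEYSPACE some_keyspace
    WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': 3}
""")
```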
RT tracking of the current order: https://rt.wikimedia.org/Ticket/Display.html?id=9506