CapEx time is upon us; this task will track the following:
- Capacity estimation/planning for Thanos needs in terms of object storage, also in light of T351927
- Estimation of disk space needs for thanos compact, which currently brings titan disk utilization close to its maximum; we'll likely need to add some capacity there too
Object storage requirement estimation
Thanos data is written by Prometheus to object storage in the form of raw-datapoint blocks, each representing a few hours of data. The raw blocks are then downsampled to lower resolutions (5m, 1h) and written back to storage. Blocks of all resolutions are also compacted into 14-day blocks for space savings. A per-resolution retention policy then deletes older blocks.
Due to storage space pressure, in T351927 we implemented additional block-cleanup logic: it takes into account the fact that Prometheus in eqiad and codfw is replicated (two Prometheus hosts at each site), so we also have blocks of very similar data which can be deleted if need be. We already performed that deletion for blocks older than 3 months, hence in the estimates below I only consider blocks newer than that, to avoid accounting for the extra deletion.
Default retention strategy
This is the simplest strategy and the one Thanos implements: blocks are deleted only once they are too old (i.e. past their retention period).
For the last ~two months we get the following usage:
days observed | GB | resolution | GB/day
---|---|---|---
76 | 11143 | 0s (raw) | 146
73 | 8433 | 5m | 115
71 | 1595 | 1h | 22
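The per-resolution daily growth rates above follow directly from the observed block sizes and time spans; a quick sketch (the numbers are taken from the table above):

```python
# Daily growth per resolution, from the observed usage above:
# resolution -> (days observed, total GB).
observed = {
    "0s (raw)": (76, 11143),
    "5m": (73, 8433),
    "1h": (71, 1595),
}

for resolution, (days, gb) in observed.items():
    # Integer division, matching the rounded-down figures in the table.
    print(f"{resolution}: {gb // days} GB/day")
```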
Extrapolating from that we get:
Current retention
This is the retention policy we have configured in Puppet as of today.
retention (weeks) | GB | resolution
---|---|---
54 | 55188 | 0s
270 | 217350 | 5m
270 | 41580 | 1h
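The projection above is simply retention weeks × 7 days × observed GB/day per resolution; a small sketch using the rates from the earlier table:

```python
# Projected storage need under the currently configured retention:
# total GB = retention weeks * 7 days * observed GB/day.
rate_gb_per_day = {"0s": 146, "5m": 115, "1h": 22}
retention_weeks = {"0s": 54, "5m": 270, "1h": 270}

total_gb = 0
for res, weeks in retention_weeks.items():
    gb = weeks * 7 * rate_gb_per_day[res]
    total_gb += gb
    print(f"{res}: {gb} GB")
print(f"total: ~{total_gb / 1000:.0f} TB")  # vs ~130 TB currently available
```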
Yielding a grand total of ~314 TB needed. Thanos storage is ~130 TB total, meaning we'd need to more than double the capacity (!), which is not a great situation.
Proposed retention and hardware needs
As a reasonable compromise I think we can do the following: keep 0s and 5m data for slightly longer than a year (so year-over-year comparisons are possible), i.e. about 60 weeks, and keep 1h data for longer since it is significantly cheaper to retain. In other words (rounding numbers up):
retention (weeks) | GB | resolution
---|---|---
60 | ~62000 | 0s
60 | ~50000 | 5m
280 | ~43000 | 1h
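The same weeks × 7 × GB/day arithmetic applied to the proposed retention (a sketch; the table above rounds each resolution up, which is where the quoted ~155 TB comes from):

```python
# Projected storage under the proposed retention, same GB/day rates as above.
rate_gb_per_day = {"0s": 146, "5m": 115, "1h": 22}
proposed_weeks = {"0s": 60, "5m": 60, "1h": 280}

total_gb = sum(weeks * 7 * rate_gb_per_day[res]
               for res, weeks in proposed_weeks.items())
print(f"total: {total_gb} GB (~{total_gb / 1000:.0f} TB)")
# Rounding each resolution up as in the table gives ~155 TB; against the
# current ~130 TB of capacity this implies adding a few tens of TB.
```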
Or ~155TB total, meaning we need to add about 30-40TB to current Thanos storage.
In terms of hardware this translates into an additional two hosts of the 24x 8TB class, which would provide plenty of headroom (an additional ~100TB usable). We could probably also get away with two hosts of the 12x 4TB class (i.e. what thanos-be is now), though that wouldn't provide very much headroom.
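A rough sanity check on the usable-capacity figure. ASSUMPTION: this sketch assumes the object storage backend keeps 3 replicas of each object; the actual replica count and filesystem/operational overhead are not stated in this task, and the latter would reduce the figure further, which is roughly consistent with the "~100TB usable" estimate above:

```python
# Raw vs usable capacity for the proposed 2x hosts of the 24x 8TB class.
hosts, disks_per_host, disk_tb = 2, 24, 8
replication = 3  # ASSUMED replica count, not stated in the task

raw_tb = hosts * disks_per_host * disk_tb
usable_tb = raw_tb / replication  # before filesystem/operational overhead
print(f"raw: {raw_tb} TB, usable (pre-overhead): {usable_tb:.0f} TB")
```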
Titan hosts storage
The titan hosts run the block-compaction processes described above and require temporary space to write compacted blocks to disk before upload. The hosts have been managing, though they occasionally run tight on disk space; for this reason we should procure additional SSDs to install in them and get ahead of the curve.
Hardware needs
We'll need 2x SSDs per host (across 4x hosts), so a total of 8x SSDs of 500GB capacity or greater, to install in the already existing hosts.