
2024-2025 ms swift capacity
Open, Needs Triage, Public

Description

In the 12 months to date, codfw ms swift has grown from 728 TiB to 883 TiB (dashboard), an increase of 155 TiB.
In that time, eqiad ms swift has grown from 755 TiB (as of 2023-03-12) to 915 TiB (dashboard), an increase of 160 TiB.

We have 15 months until the end of the 2024-2025 FY, so a very approximate growth expectation might be 200 TiB (160 TiB per year, scaled by 15/12). That would leave eqiad storing 1115 TiB, i.e. using (1115 * 3) = 3345 TiB of raw capacity, or about 75% of capacity. I think that means we can get away without any ms swift expansion in the upcoming fiscal year, but we should expect to want to expand in the following FY.
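The arithmetic above can be reproduced with a short Python sketch. The function name `project_raw_usage` is hypothetical, the multiplier of 3 is read off the "(1115 * 3)" term and treated as a replica count, and the total raw capacity is only inferred from the "about 75%" figure rather than taken from the dashboard.

```python
REPLICAS = 3  # assumption: taken from the "* 3" in the raw-capacity figure


def project_raw_usage(current_tib, year_ago_tib, months_ahead, replicas=REPLICAS):
    """Linear projection from the last 12 months of growth.

    Returns (expected logical growth, projected logical usage,
    projected raw usage), all in TiB.
    """
    annual_growth = current_tib - year_ago_tib
    growth = annual_growth * months_ahead / 12
    logical = current_tib + growth
    return growth, logical, logical * replicas


# eqiad figures from the description: 755 TiB a year ago, 915 TiB now,
# 15 months until the end of the 2024-2025 FY.
growth, logical, raw = project_raw_usage(915, 755, 15)
print(growth, logical, raw)  # 200.0 1115.0 3345.0

# Inference only: if 3345 TiB is "about 75%", total raw capacity is roughly:
implied_capacity_tib = raw / 0.75
print(implied_capacity_tib)
```

Calling `project_raw_usage(883, 728, 15)` gives the equivalent projection for codfw. A straight-line extrapolation is the crudest possible model, which matches the "very approximate" framing of the estimate.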

The alternative argument would be: if we buy no hardware in the 2024-2025 FY, then by the end of the 2025-2026 FY (being pessimistic about how long buying hardware might take) we'd be in the region of 3945 TiB, which would be 89% of capacity. That is too high, so maybe we should get more hardware now. Following that line of argument, a new swift backend server is (24*8) = 192 TB (174 TiB), so one per DC would cover us for about 1 year's expected expansion.
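For reference, the unit conversion and the 2025-2026 figure can be checked with the sketch below. The helper `tb_to_tib` is a hypothetical name, and the extra 200 TiB of growth during the 2025-2026 FY is an assumption: it is the increment that makes 3345 TiB grow to the quoted 3945 TiB, not a number stated in the task.

```python
TB = 10**12   # decimal terabyte, as used for drive sizes
TIB = 2**40   # binary tebibyte, as used for the usage figures


def tb_to_tib(tb):
    """Convert decimal terabytes to binary tebibytes."""
    return tb * TB / TIB


# One new backend server: 24 drives of 8 TB each.
server_tb = 24 * 8
server_tib = tb_to_tib(server_tb)
print(server_tb, int(server_tib))  # 192 174

# Raw usage at the end of the 2025-2026 FY.
raw_end_fy2425_tib = 3345      # from the description
extra_logical_tib = 200        # assumption: one more year of similar growth
replicas = 3                   # assumption: the "* 3" used in the description
raw_end_fy2526_tib = raw_end_fy2425_tib + extra_logical_tib * replicas
print(raw_end_fy2526_tib)  # 3945
```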

Event Timeline

Additionally, we are retiring the last 9 12x4T nodes from eqiad and the last 6 12x4T nodes from codfw, and replacing them with 24x8T units.

After that refresh, but ignoring the proposed addition of one 24x8T server to each cluster, we will be left with 32 24x8T servers in eqiad and 30 24x8T servers in codfw, which is a significant difference (previously eqiad had 3 more 12x4T nodes, but codfw had 1 more 24x8T node, which roughly cancelled out).

So we should also expand codfw by 2 further 24x8T systems, to give equal capacity in both ms clusters.
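The balancing step can be written out as a small sketch. The post-refresh server counts come from the comment above; the variable names are illustrative, and capacity is taken as nominal drive size only (24 drives of 8 TB), ignoring filesystem overhead.

```python
SERVER_TB = 24 * 8  # nominal raw TB per 24x8T server

# Server counts after the refresh, before any proposed expansion.
servers = {"eqiad": 32, "codfw": 30}

# Bring every cluster up to the size of the largest one.
target = max(servers.values())
to_add = {dc: target - count for dc, count in servers.items()}
print(to_add)  # {'eqiad': 0, 'codfw': 2}

# Nominal raw capacity gap that the extra codfw servers close.
gap_tb = to_add["codfw"] * SERVER_TB
print(gap_tb)  # 384
```

If the proposed one-server-per-DC expansion also goes ahead, both clusters would end up with 33 servers under these numbers.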