This task covers setting up a new storage cluster. The expectation is that it will take on some of the storage use cases currently served by the Thanos and ms Swift clusters. It will demonstrate Ceph's multi-site capability, providing a single S3 endpoint backed by two storage clusters, one per DC, with data replicated between them. Like the existing Swift clusters, it will keep costs down by placing the bulk storage on HDDs.
Some existing uses of Swift are tracked, with more details, at T264291: Swift users and their usage.
- Evaluate whether the cluster needs encrypted backend traffic across datacenters (likely IPsec)
- Decide on initial storage policies (replication factor, SSD/HDD, site-local vs global, which should be the default, etc.)
- Bring frontends online: T275513 T275511
- Bring backends online: T276642 T276637
- Bring up service IPs / LVS and certs
- Bring up dashboards/monitoring/alerting
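To help frame the storage policy decision in the list above, here is a minimal sketch of how candidate policies could be compared by raw capacity cost. The `StoragePolicy` class, the policy names, and the replica counts are all hypothetical and for illustration only; nothing here has been decided, and the sketch assumes plain replication (no erasure coding).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoragePolicy:
    """Hypothetical description of one candidate storage policy."""
    name: str
    replicas_per_site: int  # copies kept within a single cluster
    sites: int              # 1 = site-local, 2 = replicated to both DCs
    media: str              # "hdd" or "ssd"

    def raw_bytes_needed(self, logical_bytes: int) -> int:
        """Raw capacity consumed across all sites for the given logical data."""
        return logical_bytes * self.replicas_per_site * self.sites


# Illustrative candidates only; the real policies are still to be decided.
CANDIDATES = [
    StoragePolicy("global-hdd-3x", replicas_per_site=3, sites=2, media="hdd"),
    StoragePolicy("local-hdd-3x", replicas_per_site=3, sites=1, media="hdd"),
]


def compare(logical_bytes: int) -> dict[str, int]:
    """Map each candidate policy name to the raw bytes it would consume."""
    return {p.name: p.raw_bytes_needed(logical_bytes) for p in CANDIDATES}
```

Under these assumptions, a globally replicated policy costs twice the raw capacity of its site-local equivalent, which is one input into choosing which policy should be the default.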
Once the service/cluster is up, we can start migrating users and use cases (in a different task, TBD).