We have identified a need for a new storage platform to serve teams building data-centric products and features.
The platform should offer both block storage and object storage, and should be suitable for both analytics and production-oriented workloads.
We are therefore planning to build an MVP of such a platform in Q2 of the 2022/2023 financial year, with the Data-Engineering team taking primary responsibility for its design and implementation. Consultation and close collaboration with the Data Persistence and Infrastructure Foundations teams will be essential to ensure that the design meets the requirements and that the expected traffic profile is compatible with our network topology.
The end goal is to build a scalable platform that facilitates self-service provisioning of data infrastructure across many teams and for a wide variety of requirements. The MVP should be designed so that, once its value has been proven, it can be promoted to a production-class service without a full rebuild.
Some key use cases include:
- the ability to support Persistent Volume Claims in Kubernetes, such that we can begin to deploy stateful services on k8s
- the ability to provide block storage to virtual machines, for enhanced flexibility in designing data processing systems on VMs
- the ability to provide S3 and/or Swift compatible object storage as a back-end for analytics and similar workloads
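To make the first use case more concrete, here is a minimal sketch of what a Persistent Volume Claim request from a stateful service might look like. It only builds the standard Kubernetes `PersistentVolumeClaim` manifest as JSON. The helper name `build_pvc_manifest`, the claim name `example-data`, and the storage class `example-block-storage` are hypothetical placeholders, since no storage class has been defined for this platform yet.

```python
import json


def build_pvc_manifest(name, size_gib, storage_class, access_mode="ReadWriteOnce"):
    """Return a Kubernetes PersistentVolumeClaim manifest as a plain dict.

    storage_class is a hypothetical name; the real class would be whatever
    the new platform ends up exposing to the cluster.
    """
    if size_gib <= 0:
        raise ValueError("size_gib must be positive")
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": [access_mode],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": f"{size_gib}Gi"}},
        },
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    manifest = build_pvc_manifest("example-data", 10, "example-block-storage")
    print(json.dumps(manifest, indent=2))
```

The resulting JSON could be passed to `kubectl apply -f -`, assuming a storage class backed by the new platform exists under the chosen name.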
In this phase we are only looking at building this platform in eqiad, although we should always consider how it would scale to a multi-DC and/or cross-DC design.