Etcd is progressively moving towards removing the version 2 datastore. We will eventually need to switch conftool to etcd v3, which should also allow us to use native replication.
Sadly, the transition will require a lot of work, which I will try to summarize here. There are also two main ways to approach the integration: native gRPC, or the gRPC/JSON gateway.
Native grpc interface
- Allow access to the etcdv3 grpc interface via nginx.
- Fork or take over maintenance of the gRPC Python 3 client currently at https://github.com/kragniz/python-etcd3, which looks unmaintained at the moment
- Verify auth works
GRPC gateway
- Check that access via nginx works as intended
- Create a Python client for the v3 HTTP API. It doesn't need to be complete; we just need the CRUD parts
- Verify auth works
We then need to integrate etcd v3 into conftool so that it can write to both datastores:
- Add an etcd3 backend to conftool
- Add a "proxy" backend to conftool that can write to multiple backends
- Start writing to both backends
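The "proxy" backend idea could look roughly like the sketch below. The `Backend` interface here is hypothetical and much smaller than conftool's real driver API, and `MemoryBackend` is only a stand-in for the actual v2 and v3 backends. The choice to read from the first backend is also an assumption, not a decision.

```python
from typing import Dict, List, Optional, Protocol


class Backend(Protocol):
    """Hypothetical minimal backend interface (not conftool's actual one)."""

    def read(self, key: str) -> Optional[str]: ...
    def write(self, key: str, value: str) -> None: ...
    def delete(self, key: str) -> None: ...


class MemoryBackend:
    """In-memory stand-in for a real etcd v2 or v3 backend."""

    def __init__(self) -> None:
        self.data: Dict[str, str] = {}

    def read(self, key: str) -> Optional[str]:
        return self.data.get(key)

    def write(self, key: str, value: str) -> None:
        self.data[key] = value

    def delete(self, key: str) -> None:
        self.data.pop(key, None)


class ProxyBackend:
    """Reads from the first backend; writes and deletes go to all, in order."""

    def __init__(self, backends: List[Backend]) -> None:
        if not backends:
            raise ValueError("at least one backend is required")
        self.backends = list(backends)

    def read(self, key: str) -> Optional[str]:
        return self.backends[0].read(key)

    def write(self, key: str, value: str) -> None:
        # An exception from an earlier backend stops the later writes,
        # so the first backend in the list is effectively authoritative.
        for backend in self.backends:
            backend.write(key, value)

    def delete(self, key: str) -> None:
        for backend in self.backends:
            backend.delete(key)
```

With something like `ProxyBackend([v2, v3])`, callers keep using a single backend object while both datastores receive every write. How to handle a write that succeeds on one backend and fails on the other is left open here.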
And finally, we'll need to convert clients:
- Confd supports etcdv3 natively; we'll still have to find out if it's 100% compatible
- MediaWiki will probably need to be enabled to call the v3 grpc gateway
- We might want to migrate pybal, and in that case I'd still use the v3 grpc gateway
Status update (June 2025)
Preparatory work has started in advance of introducing an etcd v3 backend driver to conftool. Specifically, this involves:
- Documenting the API semantics of the existing v2 driver
- Implementing conformance tests to assert that those semantics are honored (the v3 driver will be subject to the same tests)
- Resolving under-specified or inconsistent behaviors in the v2 driver (in progress)
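As an illustration of the conformance-test approach, the sketch below runs one shared set of checks against any driver factory. The driver interface, the `DictDriver` stand-in, and the specific semantics being asserted (for example, that a missing key reads as `None`) are assumptions made up for this example; they are not the semantics actually documented for the v2 driver.

```python
from typing import Callable, Dict, List, Optional


class DictDriver:
    """Hypothetical in-memory driver, used only to exercise the checks."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def read(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def write(self, key: str, value: str) -> None:
        self._data[key] = value

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


def check_read_after_write(driver) -> None:
    driver.write("/test/a", "1")
    assert driver.read("/test/a") == "1"


def check_overwrite(driver) -> None:
    driver.write("/test/a", "1")
    driver.write("/test/a", "2")
    assert driver.read("/test/a") == "2"


def check_missing_key_reads_as_none(driver) -> None:
    assert driver.read("/test/missing") is None


def check_delete_is_idempotent(driver) -> None:
    driver.write("/test/a", "1")
    driver.delete("/test/a")
    driver.delete("/test/a")  # second delete must not raise
    assert driver.read("/test/a") is None


CHECKS = [
    check_read_after_write,
    check_overwrite,
    check_missing_key_reads_as_none,
    check_delete_is_idempotent,
]


def run_conformance(make_driver: Callable[[], object]) -> List[str]:
    """Run every check against a fresh driver; return the names of failed checks."""
    failures = []
    for check in CHECKS:
        try:
            check(make_driver())
        except AssertionError:
            failures.append(check.__name__)
    return failures
```

Because the checks only depend on a factory, the same suite can be pointed at the v2 driver today and at a v3 or dual-write driver later. In a real test suite this would more likely be a parametrized pytest fixture than a hand-rolled runner.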
With that complete, we'll be ready to introduce a v3 driver implementing the same API, and later a (temporary) dual-write driver to support the migration (i.e., sequencing writes across v2 and v3 depending on the migration phase). We have not yet made a final decision on the specific v3 Python client to use. One option under consideration is something like etcd3-client-lite, which targets the HTTP gRPC gateway.
I'll soon open a separate subtask to specifically capture discussion on tradeoffs between various client options.
More generally, there already exists a detailed migration design drafted in mid-2024. I'll aim to start incrementally refreshing that and breaking out other key decisions into subtasks (covering, e.g., the auth model, MediaWiki support, and migration phases).