Follow up from https://gerrit.wikimedia.org/r/c/operations/dns/+/808198
Looking at making Netbox frontends active/active between eqiad and codfw.
While Netbox used to use the central Redis instance (see doc), it was moved to a local one in rOPUP461ff2f55b37 (netbox: Adjust settings for supporting Netbox 2.9 series),
as newer redis features are required and the redis servers previously depended on are not of a sufficiently new version.
Netbox's documentation says:
NetBox v2.9.0 and later require Redis v4.0 or higher.
Meanwhile, rdb misc now runs:
Redis server v=6.0.14
So we might be able to use a central Redis server again.
@akosiaris are there any reasons on the RDB side that would prevent us from using it for Netbox? For example the latency between eqiad and codfw? Or maybe it's just not made for that.
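To put a number on the latency question, a rough probe along these lines could be run from a Netbox frontend in each site. It is only a sketch: the host name in the usage note is a placeholder, it assumes a plain TCP (non-TLS) Redis port, and it relies on Redis accepting inline commands such as a bare PING. An auth error reply still counts as a completed round trip.

```python
import socket
import statistics
import time


def ping_rtt_ms(host, port=6379, samples=20, timeout=2.0):
    """Return round-trip times in milliseconds for inline Redis PING commands."""
    rtts = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        for _ in range(samples):
            start = time.perf_counter()
            sock.sendall(b"PING\r\n")  # inline command, no RESP framing needed
            reply = sock.recv(64)
            rtts.append((time.perf_counter() - start) * 1000)
            # "+PONG" on success, "-..." if the server demands auth first
            if not reply.startswith((b"+PONG", b"-")):
                raise RuntimeError(f"unexpected reply: {reply!r}")
    return rtts


def summarize(rtts):
    """Reduce a list of round-trip times to min / median / max."""
    return {
        "min": min(rtts),
        "median": statistics.median(rtts),
        "max": max(rtts),
    }
```

Usage would be something like `summarize(ping_rtt_ms("rdb-host.example"))`, run once from eqiad and once from codfw, then comparing the medians.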
Some additional thoughts:
- Allowing active/active should improve performance and, since every Netbox frontend would see some traffic, confirm that they are all healthy
- One unknown is whether it will degrade performance on the frontend that is not local to the DB primary.
- A centrally managed cluster is preferable: it benefits from a team's expertise, avoids re-inventing the wheel, and scales better. The alternative, a new Redis cluster spanning the Netbox frontends, would make day-to-day management and upgrades more complex
- I don't think there is a risk of circular dependency (Redis is broken and the only way to fix it is through Netbox, which requires Redis). But if it's a real risk, we could document a way to switch back to a local instance (e.g. a Hiera config knob) in an emergency
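
As an illustration of the emergency knob, the Redis section of Netbox's configuration.py could be driven by a single boolean. This is a sketch under assumptions, not the current Puppet template: the `use_central` flag, the `redis_settings` helper and the central host name are all hypothetical, and in practice the value would be rendered from Hiera. The `tasks`/`caching` layout is meant to follow Netbox's documented REDIS setting for the 2.9 series and should be checked against the deployed version.

```python
def redis_settings(use_central, central_host="redis-central.example", password=""):
    """Build a Netbox-style REDIS dict pointing at the central or the local Redis.

    use_central and central_host are hypothetical names for this sketch.
    """
    host = central_host if use_central else "localhost"

    def entry(database):
        # One connection block per Netbox Redis use (task queue, cache)
        return {
            "HOST": host,
            "PORT": 6379,
            "PASSWORD": password,
            "DATABASE": database,
            "SSL": False,
        }

    # Separate database numbers keep the task queue and the cache apart
    return {"tasks": entry(0), "caching": entry(1)}


# Emergency fallback: flip the flag to False to return to the local instance
REDIS = redis_settings(use_central=False)
```

Keeping the switch in one place means the fallback is a one-line config change followed by a Netbox restart, with no dependency on Netbox itself being up.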