==== Current status
- an-coord1001 runs the '[[ https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Mysql_Meta | analytics meta ]]' MariaDB master instance. This instance has [[ https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Mysql_Meta#Databases | several databases ]] for Analytics Cluster operations.
- an-coord1002 runs a standby replica of this instance, but in the case of a failure, switching to an-coord1002 is an error-prone, manual process.
- matomo1002 runs a MariaDB instance for the 'piwik' database.
- db1108 runs backup replicas of the analytics-meta and matomo MariaDB instances, and Bacula is used to keep historical backups.
- As described in {T279440}, the replicas do not match the masters.
- Relevant MariaDB configs do not necessarily match between masters and replicas.
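
One way to surface the config drift described above is to diff the server variables of a master and its replica. The following is a minimal sketch, assuming the output of `SHOW GLOBAL VARIABLES` from each host has already been collected into a plain dict of name to value. The function name `diff_variables`, the ignore list, and the sample values are hypothetical and are not taken from the actual hosts.

```python
from typing import Dict, Optional, Set, Tuple

# Variables that are expected to differ per host.
# This ignore list is an illustrative assumption, not a vetted one.
DEFAULT_IGNORE: Set[str] = {"server_id", "hostname"}


def diff_variables(
    master: Dict[str, str],
    replica: Dict[str, str],
    ignore: Set[str] = DEFAULT_IGNORE,
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return {variable: (master_value, replica_value)} for every mismatch.

    A value of None means the variable was missing on that side.
    """
    drift: Dict[str, Tuple[Optional[str], Optional[str]]] = {}
    for name in sorted(set(master) | set(replica)):
        if name in ignore:
            continue
        master_value = master.get(name)
        replica_value = replica.get(name)
        if master_value != replica_value:
            drift[name] = (master_value, replica_value)
    return drift


if __name__ == "__main__":
    # Made-up sample data standing in for SHOW GLOBAL VARIABLES output.
    sample_master = {"server_id": "1", "binlog_format": "ROW", "max_connections": "500"}
    sample_replica = {"server_id": "2", "binlog_format": "MIXED", "max_connections": "500"}
    for name, (m, r) in diff_variables(sample_master, sample_replica).items():
        print(f"{name}: master={m} replica={r}")
```

Some differences between a master and a replica are intentional, so a real check would likely need a longer ignore list than the one assumed here.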
==== Desired status
(as described in T280905)
- Dedicated DB hardware to be ordered in Q1 FY2021-2022 to replace an-coord100[12]: an-db100[12].
- All DBs moved to the analytics-meta master running on an-db1001, including matomo. (This will allow us to remove the MariaDB master instance on matomo1002 and the extra backup replica instance on db1108.)
- Instance configs DRYed and standardized so we don't end up with misconfiguration problems like {T279440} again.
- an-db1002 and db1108 fully recreated from master snapshot.
---
Originally, this ticket was about setting up multiple master instances and being able to fail over individual MariaDB database instances. However, we discovered that Data Persistence does not really support MariaDB multi-instance master setups, and our reasons for wanting them aren't that compelling. Most of the time, failovers will be manual and done for hardware reasons, meaning all DBs would have to be failed over anyway. Having many master setups means more replicas and binlogs to manage, which makes that kind of maintenance harder, not easier. Ideally each app's DB would be totally isolated from the others, but to do this properly we will have to wait until we perhaps one day get persistent volumes in k8s.
For now we are going with a single analytics-meta instance for all databases.