
Investigate high-availability and managed failover mechanisms for the analytics_meta MariaDB instances
Open, Medium, Public

Description

We currently have MariaDB running in support of the Analytics Meta services.

These databases support the following subsystems:

  • Hive metastore
  • DataHub
  • Druid (both analytics and public clusters)
  • Superset (both production and next instances)
  • Hue (soon to be deprecated)

These databases run on an-mariadb100[1-2], both of which currently run MariaDB 10.4.
Under normal conditions, an-mariadb1002 is a replica of an-mariadb1001.
The databases are backed up to db1208 and from there to Bacula.

We do not currently have a mechanism in place that would allow us to swap these roles easily. Such a mechanism would facilitate routine operational work on an-mariadb1001.
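To make concrete what such a mechanism would need to automate, here is a minimal sketch of the statement sequence for a manual role swap. It assumes GTID-based replication, which this task does not confirm, and it only builds the plan without connecting to any host. The `Step` and `switchover_plan` names are hypothetical, and replication credentials, ports, and application-side connection changes are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    host: str  # host on which the statement would be run
    sql: str   # MariaDB statement to execute there


def switchover_plan(primary: str, replica: str) -> list[Step]:
    """Return the ordered statements for a manual primary/replica role swap.

    Assumption: GTID-based replication is in use (not confirmed by the task).
    """
    return [
        # 1. Stop accepting writes on the current primary.
        Step(primary, "SET GLOBAL read_only = 1;"),
        # 2. Read both positions; an operator would wait here until the
        #    replica's position has caught up with the primary's.
        Step(primary, "SELECT @@gtid_binlog_pos;"),
        Step(replica, "SELECT @@gtid_slave_pos;"),
        # 3. Detach the replica and open it for writes.
        Step(replica, "STOP SLAVE;"),
        Step(replica, "RESET SLAVE ALL;"),
        Step(replica, "SET GLOBAL read_only = 0;"),
        # 4. Point the old primary at the new one. Credentials and port are
        #    omitted; current_pos is used because this host was a primary.
        Step(
            primary,
            f"CHANGE MASTER TO MASTER_HOST='{replica}', "
            "MASTER_USE_GTID=current_pos;",
        ),
        Step(primary, "START SLAVE;"),
    ]


if __name__ == "__main__":
    # Short host names as used in this task; real FQDNs are not assumed.
    for step in switchover_plan("an-mariadb1001", "an-mariadb1002"):
        print(f"{step.host}: {step.sql}")
```

Each of these steps is currently a manual action with its own failure modes, which is the gap a managed failover option would have to close.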

Acceptance Criteria
  • Identify one or more options for improving high availability and managed failover for Analytics Meta, so that we can decide whether to proceed to an implementation.