Mjolnir query daemon should monitor for active cluster and only query the standby cluster
Closed, Resolved (Public)

Description

This query daemon collects training data for mjolnir from the elasticsearch cluster that is on warm standby. Currently mjolnir attempts to verify that the cluster is idle by asking the cluster directly about its load, but this is fragile. Additionally, the planned closing of the firewall hole between analytics and prod elasticsearch will make this check even more difficult.

Ideally the analytics side shouldn't know or care about which cluster is active. We could instead have the daemons monitor some authoritative source of information (etcd?) about which cluster is active. If that changes from, for example, eqiad to codfw, the daemons in codfw would stop consuming. The daemons in eqiad would monitor the cluster APIs, wait for an idle state, and then start consuming.
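
A rough sketch of the per-datacenter decision described above, to make the proposed behaviour concrete. This is not mjolnir code: `ConsumerGate`, `get_active_dc` and `is_cluster_idle` are hypothetical names, and how the active cluster is actually read (etcd or otherwise) and how idleness is measured are left as injected callables.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ConsumerGate:
    """Decides whether the daemon in one datacenter may consume queries."""

    local_dc: str
    # Hypothetical hook: reads the active datacenter from the authoritative source.
    get_active_dc: Callable[[], str]
    # Hypothetical hook: asks the local cluster APIs whether the cluster is idle.
    is_cluster_idle: Callable[[], bool]
    consuming: bool = False

    def tick(self) -> bool:
        """Re-evaluate state once; returns True if the daemon should consume."""
        if self.get_active_dc() == self.local_dc:
            # The local cluster now serves production traffic: stop right away.
            self.consuming = False
        elif not self.consuming:
            # The local cluster is on standby: start only once it has gone idle.
            self.consuming = self.is_cluster_idle()
        # Already consuming on a standby cluster: keep going. The idle check is
        # not repeated because the daemon's own queries generate load.
        return self.consuming


# Simulated switch of the active cluster from eqiad to codfw.
state = {"active": "eqiad", "idle": {"eqiad": False, "codfw": True}}
eqiad = ConsumerGate("eqiad", lambda: state["active"], lambda: state["idle"]["eqiad"])
codfw = ConsumerGate("codfw", lambda: state["active"], lambda: state["idle"]["codfw"])

before = (eqiad.tick(), codfw.tick())        # eqiad active, codfw idle standby
state["active"] = "codfw"
after_switch = (eqiad.tick(), codfw.tick())  # eqiad still draining traffic
state["idle"]["eqiad"] = True
after_drain = (eqiad.tick(), codfw.tick())   # eqiad idle, may start consuming
```

The daemon would call `tick()` periodically, or on change notifications from the authoritative source if it supports them. Skipping the idle check once consumption has started is a choice made for this sketch, not something the task specifies.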