We should enable paging for multi-instance slaves, as right now they only alert on IRC.
We had one case of a multi-instance slave lagging and, given how the LB does (not) work (T180918), it caused an outage on Wikidata (T198049).
Pages should come if:
- The number of running processes is lower than the number defined in hiera
- Replication is broken
- Replication is lagging
But only under the following conditions:
- We are in an active-passive setup:
  - Core single-instance and multi-instance slaves should page for replication status and lag only on the primary datacenter
  - They should just warn on IRC for the passive datacenters
- We are in an active-active setup:
  - They should page for replication status and lag on all active datacenters
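
The rules above could be sketched roughly as follows. This is only an illustration of the decision logic, not the actual check: the names `InstanceStatus`, `alert_level` and `max_lag_seconds`, the 300-second default and the datacenter names are all hypothetical. It also assumes the process-count check follows the same datacenter rule as replication status and lag, which the list above does not state explicitly. Active-passive is modelled as a set containing only the primary datacenter.

```python
from dataclasses import dataclass


@dataclass
class InstanceStatus:
    """Hypothetical snapshot of one slave instance's health."""
    datacenter: str
    running_processes: int
    expected_processes: int   # the value that would come from hiera
    replication_running: bool
    lag_seconds: float


def alert_level(status, active_datacenters, max_lag_seconds=300.0):
    """Return "ok", "irc" or "page" for one instance.

    active_datacenters holds only the primary datacenter in an
    active-passive setup, and every active datacenter in an
    active-active setup. The lag threshold is an assumed placeholder.
    """
    unhealthy = (
        status.running_processes < status.expected_processes
        or not status.replication_running
        or status.lag_seconds > max_lag_seconds
    )
    if not unhealthy:
        return "ok"
    # Page only where traffic is actually served; elsewhere keep the
    # current behaviour of warning on IRC.
    if status.datacenter in active_datacenters:
        return "page"
    return "irc"


# Example: the same lagging instance in an active-passive setup where
# "dc-a" (placeholder name) is primary and "dc-b" is passive.
lagging_primary = InstanceStatus("dc-a", 2, 2, True, 900.0)
lagging_passive = InstanceStatus("dc-b", 2, 2, True, 900.0)
print(alert_level(lagging_primary, {"dc-a"}))
print(alert_level(lagging_passive, {"dc-a"}))
```

With this shape, moving to active-active only means passing both datacenters in `active_datacenters`; the per-instance logic stays the same.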