Add replication lag (and other checks) to all misc hosts
Closed, Resolved, Public

Description

Right now, only multi-instance misc hosts have the replication lag check.
Review the other replication-related checks and add them to all misc hosts, as they should exist on both multi-instance and non-multi-instance hosts.
Also check that the read-only checks are in place.
The replication checks should be more lenient than in production (e.g. a larger lag buffer) to avoid alert spam.

Example:
db1117 does have them
db1135 and db2062 do not have them
db2065 has them
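
To illustrate what "a larger lag buffer" could mean in practice, here is a minimal sketch of a lag check with separate threshold profiles. It is not the actual check deployed by the puppet patches: the names `LagThresholds`, `PRODUCTION`, `MISC` and `classify_lag`, and all threshold values, are assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LagThresholds:
    """Replication lag limits, in seconds (hypothetical structure)."""
    warning: float
    critical: float


# Hypothetical values, chosen only to show misc being more lenient
# than production; they are not taken from the real configuration.
PRODUCTION = LagThresholds(warning=5.0, critical=10.0)
MISC = LagThresholds(warning=60.0, critical=300.0)


def classify_lag(lag_seconds: Optional[float], thresholds: LagThresholds) -> str:
    """Map a measured lag to a monitoring state name."""
    if lag_seconds is None:
        # No lag value available, e.g. replication is not running.
        return "UNKNOWN"
    if lag_seconds >= thresholds.critical:
        return "CRITICAL"
    if lag_seconds >= thresholds.warning:
        return "WARNING"
    return "OK"
```

With these assumed values, a 30-second lag would be critical under the production profile but still OK under the misc profile, which is the kind of difference that avoids alert spam on misc hosts.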

Event Timeline

Marostegui triaged this task as Medium priority. Nov 11 2019, 9:14 AM
Marostegui moved this task from Triage to Backlog on the DBA board.

Change 594905 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] mariadb: Enable read_only monitoring on misc hosts

https://gerrit.wikimedia.org/r/594905

Change 594905 merged by Jcrespo:
[operations/puppet@production] mariadb: Enable read_only monitoring on misc hosts

https://gerrit.wikimedia.org/r/594905

Change 595143 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] mariadb: Add read_only monitoring to other misc dbs: tendril, phab, event

https://gerrit.wikimedia.org/r/595143

Change 595143 merged by Jcrespo:
[operations/puppet@production] mariadb: Add read_only monitoring to other misc dbs: tendril, phab, event

https://gerrit.wikimedia.org/r/595143

Change 595145 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] mariadb: Add replication monitoring to misc hosts

https://gerrit.wikimedia.org/r/595145

Change 595145 merged by Jcrespo:
[operations/puppet@production] mariadb: Add replication monitoring to misc hosts

https://gerrit.wikimedia.org/r/595145

Change 595149 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] monitoring: remove usages of 'dba' contact group

https://gerrit.wikimedia.org/r/595149

Change 595149 merged by Jcrespo:
[operations/puppet@production] monitoring: remove usages of 'dba' contact group

https://gerrit.wikimedia.org/r/595149

jcrespo claimed this task.

This is now fixed.