Migrate tools-checker system to Stretch
Closed, Resolved; Public

Description

Migrate the toolschecker monitoring system to Debian Stretch instances in the eqiad1-r region, with the Stretch job grid as the target for any grid engine monitoring.

Event Timeline

bd808 triaged this task as High priority. Mar 25 2019, 11:28 PM
bd808 created this task.

Apparently the check for the wikilabels DB is functioning but not added here yet:
https://github.com/wikimedia/puppet/blob/production/modules/icinga/manifests/monitor/toollabs.pp

It does currently work: http://checker.tools.wmflabs.org/labsdb/wikilabelsrw

Just adding it to the pile.

Change 500095 had a related patch set uploaded (by BryanDavis; owner: Bryan Davis):
[operations/puppet@production] wmcs: Migrate tools-checker to Stretch

https://gerrit.wikimedia.org/r/500095

Mentioned in SAL (#wikimedia-cloud) [2019-03-29T20:22:38Z] <bd808> Disabled puppet on tools-checker-0{1,2} to make testing new role::wmcs::toolforge::checker easier (T219243)

Mentioned in SAL (#wikimedia-cloud) [2019-03-29T20:32:36Z] <bd808> Creating tools-checker-03 with role::wmcs::toolforge::checker (T219243)

Mentioned in SAL (#wikimedia-cloud) [2019-04-01T19:43:16Z] <bd808> Shutdown tools-checker-02 via Horizon (T219243)

Mentioned in SAL (#wikimedia-cloud) [2019-04-01T19:44:03Z] <bd808> Deleted tools-checker-02 via Horizon (T219243)

Mentioned in SAL (#wikimedia-cloud) [2019-04-02T03:54:58Z] <bd808> Added etcd service group to tools-k8s-etcd-* (T219243)

Mentioned in SAL (#wikimedia-operations) [2019-04-02T12:11:42Z] <arturo> icinga downtime toolschecker for 1 month T219243

Mentioned in SAL (#wikimedia-cloud) [2019-04-02T12:11:56Z] <arturo> icinga downtime toolschecker for 1 month T219243

Change 500095 merged by Andrew Bogott:
[operations/puppet@production] wmcs: Migrate tools-checker to Stretch

https://gerrit.wikimedia.org/r/500095

Change 501034 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] toolschecker: Typo fix

https://gerrit.wikimedia.org/r/501034

Change 501034 merged by Bstorm:
[operations/puppet@production] toolschecker: Typo fix

https://gerrit.wikimedia.org/r/501034