Migrate tools-checker system to Stretch
Closed, Resolved · Public

Description

Migrate the toolschecker monitoring system to Debian Stretch instances in the eqiad1-r region, with the Stretch job grid as the target for any grid engine monitoring.

Details

Related Gerrit Patches:
[operations/puppet@production] toolschecker: Typo fix
[operations/puppet@production] wmcs: Migrate tools-checker to Stretch

Event Timeline

bd808 triaged this task as High priority. Mar 25 2019, 11:28 PM
bd808 created this task.
Restricted Application added a subscriber: Aklapper. Mar 25 2019, 11:28 PM
Bstorm added a subscriber: Bstorm. Mar 28 2019, 10:43 PM

Apparently the check for the wikilabels DB is functioning but has not been added here yet:
https://github.com/wikimedia/puppet/blob/production/modules/icinga/manifests/monitor/toollabs.pp

It does currently work: http://checker.tools.wmflabs.org/labsdb/wikilabelsrw

Just adding it to the pile.

Change 500095 had a related patch set uploaded (by BryanDavis; owner: Bryan Davis):
[operations/puppet@production] wmcs: Migrate tools-checker to Stretch

https://gerrit.wikimedia.org/r/500095

Mentioned in SAL (#wikimedia-cloud) [2019-03-29T20:22:38Z] <bd808> Disabled puppet on tools-checker-0{1,2} to make testing new role::wmcs::toolforge::checker easier (T219243)

Mentioned in SAL (#wikimedia-cloud) [2019-03-29T20:24:29Z] <bd808> Cherry-picked https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/500095/ to tools-puppetmaster-01 for testing (T219243)

Mentioned in SAL (#wikimedia-cloud) [2019-03-29T20:32:36Z] <bd808> Creating tools-checker-03 with role::wmcs::toolforge::checker (T219243)

Mentioned in SAL (#wikimedia-cloud) [2019-03-29T21:08:58Z] <bd808> Updated cherry-pick of https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/500095/ on tools-puppetmaster-01 (T219243)

Mentioned in SAL (#wikimedia-cloud) [2019-04-01T19:43:16Z] <bd808> Shutdown tools-checker-02 via Horizon (T219243)

Mentioned in SAL (#wikimedia-cloud) [2019-04-01T19:44:03Z] <bd808> Deleted tools-checker-02 via Horizon (T219243)

Mentioned in SAL (#wikimedia-cloud) [2019-04-02T03:54:58Z] <bd808> Added etcd service group to tools-k8s-etcd-* (T219243)

Mentioned in SAL (#wikimedia-operations) [2019-04-02T12:11:42Z] <arturo> icinga downtime toolschecker for 1 month T219243

Mentioned in SAL (#wikimedia-cloud) [2019-04-02T12:11:56Z] <arturo> icinga downtime toolschecker for 1 month T219243

Change 500095 merged by Andrew Bogott:
[operations/puppet@production] wmcs: Migrate tools-checker to Stretch

https://gerrit.wikimedia.org/r/500095

Change 501034 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] toolschecker: Typo fix

https://gerrit.wikimedia.org/r/501034

Change 501034 merged by Bstorm:
[operations/puppet@production] toolschecker: Typo fix

https://gerrit.wikimedia.org/r/501034

bd808 closed this task as Resolved. Apr 5 2019, 2:10 AM