Set up shinken for tools-mail exim paniclog
Open, Low, Public

Description

Having this in normal monitoring is probably more useful than the barrage of e-mails we had.

We can either extend the Diamond Exim collector:

https://github.com/python-diamond/Diamond/wiki/collectors-EximCollector

or write a simple script to use with Diamond's user scripts support:

https://github.com/python-diamond/Diamond/wiki/Example-Userscripts

We configure Diamond in Puppet with:

https://github.com/wikimedia/operations-puppet/blob/d28b0e3684fa05c7acaaaa1f7ec7f424f99b5eb5/modules/diamond/manifests/collector.pp
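A minimal sketch of the "simple script" option, assuming the script is expected to print one `metric.name value` line per metric on stdout. The paniclog path (the usual Debian location) and the metric name `exim.paniclog_lines` are assumptions for illustration, not taken from the patches on this task.

```python
#!/usr/bin/env python3
"""Print the number of lines in the Exim paniclog as a single metric line."""

# Assumption: Debian-style Exim log location; adjust for the actual host.
PANICLOG_PATH = "/var/log/exim4/paniclog"

# Hypothetical metric name, chosen only for this sketch.
METRIC_NAME = "exim.paniclog_lines"


def count_paniclog_lines(path):
    """Return the number of lines in the paniclog, or 0 if the file is absent."""
    try:
        with open(path, "rb") as handle:
            return sum(1 for _ in handle)
    except FileNotFoundError:
        # No paniclog file means Exim has not logged a panic.
        return 0


def format_metric(name, value):
    """Format one metric as a 'name value' line."""
    return "{} {}".format(name, value)


if __name__ == "__main__":
    print(format_metric(METRIC_NAME, count_paniclog_lines(PANICLOG_PATH)))
```

Counting lines rather than only testing for the file's existence keeps the metric useful for graphing, since a growing paniclog shows up as a rising value.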

Event Timeline

valhallasw set the priority of this task to Needs Triage.
valhallasw updated the task description.
valhallasw added a project: Toolforge.
valhallasw added a subscriber: valhallasw.
Restricted Application added a subscriber: Aklapper. · Apr 22 2015, 7:22 PM
yuvipanda renamed this task from Set up ganglia/icinga for tools-mail exim paniclog to Set up graphite/icinga for tools-mail exim paniclog. · Apr 22 2015, 7:27 PM
yuvipanda set Security to None.
valhallasw renamed this task from Set up graphite/icinga for tools-mail exim paniclog to Set up diamond/graphite/shinken for tools-mail exim paniclog. · Apr 22 2015, 7:53 PM
valhallasw updated the task description.

Change 206118 had a related patch set uploaded (by Merlijn van Deen):
Extend Exim diamond collector for Tool Labs

https://gerrit.wikimedia.org/r/206118

valhallasw triaged this task as Low priority. · Apr 27 2015, 12:30 PM

Change 206118 merged by Yuvipanda:
Extend Exim diamond collector for Tool Labs

https://gerrit.wikimedia.org/r/206118

Change 207043 had a related patch set uploaded (by Merlijn van Deen):
Extend Exim diamond collector for Tool Labs

https://gerrit.wikimedia.org/r/207043

Change 207043 merged by Yuvipanda:
tools: Extended Exim diamond collector for Tool Labs

https://gerrit.wikimedia.org/r/207043

Suggested thresholds: paniclog >= 1; num_frozen warn > 15 (or so? not sure), critical > 50. It's at 53 right now, and we need to fix that, because it means mails aren't going out to the right people.
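The thresholds above could be evaluated along these lines. This is only a sketch: the function name `classify` is hypothetical, treating a non-empty paniclog as critical is an assumption (the comment does not give it a severity), and the exit codes follow the usual Nagios-style plugin convention rather than anything stated on this task.

```python
# Nagios-style plugin exit codes (standard convention: 0 OK, 1 WARNING, 2 CRITICAL).
EXIT_CODES = {"OK": 0, "WARNING": 1, "CRITICAL": 2}


def classify(paniclog_lines, num_frozen, frozen_warn=15, frozen_crit=50):
    """Map the two metrics to a check state using the proposed thresholds."""
    # Assumption: any paniclog content at all is treated as critical.
    if paniclog_lines >= 1:
        return "CRITICAL"
    if num_frozen > frozen_crit:
        return "CRITICAL"
    if num_frozen > frozen_warn:
        return "WARNING"
    return "OK"


if __name__ == "__main__":
    # The value quoted in the comment: 53 frozen messages, empty paniclog.
    state = classify(paniclog_lines=0, num_frozen=53)
    print(state, EXIT_CODES[state])
```

With the figure quoted above (53 frozen messages), the check is already past the proposed critical threshold, so it would alert from the moment it is deployed until the frozen mails are cleared.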

valhallasw renamed this task from Set up diamond/graphite/shinken for tools-mail exim paniclog to Set up shinken for tools-mail exim paniclog. · Jul 2 2015, 7:55 PM
Restricted Application added a project: Cloud-Services. · Jul 2 2015, 7:55 PM

Change 378105 had a related patch set uploaded (by Madhuvishy; owner: Madhuvishy):
[operations/puppet@production] toollabs: Add shinken check for tools-mail exim queue length

https://gerrit.wikimedia.org/r/378105

Change 378105 merged by Rush:
[operations/puppet@production] toollabs: Add shinken check for tools-mail exim queue length

https://gerrit.wikimedia.org/r/378105

Change 378186 had a related patch set uploaded (by Rush; owner: cpettet):
[operations/puppet@production] toolforge: adjust comment on tools-mail queue check

https://gerrit.wikimedia.org/r/378186

Change 378186 merged by Rush:
[operations/puppet@production] toolforge: adjust comment on tools-mail queue check

https://gerrit.wikimedia.org/r/378186