I have not thought about whether this is hard or easy. But ideally, puppet could punish project admins for neglected instances -- this might encourage people to fix or delete them.
Related Objects
- Mentioned In
- T210432: puppet_alert.py not working
Event Timeline
Change 262856 had a related patch set uploaded (by Andrew Bogott):
WIP: Send email to project admins when puppet fails.
Change 262856 merged by Andrew Bogott:
Send email to project admins if puppet runs are failing.
OK... in projects 'testlabs' and 'puppet' I can do this:
echo "this is text" | mail -s 'subject line' andrewbogott@gmail.com
and it sends me an email. That doesn't work on an instance in 'tools' though. Merlijn predicted that I would have this problem :(
Why does it work in some projects and not others? Note that if it's /only/ tools that prevents emails, that's just fine since the Tools people are getting shinken notifications about puppet anyway. I just want to confirm that nags will get sent from other projects.
This works for me in an interactive session on tools-bastion-01. For grid jobs, @Anomie found out that this can fail under some circumstances (cf. https://wikitech.wikimedia.org/wiki/Help:Tool_Labs#Mail_from_tools) and you have to use /usr/sbin/exim -odf -i instead. Could you have encountered a similar problem?
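For reference, here is a minimal sketch of what handing a message directly to exim could look like, rather than going through mail. The function names (build_message, send_via_exim) and the EXIM_BIN override are made up for illustration and are not taken from the merged patch; only the /usr/sbin/exim -odf -i invocation comes from the comment above.

```shell
#!/bin/sh
# Hypothetical helper: print a minimal message (headers, blank line, body).
#   $1 = recipient, $2 = subject, $3 = body text
build_message() {
    printf 'To: %s\nSubject: %s\n\n%s\n' "$1" "$2" "$3"
}

# Hypothetical helper: pipe the message straight into exim.
#   -odf  deliver in the foreground, so failures surface immediately
#   -i    do not treat a line containing only "." as end of message
# EXIM_BIN is an assumed override; it defaults to the path quoted above.
send_via_exim() {
    build_message "$1" "$2" "$3" | "${EXIM_BIN:-/usr/sbin/exim}" -odf -i "$1"
}
```

Something like send_via_exim someone@example.org 'subject line' 'this is text' would then stand in for the echo ... | mail -s ... call. Whether this avoids the failure seen here is untested.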
The messages seem to be queued on the test host (tools-puppet-is-broken-here-on-purpose.tools.eqiad.wmflabs), and are not actually being sent out:
2016-01-16 06:50:41 1aJzIg-0002oE-Aa no IP address found for host polonium.wikimedia.org
2016-01-16 06:50:41 1aJzIg-0002oE-Aa == root@wmflabs.org R=smart_route defer (-1): lookup of host "polonium.wikimedia.org" failed in smart_route router
and there are 60 emails to andrewbogott@gmail.com waiting in /var/spool/exim4/input.
I'm not sure why the default labs email config doesn't work, but applying a tools manifest (which should set tools-mail as router instead of polonium) might help for this specific issue.
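To check whether other projects hit the same router problem, a rough filter like the one below could pull the unresolvable smarthost names out of the exim log. The function name is hypothetical, and the log location used in the usage note (/var/log/exim4/mainlog) is an assumption based on the usual Debian layout, not something confirmed for these instances.

```shell
#!/bin/sh
# Hypothetical helper: read exim log lines on stdin and print each host
# whose lookup failed, one per line, without duplicates.
failed_lookup_hosts() {
    sed -n 's/.*lookup of host "\([^"]*\)" failed.*/\1/p' | sort -u
}
```

Usage would be along the lines of failed_lookup_hosts < /var/log/exim4/mainlog. Separately, exim4 -bpc should print the number of messages still sitting in the queue.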
Change 264904 had a related patch set uploaded (by Andrew Bogott):
Only check puppet freshness once per day, not 60 times in a row.
Change 264904 merged by Andrew Bogott:
Only check puppet freshness once per day, not 60 times in a row.
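The change description does not say how the once-per-day limit is enforced. One common way to get that behaviour is a timestamp file, sketched below. This is an illustration only, not the code from change 264904; the function names, the stamp-file path in the usage note and the 86400-second default are all assumptions.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if a check is due.
#   $1 = stamp file holding the epoch time of the last check
#   $2 = current time in epoch seconds
#   $3 = minimum interval in seconds (assumed default: one day)
should_check() {
    stamp_file=$1
    now=$2
    interval=${3:-86400}
    # No readable stamp file means no check has been recorded yet.
    [ -r "$stamp_file" ] || return 0
    last=$(cat "$stamp_file")
    last=${last:-0}
    [ $((now - last)) -ge "$interval" ]
}

# Hypothetical helper: record that a check ran at time $2.
record_check() {
    printf '%s\n' "$2" > "$1"
}
```

A caller would run should_check /path/to/stamp "$(date +%s)" before checking freshness and call record_check afterwards, so that repeated invocations within the same day return early.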