
implement paging for non-ops teams
Closed, Resolved, Public

Description

In Icinga we have custom notification groups that mail only certain people in other teams for certain services (as opposed to ops getting all notifications by default). But we also want to offer notification methods besides email, so that people can get real SMS pages like ops does, but only for their specific services. Currently there is just the one special group "sms", which is a catch-all. We looked into this before and, as far as I recall, it wasn't trivial or we'd already have it. But demand is still there:


12:25 < mutante> there is one special contact group called "sms"
12:25 < mutante> if you are in there you get paged
12:25 < mutante> but then you get all the ops pages currently
12:25 < ostriches> Yeah, that's not what I want. I want something like sms-releng
12:25 < ostriches> (or able to trigger sms for random groups)
12:25 < ostriches> Whichever is possible
..
12:26 < James_F> I thought we had service groups?
12:27 < James_F> I only get (got?) Parsoid pages, not general Ops ones.
..
12:27 < mutante> we have service groups for teams and stuff
12:27 < mutante> with email notification
12:27 < James_F> Ah, but the groups are not for SMS? OK.
12:27 < mutante> but we need to add the SMS notification method 
..
12:31 < greg-g> how hard is it to add sms-groups?
12:31 < greg-g> seems like an obvious thing for services, no?
12:32 < mutante> not easy enough
..
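
To illustrate the request from the chat above, here is a minimal Python sketch of the routing being asked for. It is a toy model, not Icinga configuration, and it says nothing about how the change would actually be implemented. Only the "sms" group and the "sms-releng" name come from the discussion; the service names, contact names, and the `recipients` helper are hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical model: contact group -> (notification method, members).
CONTACT_GROUPS = {
    "sms": ("sms", {"ops-alice", "ops-bob"}),   # existing catch-all paging group
    "releng": ("email", {"releng-carol"}),      # existing per-team email group
    "sms-releng": ("sms", {"releng-carol"}),    # requested per-team paging group
}

# Hypothetical mapping: service -> contact groups notified when it alerts.
SERVICE_GROUPS = {
    "ops-critical-service": ["sms"],
    "releng-service": ["releng", "sms-releng"],
}


def recipients(service):
    """Return {method: set of contacts} for an alert on `service`."""
    result = defaultdict(set)
    for group in SERVICE_GROUPS.get(service, []):
        method, members = CONTACT_GROUPS[group]
        result[method] |= members
    return dict(result)
```

In this model an alert on `releng-service` reaches `releng-carol` by both email and SMS, while she receives nothing for `ops-critical-service`. Adding her to the catch-all "sms" group instead would page her for every ops service, which is the behaviour ostriches wants to avoid.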

Event Timeline

Dzahn updated the task description.
Dzahn added a subscriber: greg.
Gehel triaged this task as Medium priority. Jul 22 2016, 8:49 AM
Gehel added a subscriber: Gehel.

This is complex enough that it requires some time and thought. I am pretty sure other teams besides services would be interested in being paged (I can think of at least Maps and Cirrus). Pushing to get project teams more involved in the production aspect of their project always seems like a good thing to me (I'm not implying that they are not involved at the moment, just that giving more visibility would be even better).

Yeah, just to be clear, when I don't capitalize "services" I mean all the things that look like services, not just the things that the Services team does/owns :) (so, I was implicitly including ES, tilerator, etc.)

Do we want to keep the scope limited to Icinga, or also consider our other paging tools?

Dzahn renamed this task from implement icinga paging for non-ops teams to implement paging for non-ops teams. Tue, Apr 20, 7:50 PM

Removed the word "icinga" from the ticket title. I think this is just about "paging for subgroups" / "paging for teams outside SRE", and it doesn't really matter whether it's Icinga or not.

That being said, it sounds like Icinga will still be around for quite some time, even if individual checks are moving to Alertmanager.

fgiunchedi claimed this task.
fgiunchedi added a subscriber: fgiunchedi.

We have implemented paging for non-ops teams in VO/Splunk On-Call and within Icinga, and Alertmanager has that capability as well. I'm boldly resolving the task, but feel free to reopen!