give services team permissions to send commands in icinga
Closed, Resolved (Public)

Description

Let the services team members have permissions in Icinga beyond the login they already have via their LDAP group. This has to be granted in Icinga's cgi.cfg, so that they can send commands and do things like acknowledge monitoring alerts, add permanent comments with links to tickets, or schedule downtimes.

Event Timeline

Dzahn claimed this task.
Dzahn raised the priority of this task from to Needs Triage.
Dzahn updated the task description.
Dzahn subscribed.

This has already been briefly discussed on IRC, but I'll note it on the task for the record.

Having rights to acknowledge and silence alerts in icinga has a very large scope, in that it could affect every single server and service. We don't currently (that I am aware of) split our icinga permissions into service-level groups, so anyone with the ability to ack one service can do so for all services.

As such, this is on par with a sudo level request, and falls into the requirement of an operations meeting approval. A three day wait without objection will NOT suffice for this request.

(Unless I am mistaken about how icinga's authorization levels work; which is quite possible.)

Daniel pulled up the varying icinga permission groups: P926

Dzahn removed Dzahn as the assignee of this task. Jul 9 2015, 1:59 AM

This was discussed in the operations meeting. The overall consensus seemed to be that full access to all icinga alerting and commands for all services is too large a scope.

This request will have to stall with the sub-task of breaking up our icinga config into services groups, so each dev/engineering/services group can control their own service acknowledgements and maintenance windows within icinga.

fgiunchedi changed the task status from Open to Stalled. Jul 21 2015, 10:29 AM
fgiunchedi subscribed.

@RobH, can we agree on a timeline for this work? We don't want to annoy you or ourselves with unacknowledged icinga alerts.

I was not the blocker on this; the entire ops team was. I think the outcome from the meeting was that you guys should have us ack items before you take them down?

Ops needs to split up icinga eventually, but that's a lower priority than other projects. I'm still not sure why you guys cannot have the clinic duty person acknowledge alerts and set up icinga maintenance windows?

@RobH: The clinic duty person can't be around 24 hours a day, which means that alerts related to work in SF will go unacknowledged when the clinic duty person is in Europe, for example.

I believe other non-ops engineers have access to icinga as well where it is warranted by their work. Could we set up the same for at least one services member?

Are you guys having to do this when there is no op online (and if so, when was this)?

It'll take some work to split icinga up, and it can happen, but no one is working on this project right now. Have you guys needed an ops person to do this, asked, and had no one to help?

Sorry if it seems like I was arguing against services ever getting this; that isn't the case. I'm just asking how big a deal this is, since we'll have to have an opsen refactor our icinga install a bit to make this work.

I've created T107884 as a blocker for this.

NOTE: Mark has approved giving the services team (as individual logins) access to ack/suspend/control icinga monitoring. As we give them the access, we need to make it clear they have FULL icinga access, but otherwise this is granted.

This is changing an access file for icinga, so I'd like to do it (stealing task)

@GWicke: Can you go ahead and list off the users (wikitech users) you have in your team?

@RobH: currently it's primarily eevans, mobrovac & gwicke (myself). There is also Petr (ppchelko), but he isn't doing much deploy work yet.

Dzahn changed the task status from Stalled to Open. Aug 12 2015, 8:43 PM

To achieve this, ./modules/icinga/files/cgi.cfg needs to be edited. LDAP changes should not be needed.
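
For illustration, a minimal sketch of what such a cgi.cfg edit could look like. The directive names are standard Icinga 1.x CGI authorization options. The logins are the ones listed earlier in this task. `existing-user` is a hypothetical placeholder for whatever entries the file already contains, and which directives were actually changed is an assumption, since the real diff is not part of this task.

```
# modules/icinga/files/cgi.cfg (sketch only, not the actual change)
# Values are comma-separated login names as seen by the web authentication layer.
# These directives are global: they apply to every host and service,
# which is why this access was treated as a large-scope grant.
authorized_for_all_services=existing-user,eevans,mobrovac,gwicke
authorized_for_all_hosts=existing-user,eevans,mobrovac,gwicke
authorized_for_all_service_commands=existing-user,eevans,mobrovac,gwicke
authorized_for_all_host_commands=existing-user,eevans,mobrovac,gwicke
```

The two `*_commands` directives are the ones that would permit acknowledging alerts, adding comments, and scheduling downtimes. The other two only control which hosts and services a user can view.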

Pushed live, should work fine.

jcrespo closed subtask Restricted Task as Resolved. Sep 7 2015, 10:45 AM