Make Labs NFS alerts paging
Closed, Resolved (Public)

Description

Labs NFS alerts currently only go to email; they should page.

Event Timeline

yuvipanda raised the priority of this task from to Needs Triage.
yuvipanda updated the task description.
yuvipanda added subscribers: yuvipanda, coren, Andrew.

Need to figure out:

  1. Who all should be paged?
  2. What's the paging condition?
yuvipanda added a project: Labs-Sprint-101.

@mark suggested making the catchpoint alerts page instead, since they have been far more accurate.

So according to @chasemp, we can make it page ourselves in the catchpoint portal! I'm also going to have the NFS test email ops@.

Alright, catchpoint SMS alerts seem to work for me and @coren, but not for @Andrew. Need to make sure it all works.

Need to make it alert only on two consecutive failures instead of one, and verify that @Andrew is getting only NFS emails.

Alright, it's now alerting only on two consecutive failures instead of one, and @Andrew has been moved to a different contact group called LABS-INFRASTRUCTURE which currently only has NFS alerts. Also emails ops@ only for NFS.
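For anyone reading along, the "two consecutive failures" condition can be sketched roughly as below. This is only an illustration of the rule, not how catchpoint implements it; the function name `paging_points` and the `threshold` parameter are made up for the example.

```python
from typing import Iterable, List


def paging_points(results: Iterable[bool], threshold: int = 2) -> List[int]:
    """Return the indices of the check runs at which a page would fire.

    `results` holds one boolean per check run (True means the check passed).
    A page fires once, when the run of consecutive failures first reaches
    `threshold`, and not again until the check has passed at least once.
    Hypothetical helper for illustration only.
    """
    pages = []
    streak = 0  # number of consecutive failures seen so far
    for i, ok in enumerate(results):
        if ok:
            streak = 0  # any success resets the streak
        else:
            streak += 1
            if streak == threshold:
                pages.append(i)
    return pages
```

With this rule a single isolated failure never pages, which is the point of the change: one-off blips stay quiet, while a failure that persists across two runs pages once.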

@coren @Andrew do you think this can be marked as done now?

I'm not sure I understand how the alerts (NFS/others) are split between the team members. Why is Andrew only getting NFS emails (or pages)? What is he not getting?

For catchpoint, everything else is toollabs-specific (gridengine / redis / webservice issues).

These are paging from icinga now.