
Deploy Dispatch for SRE incident workflow automation
Closed, Declined (Public)

Description

Incident response workflow automation has been a source of discomfort for SREs. Our current manual process consists of creating documents and updating several chats and data sources in addition to troubleshooting. This effort aims to reduce friction for SRE responders and incident coordinators during incidents and free up much-needed time to focus on addressing the problems.

We have selected Netflix's Dispatch as our tool of choice for this project. https://github.com/Netflix/dispatch

This is the parent tracking task.

In rough sequence:

Related Objects

Event Timeline

herron triaged this task as Medium priority. Jul 18 2022, 2:21 PM
herron created this task.

The team met today and spent some time planning. We made an initial decision to use WMCS as our initial deployment target; this will allow us to move fast and keep some reasonable separation from production.

We can, after a quarter or two of using the tool, discuss another hosting target if we desire a better OOB solution.

lmata renamed this task from "Deploy Dispatch for SRE incident tracking" to "Deploy Dispatch for SRE incident workflow automation". Aug 9 2022, 2:30 PM
lmata changed the task status from Open to In Progress.
lmata assigned this task to herron.
lmata raised the priority of this task from Medium to Needs Triage.
lmata updated the task description.
lmata triaged this task as High priority. Aug 9 2022, 2:35 PM

This task is our team's top priority for the quarter.

herron removed herron as the assignee of this task. Jan 5 2023, 5:59 PM
herron added a project: User-herron.
herron added a subscriber: BCornwall.

Closing, as Dispatch has been ruled out as an option. See T308467 for follow-up discussion of where we're going.