Document how to act on performance alerts
Closed, ResolvedPublic

Description

We have the alerts today, but we are missing documentation on what to check (which dashboard is the starting point) and a couple of simple steps on what to look at first. I think everyone on the team already knows the workflow, and it would be good to share it with the rest of the world.

Event Timeline

I've started documenting it here: https://wikitech.wikimedia.org/wiki/Performance/Regressions

There is some work left to be done. I hope to have a first draft ready by Monday so you can all read it.

Peter renamed this task from Document how to act on the synthetic testing alerts to Document how to act on performance alerts. Sep 10 2018, 5:56 PM
Peter added subscribers: Gilles, Krinkle.

@Gilles, please read and change/add what you want when you have time: https://wikitech.wikimedia.org/wiki/Performance/Regressions, then hand it over to @Krinkle.

Krinkle triaged this task as Medium priority. Sep 13 2018, 4:32 PM
Krinkle changed Risk Rating from N/A to default.

I went through and made a few edits, and also added samples of some of the alerts that would lead someone to check the page. I wasn't sure whether there are ways a team would expect to get alerts other than via Icinga, though...

LGTM! Made a couple of edits.

Krinkle reassigned this task from aaron to Peter.
Krinkle added a subscriber: aaron.