Create SLI for Blazegraph uptime
Open, High, Public

Description

As a WDQS maintainer, I want to know what Blazegraph uptime is, so I can better prioritize maintenance work and/or scaling work that needs to happen.

AC:

  • dashboard to track Blazegraph uptime is available

Event Timeline

Update lag is our KPI at the moment. We also need better metrics around whether the service is usable. We need to align expectations between other SRE teams and ourselves, because we tend to have lower priority around outages.

MPhamWMF triaged this task as High priority. May 2 2022, 3:54 PM
MPhamWMF moved this task from Incoming to Operations/SRE on the Wikidata-Query-Service board.
Gehel removed bking as the assignee of this task. Mon, May 30, 3:26 PM
Gehel added a subscriber: bking.