
Define SLIs and SLOs for function-* services
Closed, Resolved (Public)

Description

Per the Wikimedia Services Policy, the function-{orchestrator,evaluator} services need to have SLIs and SLOs defined. These should be documented on Wikitech.

Once they are defined, we need to make sure the services expose the right set of metrics for measuring these SLIs; see T307700.
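As a rough illustration of what "the right set of metrics" could mean for an availability-style SLI, here is a minimal sketch. It assumes each service exposes two counters over the SLO window: total requests and requests answered successfully. The function names, the 99.9% target, and the request counts are hypothetical and are not the values agreed for these services.

```python
def availability_sli(good_requests: int, total_requests: int) -> float:
    """Fraction of requests served successfully in the window."""
    if total_requests == 0:
        # No traffic: treat the window as fully available (an assumption).
        return 1.0
    return good_requests / total_requests


def error_budget_remaining(good_requests: int, total_requests: int,
                           slo_target: float) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests == 0:
        return 1.0
    allowed_failures = (1.0 - slo_target) * total_requests
    actual_failures = total_requests - good_requests
    return 1.0 - actual_failures / allowed_failures


if __name__ == "__main__":
    # Hypothetical window: 1,000,000 requests, 400 failures, 99.9% target.
    good, total, target = 999_600, 1_000_000, 0.999
    print(f"SLI: {availability_sli(good, total):.4%}")
    print(f"Budget left: {error_budget_remaining(good, total, target):.1%}")
```

A latency SLI could be expressed the same way by counting requests that complete under a threshold as "good", which would need a latency histogram rather than a plain success counter.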

The Google SRE book has a useful chapter on this topic.

Event Timeline

The bar for reliability and efficiency should be set by the primary use case for Wikifunctions, which is (as I understand it) to have functions that generate content fragments in Wikipedia articles.

To get a sense of what the expectations are, we can look to the design of the MediaWiki job queue and the support for Lua template modules.

@maryyang, has your observability doc been moved over to Wikitech yet? If not, could you migrate it? @Sannita or @Quiddity can assist.