
Add health monitoring as required for deployment: use service-runner
Closed, ResolvedPublic

Description

In order to deploy, we need some health monitoring. The recognised way to do this seems to be service-runner, which will give us process monitoring as well as process restarts.


Left for history
Notes
This ticket is about allowing someone to monitor whether the service is in a good state, but so far all questions are open (What is good? What is bad?)

One option could be terminus, see https://github.com/godaddy/terminus#user-content-with-express

It seems we can configure in the helm chart where the orchestrator looks for liveness and readiness. See here for an example: https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/392619/4/_scaffold/values.yaml
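To make the liveness/readiness distinction concrete, here is a rough sketch of what two such checks could decide. Everything in it is an assumption for illustration: the `HealthState` fields, the function names, and the choice of 200/503 status codes are hypothetical and are not taken from termbox, terminus, service-runner, or the helm chart.

```typescript
// Hypothetical health state; these field names are assumptions, not termbox code.
interface HealthState {
  startedUp: boolean;        // has the server finished initialising?
  backendReachable: boolean; // can it reach the upstream it depends on?
}

interface ProbeResult {
  status: number;
  body: { ok: boolean; reason?: string };
}

// Liveness: the process is running and able to answer at all.
// If this stops responding, an orchestrator would typically restart the container.
function liveness(): ProbeResult {
  return { status: 200, body: { ok: true } };
}

// Readiness: the process is up AND able to do useful work.
// A failing readiness check usually means "do not route traffic here yet".
function readiness(state: HealthState): ProbeResult {
  if (!state.startedUp) {
    return { status: 503, body: { ok: false, reason: 'starting' } };
  }
  if (!state.backendReachable) {
    return { status: 503, body: { ok: false, reason: 'backend unreachable' } };
  }
  return { status: 200, body: { ok: true } };
}
```

What counts as "ready" is exactly the open question of this ticket, so the two conditions above are placeholders only.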

The format that is looked for seems to be defined by: https://www.mediawiki.org/wiki/Service-checker

We would probably get the required monitoring for free with the "service-runner": https://github.com/wikimedia/service-runner

Event Timeline

Pablo-WMDE triaged this task as Normal priority.
Restricted Application added a subscriber: Aklapper. Feb 12 2019, 3:55 PM
Lea_WMDE updated the task description. Feb 25 2019, 11:07 AM
Lea_WMDE changed the task status from Open to Stalled.
Lea_WMDE added a subscriber: Lea_WMDE.

This is stalled until we have more info about the operations side of the SSR service in production.

Tarrow updated the task description. Feb 27 2019, 11:39 AM
Tarrow renamed this task from "Add terminus to allow healthcheck by container orchestrator" to "Add health monitoring as required for deployment". Feb 27 2019, 4:04 PM
Tarrow updated the task description.
Tarrow claimed this task. Mar 1 2019, 2:06 PM
Tarrow moved this task from Backlog to Reading on the Wikidata-Termbox-Hike board.
Tarrow moved this task from To Do to Doing on the Wikidata-Termbox-Iteration-10 board.
Tarrow changed the task status from Stalled to Open. Mar 1 2019, 4:01 PM
Tarrow renamed this task from "Add health monitoring as required for deployment" to "Add health monitoring as required for deployment: use service-runner". Mar 5 2019, 9:30 AM
Tarrow updated the task description.

Change 494480 had a related patch set uploaded (by Tarrow; owner: Tarrow):
[wikibase/termbox@master] Add and Configure service-runner

https://gerrit.wikimedia.org/r/494480

Change 494480 merged by jenkins-bot:
[wikibase/termbox@master] Add and Configure service-runner

https://gerrit.wikimedia.org/r/494480

Tarrow closed this task as Resolved. Mar 18 2019, 10:22 AM