
Add health monitoring as required for deployment: use service-runner
Closed, Resolved (Public)

Description

In order to deploy we need some health monitoring. It seems the recognised way to do this is by using service-runner, which will give us process monitoring as well as process restarts.


Left for history
Notes
This ticket is about allowing someone to monitor whether the service is in a good state, but so far all questions are open (what is good? what is bad?).

One option could be terminus; see https://github.com/godaddy/terminus#user-content-with-express

It seems we can configure in the helm chart where the orchestrator looks for liveness and readiness. See here for an example: https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/392619/4/_scaffold/values.yaml
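
To make the open "what is good? what is bad?" question a bit more concrete, here is a minimal TypeScript sketch of how liveness and readiness answers could be kept apart. It is only an illustration under assumptions: the names `ProbeResult`, `Check`, `liveness` and `readiness` are hypothetical, they are not part of termbox, terminus or service-runner, and the choice of 200/503 status codes is an assumption rather than something the helm chart or Service-checker is known to require.

```typescript
// Hypothetical sketch: none of these names come from termbox, terminus or service-runner.

// What a probe endpoint would answer with (status codes are an assumption).
type ProbeResult = { status: number; body: { ok: boolean; failing: string[] } };

// A named check for something the service depends on.
type Check = { name: string; healthy: () => boolean };

// Liveness: the process is running and able to answer at all.
function liveness(): ProbeResult {
  return { status: 200, body: { ok: true, failing: [] } };
}

// Readiness: every dependency check passes, so traffic may be routed here.
function readiness(checks: Check[]): ProbeResult {
  const failing = checks.filter((c) => !c.healthy()).map((c) => c.name);
  const ok = failing.length === 0;
  return { status: ok ? 200 : 503, body: { ok, failing } };
}
```

An HTTP route for each probe would then only need to serialise the returned object; which paths the orchestrator polls would be set in the chart values.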

The format that is looked for seems to be defined by: https://www.mediawiki.org/wiki/Service-checker

We would probably get the required monitoring for free with service-runner: https://github.com/wikimedia/service-runner

Event Timeline

Pablo-WMDE created this task.
Lea_WMDE changed the task status from Open to Stalled. Feb 25 2019, 11:07 AM
Lea_WMDE updated the task description.
Lea_WMDE subscribed.

This is stalled until we have more info about the operations side of the SSR service in production.

Tarrow renamed this task from "Add terminus to allow healthcheck by container orchestrator" to "Add health monitoring as required for deployment". Feb 27 2019, 4:04 PM
Tarrow updated the task description.
Tarrow moved this task from Backlog to Reading on the Wikidata-Termbox board.
Tarrow edited projects, added Wikidata-Termbox-Iteration-10; removed Wikidata-Termbox.
Tarrow moved this task from To Do to Doing on the Wikidata-Termbox-Iteration-10 board.
Tarrow changed the task status from Stalled to Open. Mar 1 2019, 4:01 PM
Tarrow renamed this task from "Add health monitoring as required for deployment" to "Add health monitoring as required for deployment: use service-runner". Mar 5 2019, 9:30 AM
Tarrow updated the task description.

Change 494480 had a related patch set uploaded (by Tarrow; owner: Tarrow):
[wikibase/termbox@master] Add and Configure service-runner

https://gerrit.wikimedia.org/r/494480

Change 494480 merged by jenkins-bot:
[wikibase/termbox@master] Add and Configure service-runner

https://gerrit.wikimedia.org/r/494480