toolforge jobs-framework-emailer: increase reliability
Open, Needs Triage, Public

Description

The daemon can die without Kubernetes noticing, see T317998.

This is an indication that we may need to introduce liveness probes:
https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
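As a rough illustration of what a liveness check could look like for a daemon like this one, here is a minimal sketch of a heartbeat-file approach in Python. It is not taken from the actual patches: the file path, the 60-second threshold, and the function names `beat`, `is_alive`, and `probe_exit_code` are all hypothetical. The idea is that the daemon's main loop touches a file on every healthy iteration, and a Kubernetes `exec` liveness probe runs a small check that fails when the file is missing or stale.

```python
import tempfile
import time
from pathlib import Path
from typing import Optional

# Hypothetical values: neither the path nor the threshold comes from the task.
HEARTBEAT_FILE = Path(tempfile.gettempdir()) / "emailer-heartbeat"
MAX_AGE_SECONDS = 60.0


def beat(path: Path = HEARTBEAT_FILE) -> None:
    """Record a heartbeat; meant to be called from the daemon's main loop."""
    # touch() creates the file if needed and updates its mtime otherwise.
    path.touch()


def is_alive(
    path: Path = HEARTBEAT_FILE,
    max_age: float = MAX_AGE_SECONDS,
    now: Optional[float] = None,
) -> bool:
    """Return True if the heartbeat file exists and is at most max_age seconds old."""
    try:
        mtime = path.stat().st_mtime
    except FileNotFoundError:
        # No heartbeat was ever written, so treat the daemon as not alive.
        return False
    if now is None:
        now = time.time()
    return (now - mtime) <= max_age


def probe_exit_code(path: Path = HEARTBEAT_FILE, max_age: float = MAX_AGE_SECONDS) -> int:
    """Map the check to a process exit status: 0 is healthy, 1 is unhealthy."""
    return 0 if is_alive(path, max_age) else 1
```

An `exec` probe would run a small script that exits with `probe_exit_code()`; Kubernetes treats a non-zero exit status as a failed probe and restarts the container once the failure threshold is reached. A daemon stuck in a deadlock stops calling `beat()`, so the file goes stale and the probe starts failing even though the process is still running.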

Event Timeline

Change 842488 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[cloud/toolforge/jobs-framework-emailer@main] emailer: cfg: avoid deadlocks when reading configmap

https://gerrit.wikimedia.org/r/842488

Change 842502 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[cloud/toolforge/jobs-framework-emailer@main] emailer: introduce k8s liveness probe support

https://gerrit.wikimedia.org/r/842502

Change 842488 merged by jenkins-bot:

[cloud/toolforge/jobs-framework-emailer@main] emailer: cfg: avoid deadlocks when reading configmap

https://gerrit.wikimedia.org/r/842488

Change 842502 merged by Arturo Borrero Gonzalez:

[cloud/toolforge/jobs-framework-emailer@main] emailer: introduce k8s liveness probe support

https://gerrit.wikimedia.org/r/842502

Mentioned in SAL (#wikimedia-cloud-feed) [2022-10-18T10:18:41Z] <wm-bot2> build & push docker image docker-registry.tools.wmflabs.org/toolforge-jobs-framework-emailer:latest from https://gerrit.wikimedia.org/r/cloud/toolforge/jobs-framework-emailer (64385e9) (T320405) - cookbook ran by arturo@nostromo

Change 854536 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[cloud/toolforge/jobs-framework-emailer@main] emailer: introduce decorator to factorice endless tasks management

https://gerrit.wikimedia.org/r/854536