
`webservice restart` should do a graceful restart on Kubernetes
Closed, Resolved, Public

Description

Currently, `webservice restart` is implemented on Kubernetes by killing the old pod and then waiting for the new one to be brought back up. It would be nicer to ask Kubernetes to gracefully restart the deployment by bringing up a new pod and only killing the old pod once the new one is ready (the equivalent of `kubectl rollout restart deployment`). This ensures the tool stays up if the new pod can’t start up for some reason, and when combined with k8s probes can even enable no-downtime restarts of tools.
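For illustration, here is a minimal sketch of the mechanism behind `kubectl rollout restart`: it stamps a `kubectl.kubernetes.io/restartedAt` annotation onto the deployment's pod template, and because the template changed, Kubernetes performs a rolling update. The function name `build_rollout_restart_patch` is hypothetical, and this only builds the patch body; it is not necessarily how the tools-webservice patch implements the restart.

```python
import json
from datetime import datetime, timezone


def build_rollout_restart_patch(now=None):
    """Build a patch body that changes the pod template annotations.

    Any change to spec.template makes the Deployment controller roll out
    new pods; with the default RollingUpdate strategy the old pod is
    only removed once the new one is ready.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return {
        "spec": {
            "template": {
                "metadata": {
                    "annotations": {
                        # Same annotation key that kubectl itself sets.
                        "kubectl.kubernetes.io/restartedAt": now.isoformat()
                    }
                }
            }
        }
    }


if __name__ == "__main__":
    # Print the patch as JSON, e.g. to pass to `kubectl patch deployment`.
    print(json.dumps(build_rollout_restart_patch(), indent=2))
```

The resulting body would be sent as a patch to the deployment object, for example through a Kubernetes API client; that step is left out here because it depends on the client in use.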

Event Timeline

Change 921620 had a related patch set uploaded (by Lucas Werkmeister; author: Lucas Werkmeister):

[operations/software/tools-webservice@master] Restart Kubernetes webservices more cleanly

https://gerrit.wikimedia.org/r/921620

Here’s what a graceful restart looks like in a `kubectl get pods` loop:

image.png (698×687 px, 130 KB)

Change 921620 merged by jenkins-bot:

[operations/software/tools-webservice@master] Restart Kubernetes webservices more cleanly

https://gerrit.wikimedia.org/r/921620

LucasWerkmeister claimed this task.

Apparently this is now deployed and works as expected \o/

> and when combined with k8s probes can even enable no-downtime restarts of tools.

Note: this turns out to be not completely true at the moment, because you can’t currently combine `webservice restart` with probes: `webservice restart` recreates the whole deployment, discarding any probes that users may have patched into it. T341919 requests built-in support for probes in `webservice`; until then, tools that want no-downtime restarts still have to do `kubectl rollout restart deployment`. (But I’m still happy this change was merged – it’s still an improvement for tools without probes, as well as a building block for no-downtime restarts to be completed with T341919.)
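To make "patching a probe into the deployment" concrete, here is a rough sketch of what such a patch body might look like. The function name `build_readiness_probe_patch`, the container name `webservice`, port `8000` and path `/healthz` are all assumptions for illustration, not values taken from this task; check the actual deployment before using anything like it.

```python
import json


def build_readiness_probe_patch(container_name, port, path="/"):
    """Build a strategic merge patch adding an HTTP readiness probe.

    In a strategic merge patch, entries in the containers list are
    matched by name, so only the named container is modified.
    """
    return {
        "spec": {
            "template": {
                "spec": {
                    "containers": [
                        {
                            "name": container_name,
                            "readinessProbe": {
                                # Pod counts as ready once this request succeeds.
                                "httpGet": {"path": path, "port": port},
                                "periodSeconds": 5,
                            },
                        }
                    ]
                }
            }
        }
    }


if __name__ == "__main__":
    # Hypothetical values; adjust to the real container name and port.
    patch = build_readiness_probe_patch("webservice", 8000, "/healthz")
    print(json.dumps(patch, indent=2))
```

With a readiness probe in place, a rolling restart keeps the old pod serving until the new pod actually answers requests, which is what makes a no-downtime restart possible. As described above, a probe added this way is lost again the next time `webservice restart` recreates the deployment.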