
Ensure the code is deployed to mediawiki on k8s when it is deployed to production
Closed, Resolved (Public)

Description

Currently, updating the image version for mediawiki on kubernetes follows the standard gitops procedure we use for every other service: we need to post and merge a patch to deployment-charts. This is inconvenient, given that we already have a deployment procedure for mediawiki outside of kubernetes.

While the topic of deployments on kubernetes should merit a separate task later on, for now our goal is:

  • Keep deploying to the mwdebug cluster when code is deployed in production
  • Minimal disruption for deployers

My preferred interim solution is as follows:

  • Once gate-and-submit happens, an image rebuild is triggered and new versions of the images are published.
  • We can somehow gather the tag of the image corresponding to the sha1 of the last commit in /srv/mediawiki-staging on the deployment server.
  • Every time we do a release, we regenerate a helmfile values file with the correct image versions (either via a scap "extension" or a git receive hook?).
  • We deploy to the mwdebug cluster in eqiad and codfw automatically, using helmfile, via a script that can be invoked either by scap or by a systemd timer.

We can get into the details of every step above, but in general I'd like to implement most of this in scap, so that sync and sync-file take care of the deployment to kubernetes automatically. The deployment script, though, should probably be separate anyway, and for now it can even be a couple of lines of bash.
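To make the interim flow in the description a bit more concrete, here is a minimal sketch of its last two steps (regenerate a values file, then deploy per datacenter). It is only an illustration, not the script that was later uploaded. The values key `mediawiki_image_tag`, the function names, and the assumption that helmfile environments are named after the datacenters are all hypothetical; only `git rev-parse HEAD` and `helmfile -e <environment> apply` are taken as existing commands.

```python
from __future__ import annotations

import json
import subprocess
from pathlib import Path

STAGING_DIR = "/srv/mediawiki-staging"  # path named in the task description
DATACENTERS = ("eqiad", "codfw")        # datacenters named in the task description


def staging_sha1(staging_dir: str = STAGING_DIR) -> str:
    """Return the sha1 of the last commit in the staging checkout."""
    result = subprocess.run(
        ["git", "-C", staging_dir, "rev-parse", "HEAD"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def render_values(image_tag: str) -> str:
    """Render a values file that pins the image version.

    The key name is hypothetical. JSON is emitted because YAML parsers
    accept it, which avoids a third-party YAML dependency in this sketch.
    """
    values = {"mediawiki_image_tag": image_tag}  # hypothetical key name
    return json.dumps(values, indent=2) + "\n"


def helmfile_commands(datacenters=DATACENTERS) -> list[list[str]]:
    """Build one `helmfile apply` invocation per datacenter.

    Assumes (hypothetically) one helmfile environment per datacenter.
    """
    return [["helmfile", "-e", dc, "apply"] for dc in datacenters]


def deploy(image_tag: str, values_path: Path, helmfile_dir: Path) -> None:
    """Write the values file, then apply it in every datacenter."""
    values_path.write_text(render_values(image_tag))
    for cmd in helmfile_commands():
        # check=True stops at the first datacenter that fails to deploy.
        subprocess.run(cmd, cwd=helmfile_dir, check=True)
```

Keeping `render_values` and `helmfile_commands` free of side effects means the same logic could be called from a scap hook or from a timer-driven script without changes, which matches the intent of keeping the deployment script separate.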

Event Timeline

As far as I know, we already generate an image for every +2 in mediawiki-config, so I'll assume that part is already done.

Change 708771 had a related patch set uploaded (by Giuseppe Lavagetto; author: Giuseppe Lavagetto):

[operations/puppet@production] profile::kubernetes::deployment_server: add automation for mw on k8s

https://gerrit.wikimedia.org/r/708771

I uploaded a very simple script that can be run from a systemd timer, or invoked by hand, on the deployment server. Right now it needs to run as root, but only to gather the correct credentials; we could make it callable by any deployer with

sudo -u mwdeploy deploy-mwdebug

if needed, although we might run into permission problems. Before doing that, I'd like to have a better way to fetch the correct image version than "get whatever the latest image is".
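As a rough illustration of what "better than latest" could mean, the sketch below picks the image tag that corresponds to a specific commit and refuses to guess otherwise. It rests on a loud assumption: that image tags embed a short prefix of the commit sha1. Nothing in this task confirms that tag format, and the function name and `prefix_len` parameter are hypothetical.

```python
from typing import Iterable, Optional


def tag_for_commit(tags: Iterable[str], sha1: str, prefix_len: int = 8) -> Optional[str]:
    """Return the single tag that embeds the given commit, or None.

    Assumes (hypothetically) that the image build embeds the first
    `prefix_len` characters of the commit sha1 somewhere in the tag.
    """
    short = sha1[:prefix_len]
    matches = [tag for tag in tags if short in tag]
    # Refuse to guess: zero matches means the image is not built yet,
    # several matches means the tag scheme is ambiguous.
    return matches[0] if len(matches) == 1 else None
```

Returning `None` instead of falling back to the newest tag means a caller can skip the deployment for that run and retry later, rather than deploying an image that does not match what is in /srv/mediawiki-staging.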

Change 708771 merged by Giuseppe Lavagetto:

[operations/puppet@production] profile::kubernetes::deployment_server: add automation for mw on k8s

https://gerrit.wikimedia.org/r/708771

Joe triaged this task as High priority. Jul 30 2021, 3:24 PM

Change 709069 had a related patch set uploaded (by Giuseppe Lavagetto; author: Giuseppe Lavagetto):

[operations/deployment-charts@master] mwdebug: also source /etc/helmfile-defaults/mediawiki/releases.yaml

https://gerrit.wikimedia.org/r/709069

Change 709069 merged by Giuseppe Lavagetto:

[operations/deployment-charts@master] mwdebug: also source /etc/helmfile-defaults/mediawiki/releases.yaml

https://gerrit.wikimedia.org/r/709069

Change 709378 had a related patch set uploaded (by Giuseppe Lavagetto; author: Giuseppe Lavagetto):

[operations/puppet@production] deploy-mwdebug: run every 5 minutes

https://gerrit.wikimedia.org/r/709378

Change 709378 merged by Giuseppe Lavagetto:

[operations/puppet@production] deploy-mwdebug: run every 5 minutes

https://gerrit.wikimedia.org/r/709378

The code should now be deployed when merged/built into an image within 5 minutes. I think this is an acceptable stopgap for re-opening mwdebug on k8s to all sites. We do need a better solution, for which I'll need help from my Release-Engineering-Team buddies though :)

thcipriani added a subscriber: thcipriani.

> The code should now be deployed when merged/built into an image within 5 minutes. I think this is an acceptable stopgap for re-opening mwdebug on k8s to all sites. We do need a better solution, for which I'll need help from my Release-Engineering-Team buddies though :)

Let's use a different task for that: T279322: Design m8s deployment workflows and tooling