Build MediaWiki images for kubernetes on the deployment servers
Closed, Resolved · Public · 3 Estimated Story Points

Description

Our deployment servers are where the code for MediaWiki is "prepared", including:

  • mediawiki releases
  • security patches
  • private settings
  • localization cache

using scap. We would like to allow building our images outside of CI and as an integral part of preparing the code for deployment.

To this end, I think the easiest way to go about this is:

  • Install docker on the deployment servers (check for space/filesystem constraints)
  • Install the docker-pusher wrapper to allow pushing images via sudo from a mwbuilder user
  • Install a copy of mediawiki/tools/release under /srv/mwbuilder/release
  • Create a sudo rule allowing people in the deployment-ci-admins group to run /usr/local/bin/update-mediawiki-tools-release as mwbuilder (to allow updating the tools/release source code) - this is the same group as contint-admins.
  • Create a sudo rule allowing people in the deployment group to run /usr/bin/make -C /srv/mwbuilder/release/make-container-image -f Makefile *
  • Make scap (eventually, once it is how we fetch code in production) and/or a git hook trigger the rebuild, sudo'ing to the correct user
  • Possibly write the file with the image versions to consume for deployments.
  • Reduce verbosity of image build process. It's overwhelming.
  • Measure impact of image build time (first train of the wiki and followup commits).
  • Figure out interaction with sync-dir and sync-file (possibly deprecating these commands).
  • Set build_mw_container_image to True in /etc/scap.cfg
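As a rough illustration of what the sudo rule for the deployment group would permit, here is a minimal sketch of a wrapper around the make invocation. The function name `build_mw_images` and the `DRY_RUN` variable are hypothetical and not part of scap or tools/release; the user and paths come from the list above, and the make target in the usage line is the one used in the command quoted later in this task.

```shell
#!/bin/sh
# Sketch only. build_mw_images and DRY_RUN are hypothetical names.
# With DRY_RUN=1 (the default here) the command is printed, not executed.
build_mw_images() {
    # Prepend the fixed part of the command to whatever targets/variables
    # the caller passed in.
    set -- sudo -u mwbuilder /usr/bin/make \
        -C /srv/mwbuilder/release/make-container-image -f Makefile "$@"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

Usage, assuming the sudo rule is in place: `DRY_RUN=0 build_mw_images build-and-push-all-images`.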

Event Timeline

Restricted Application added a subscriber: Aklapper. Dec 14 2021, 7:44 AM
Joe triaged this task as Medium priority.

Change 749508 had a related patch set uploaded (by Giuseppe Lavagetto; author: Giuseppe Lavagetto):

[operations/puppet@production] deployment_server: add docker engine

https://gerrit.wikimedia.org/r/749508

Change 749552 had a related patch set uploaded (by Giuseppe Lavagetto; author: Giuseppe Lavagetto):

[operations/puppet@production] profile:k8s::deployment_server::mediawiki: split in subprofiles

https://gerrit.wikimedia.org/r/749552

Change 749553 had a related patch set uploaded (by Giuseppe Lavagetto; author: Giuseppe Lavagetto):

[operations/puppet@production] kubernetes::deployment_server::mediawiki: add builder user/role

https://gerrit.wikimedia.org/r/749553

Change 749508 merged by Giuseppe Lavagetto:

[operations/puppet@production] deployment_server: add docker engine

https://gerrit.wikimedia.org/r/749508

Change 749552 merged by Giuseppe Lavagetto:

[operations/puppet@production] profile:k8s::deployment_server::mediawiki: split in subprofiles

https://gerrit.wikimedia.org/r/749552

Change 749553 merged by Giuseppe Lavagetto:

[operations/puppet@production] kubernetes::deployment_server::mediawiki: add builder user/role

https://gerrit.wikimedia.org/r/749553

Change 751166 had a related patch set uploaded (by Giuseppe Lavagetto; author: Giuseppe Lavagetto):

[operations/puppet@production] deployment_server: fix permissions for mwbuilder/other

https://gerrit.wikimedia.org/r/751166

Change 751166 merged by Giuseppe Lavagetto:

[operations/puppet@production] deployment_server: fix permissions for mwbuilder/other

https://gerrit.wikimedia.org/r/751166

Joe removed Joe as the assignee of this task. Jan 4 2022, 4:28 PM
Joe updated the task description.

The deployment servers should now be able to build the images for MediaWiki. I'll de-assign the task from myself for now; please ping me if I'm needed for the next steps!

Change 762007 had a related patch set uploaded (by Ahmon Dancy; author: Ahmon Dancy):

[mediawiki/tools/scap@master] build container image during scap sync-*

https://gerrit.wikimedia.org/r/762007

Change 762007 abandoned by Ahmon Dancy:

[mediawiki/tools/scap@master] Optionally build mw container image during scap sync-*

Reason:

look at https://gerrit.wikimedia.org/r/c/mediawiki/tools/scap/+/762546 instead

https://gerrit.wikimedia.org/r/762007

Change 762546 had a related patch set uploaded (by Ahmon Dancy; author: Ahmon Dancy):

[mediawiki/tools/scap@master] Optionally build mw container image during scap sync-*

https://gerrit.wikimedia.org/r/762546

Change 762546 merged by jenkins-bot:

[mediawiki/tools/scap@master] Optionally build mw container image during scap sync-*

https://gerrit.wikimedia.org/r/762546

Change 763808 had a related patch set uploaded (by Ahmon Dancy; author: Ahmon Dancy):

[mediawiki/tools/release@master] Mods to webserver image build for scap

https://gerrit.wikimedia.org/r/763808

Change 763808 merged by jenkins-bot:

[mediawiki/tools/release@master] Mods to webserver image build for scap

https://gerrit.wikimedia.org/r/763808

Change 763846 had a related patch set uploaded (by Ahmon Dancy; author: Ahmon Dancy):

[mediawiki/tools/train-dev@master] Kill off build-mw-image-loop.py

https://gerrit.wikimedia.org/r/763846

Change 763609 had a related patch set uploaded (by Ahmon Dancy; author: Ahmon Dancy):

[mediawiki/tools/scap@master] Add release_repo_build_and_push_images_cmd config parameter

https://gerrit.wikimedia.org/r/763609

@Joe I tried running this today but it failed:

dancy@deploy1002$ sudo -u mwbuilder /usr/bin/make -C /srv/mwbuilder/release/make-container-image -f Makefile build-and-push-all-images \
GIT_BASE=https://gerrit.wikimedia.org/r/ BRANCH=master workdir_volume=/srv/mediawiki-staging mv_image_name=docker-registry.discovery.wmnet/restricted/mediawiki-multiversion webserver_image_name=docker-registry.discovery.wmnet/restricted/mediawiki-webserver in /srv/mwbuilder/release/make-container-image
make: Entering directory '/srv/mwbuilder/release/make-container-image'
/usr/bin/make -C multiversion-base
make[1]: Entering directory '/srv/mwbuilder/release/make-container-image/multiversion-base'
docker build \
        --pull \
        --build-arg "http_proxy=" \
        --build-arg "https_proxy=" \
        --build-arg "MV_BASE_PACKAGES=" \
        --build-arg "MV_EXTRA_CA_CERT=" \
        -t multiversion-base .
ERRO[0000] failed to dial gRPC: cannot connect to the Docker daemon. Is 'docker daemon' running on this host?: dial unix /var/run/docker.sock: connect: permission denied
context canceled

It looks like the mwbuilder user is not in the docker group.

dancy@deploy1002$ id mwbuilder
uid=493(mwbuilder) gid=495(mwbuilder) groups=495(mwbuilder)
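
The same check can be scripted so that it fails early, before `docker build` hits the socket permission error. This is only a sketch: the helper names `in_group_list` and `user_in_group` are made up for illustration, and it relies on nothing beyond the standard `id -nG` command.

```shell
#!/bin/sh
# in_group_list GROUP "space separated group names":
# succeeds if GROUP appears in the list.
in_group_list() {
    for g in $2; do
        [ "$g" = "$1" ] && return 0
    done
    return 1
}

# user_in_group USER GROUP: looks up USER's groups with `id -nG`
# and checks whether GROUP is among them.
user_in_group() {
    in_group_list "$2" "$(id -nG "$1" 2>/dev/null)"
}
```

For example: `user_in_group mwbuilder docker || echo "mwbuilder is not in the docker group"`.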

Change 763609 merged by jenkins-bot:

[mediawiki/tools/scap@master] Add release_repo_build_and_push_images_cmd config parameter

https://gerrit.wikimedia.org/r/763609

Change 763846 merged by jenkins-bot:

[mediawiki/tools/train-dev@master] Kill off build-mw-image-loop.py

https://gerrit.wikimedia.org/r/763846

dancy changed the task status from Open to In Progress. Feb 23 2022, 10:04 PM
dancy raised the priority of this task from Medium to High.

Yes, I made the mistake of thinking we just needed docker-pusher, which is not all we need. Let me amend that.

Change 765494 had a related patch set uploaded (by Giuseppe Lavagetto; author: Giuseppe Lavagetto):

[operations/puppet@production] deployment_server: re-add mwbuilder to the docker group

https://gerrit.wikimedia.org/r/765494

Change 765494 merged by Giuseppe Lavagetto:

[operations/puppet@production] deployment_server: re-add mwbuilder to the docker group

https://gerrit.wikimedia.org/r/765494

aaand should be fixed now.

Confirmed. Thanks @Joe!

Change 765624 had a related patch set uploaded (by Ahmon Dancy; author: Ahmon Dancy):

[operations/puppet@production] scap.cfg.erb: Add container image build settings

https://gerrit.wikimedia.org/r/765624

Joe changed the status of subtask T302464: Deploy Scap version 4.4.1 from Open to In Progress. Mar 1 2022, 6:46 AM

Change 765624 merged by RLazarus:

[operations/puppet@production] scap.cfg.erb: Add container image build settings

https://gerrit.wikimedia.org/r/765624

Change 769097 had a related patch set uploaded (by Ahmon Dancy; author: Ahmon Dancy):

[mediawiki/tools/scap@master] Redirect image build output to a file

https://gerrit.wikimedia.org/r/769097

Change 769097 merged by jenkins-bot:

[mediawiki/tools/scap@master] Redirect image build output to a file

https://gerrit.wikimedia.org/r/769097

Today I disabled the build-mw-container-image and build-webserver-image jobs on https://releases-jenkins.wikimedia.org/.

thcipriani subscribed.

I believe this is happening now. Just noticed that this task was lingering.