
Convert zuul to use scap
Closed, ResolvedPublic

Description

Zuul is deployed on contint1001 using a Debian package intended for Jessie. The whole process is unnecessarily complicated and predates many of the improvements we have made over the last few years in deploying Python applications.

We need to deploy it using Scap and using wheels to satisfy the requirements.

A potentially helpful overview: https://docs.google.com/drawings/d/1WTnZGO_2WS8OkWUVAVwpd9eg3nXKWVQvDPoUyoPTC1s/edit

Legacy deployment

The repository is https://gerrit.wikimedia.org/g/integration/zuul. Our code is a fork of upstream (around version 2.5) that carries hotfixes and backports as well as Debian packaging information.

Since some of the required Python modules are either missing from Jessie or too new, the Debian package fetches dependencies from PyPI using dh_virtualenv. Hence building the package requires network access, and Zuul then runs from a virtualenv. Some dependencies are fulfilled through the regular debian/control Depends: field and are thus installed by apt when installing the package.

The branches that matter:

  • patch-queue/debian/jessie-wikimedia: fork of upstream plus code hotfixes; HEAD of the repository
  • debian/jessie-wikimedia: has the ./debian directory for packaging

The process is roughly:

  • Patches are piled up in patch-queue/debian/jessie-wikimedia.
  • Using the git-buildpackage patch manager, the patches are exported as a Debian patch series into the branch debian/jessie-wikimedia. That is done using gbp pq export.
  • The package is built using DIST=jessie-wikimedia gbp buildpackage, which requires the pbuilder hooks installed from operations/puppet.git modules/package_builder and a cowbuilder environment to be set up on the build host.
  • pbuilder installs the build dependencies, then invokes dh_virtualenv, which reuses system packages and fetches the modules listed in requirements.txt.
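
The steps above can be sketched as a small Python driver. This is only an illustration, not the tooling actually used: the names `legacy_build_plan` and `run_plan` are hypothetical, and the sketch assumes `git`, `gbp` and a configured cowbuilder environment are present on the build host. It defaults to a dry run that only prints the commands.

```python
import os
import subprocess

PATCH_QUEUE_BRANCH = "patch-queue/debian/jessie-wikimedia"


def legacy_build_plan(dist="jessie-wikimedia"):
    """Return the legacy build steps as (extra_env, argv) pairs."""
    return [
        # Start from the branch holding upstream plus the local patches.
        ({}, ["git", "checkout", PATCH_QUEUE_BRANCH]),
        # Export the patch queue as a Debian patch series under
        # debian/patches on the packaging branch (debian/jessie-wikimedia).
        ({}, ["gbp", "pq", "export"]),
        # Build the package; DIST is passed through the environment.
        ({"DIST": dist}, ["gbp", "buildpackage"]),
    ]


def run_plan(plan, dry_run=True):
    """Print each step and, unless dry_run is set, execute it."""
    for extra_env, argv in plan:
        prefix = " ".join(f"{key}={value}" for key, value in extra_env.items())
        print((prefix + " " + " ".join(argv)).strip())
        if not dry_run:
            subprocess.run(argv, check=True, env={**os.environ, **extra_env})


if __name__ == "__main__":
    run_plan(legacy_build_plan())
```

Keeping the plan as data separate from its execution makes it easy to review the exact commands before running anything on a build host.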

The outcome is a virtualenv deployed under /usr/share/python/zuul/.

Checking on contint1001, the embedded python modules are:

$ /usr/share/python/zuul/bin/pip list --local
APScheduler (3.0.6)
futures (3.0.5)
gear (0.7.0)
gitdb2 (2.0.5)
GitPython (2.1.11)
lockfile (0.12.2)
pip (1.5.6)
python-daemon (2.0.6)
setuptools (5.5.1)
smmap2 (2.0.5)
statsd (2.1.2)
tzlocal (1.2.2)
zuul (2.5.1-wmf11)

debian/control has a list of the requirements. Some would fail on Stretch/Buster, such as python-voluptuous, which would be too new for our version of Zuul. I guess we will have to proceed by trial and error to find the proper set of requirements.
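
As a starting point for that trial and error, the listing above could be turned into pinned requirement lines. The following is a minimal sketch: the helper name `pins_from_pip_list` and the choice to skip pip, setuptools and zuul itself are assumptions, and the dependencies declared in debian/control are not covered.

```python
import re

# Entries that belong to the virtualenv tooling or are Zuul itself;
# skipping them is an assumption of this sketch.
SKIP = {"pip", "setuptools", "zuul"}

# Matches the legacy "name (version)" format printed by pip 1.5.6.
LINE_RE = re.compile(r"^(?P<name>\S+) \((?P<version>[^)]+)\)$")

LISTING = """\
APScheduler (3.0.6)
futures (3.0.5)
gear (0.7.0)
gitdb2 (2.0.5)
GitPython (2.1.11)
lockfile (0.12.2)
pip (1.5.6)
python-daemon (2.0.6)
setuptools (5.5.1)
smmap2 (2.0.5)
statsd (2.1.2)
tzlocal (1.2.2)
zuul (2.5.1-wmf11)
"""


def pins_from_pip_list(listing, skip=SKIP):
    """Convert 'name (version)' lines into 'name==version' pins."""
    pins = []
    for line in listing.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # ignore prompts, blank lines and anything unexpected
        if match.group("name").lower() in skip:
            continue
        pins.append(f"{match.group('name')}=={match.group('version')}")
    return pins


if __name__ == "__main__":
    print("\n".join(pins_from_pip_list(LISTING)))
```

The resulting lines only record what runs on contint1001 today; whether each pin can be built as a wheel for a newer distribution still has to be checked.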

New deployment

@Paladox has made an attempt to convert to use wheels. It might not match the systems we have put in place for other software. A few repositories worth looking at for inspiration:

I guess people from SRE can be involved to help us get started on the right track.

Scap target and services

The Zuul Debian package ships two services which are configured via puppet.git.

zuul is the main daemon. It is only active on contint1001; the instance on contint2001 is masked via systemd.

zuul-merger processes the git merge commits of patches against the target branch. There is one running on each of contint1001 and contint2001.

The Zuul daemon hosts a tiny web server to expose its internal status as a JSON file. That is exposed publicly as integration.wikimedia.org/zuul/status.json and passes through an Apache proxy forwarding requests made for https://integration.wikimedia.org/zuul/ .
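
A quick way to confirm that the daemon is reachable through the proxy is to fetch that JSON document. This is a minimal sketch using only the standard library. The layout of the status document is not described in this task, so the code assumes only that the top level is a JSON object and does nothing more than list its keys. The function names are hypothetical.

```python
import json
from urllib.request import urlopen

# Public URL taken from the task description.
STATUS_URL = "https://integration.wikimedia.org/zuul/status.json"


def top_level_keys(raw):
    """Decode a status.json payload and return its sorted top-level keys."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        # Assumption of this sketch: the document is a JSON object.
        raise ValueError("expected a JSON object at the top level")
    return sorted(data)


def fetch_status_keys(url=STATUS_URL, timeout=10):
    """Fetch the status document through the proxy and list its keys."""
    with urlopen(url, timeout=timeout) as response:
        return top_level_keys(response.read())


if __name__ == "__main__":
    print(fetch_status_keys())
```

Separating the parsing from the network call means the same check can be pointed at a different host, for instance a new server before the proxy is switched over.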

Migration

Once the Debian package is built and a Buster server is available, the new version can be installed there. Then:

  • Change the Apache proxy for integration.wikimedia.org so that requests to zuul are sent to the new server instead of the locally running zuul.
  • Stop the old zuul.
  • Start the new one.
  • Point the Jenkins Gearman client to the new server.

And that might do it.


Event Timeline

Per chat with @thcipriani, I'm going to work on switching zuul to use scap.

(Assigning to me so I can implement this first)

Paladox triaged this task as Medium priority.Feb 7 2019, 2:51 AM

I have this somewhat working

paladox@gerrit-test:~/zuul2$ /usr/bin/zuul-server2 --version
Zuul version: 2.5.2.dev17

We will have to use git-fat for the python deps.

We will also have to symlink dist-packages to the zuul package.

Mentioned in SAL (#wikimedia-releng) [2019-02-07T19:05:08Z] <paladox> created integration/zuul/wheels gerrit repo for T215458

Mentioned in SAL (#wikimedia-releng) [2019-02-07T19:09:31Z] <paladox> created integration/zuul/build gerrit repo for T215458

Change 489012 had a related patch set uploaded (by Paladox; owner: Paladox):
[operations/puppet@production] [wip] zuul: Convert to using scap

https://gerrit.wikimedia.org/r/489012

Change 489012 had a related patch set uploaded (by Paladox; owner: Paladox):
[operations/puppet@production] zuul: Convert to using scap

https://gerrit.wikimedia.org/r/489012

SRE maintains two python projects deployed by scap which use Docker containers to build the wheels for the target distribution. One can seek inspiration from them:

Another potential option would be to have Zuul in a Docker container and run the scheduler and mergers on Kubernetes. But that is potentially a lot more breaking changes.

Sorry @Paladox, not meaning to step on your toes here but I am assigned to move this forward.

mmodell raised the priority of this task from Medium to High.Feb 5 2020, 6:28 PM
mmodell changed the task status from Open to Stalled.Feb 19 2020, 6:19 PM

blocked because I need to consult with @hashar before doing this

hashar changed the task status from Stalled to Open.Mar 5 2020, 8:55 AM

@mmodell and I have a meeting today about it :)

The code to deploy is in integration/zuul in branch patch-queue/debian/jessie-wikimedia. It is a fork of upstream code with a dozen or so custom patches on top of it.

Those patches are managed using git-buildpackage gbp pq, which automatically exports our patches to a Debian patch series in the branch debian/jessie-wikimedia under the directory ./debian/patches. That makes maintenance of the patches rather "easy".

From discussion with @mmodell, scap v3 needs a separate git repository (e.g. integration/zuul/deploy) which then has the code to deploy registered as a submodule.
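
The repository layout described above could be bootstrapped roughly as follows. This is a sketch under stated assumptions: the clone URL, the local directory name `deploy` and the submodule path `zuul` are illustrative and not taken from this task, and the function name is hypothetical. The script only prints the git commands.

```python
import shlex

# Assumptions for illustration only: clone URL, working directory name
# and submodule path are not specified in the task.
CODE_URL = "https://gerrit.wikimedia.org/r/integration/zuul"
CODE_BRANCH = "patch-queue/debian/jessie-wikimedia"


def deploy_repo_commands(code_url=CODE_URL, branch=CODE_BRANCH,
                         workdir="deploy", path="zuul"):
    """Return git commands creating a deploy repo with the code as a submodule."""
    return [
        ["git", "init", workdir],
        # Register the code repository as a submodule tracking the branch.
        ["git", "-C", workdir, "submodule", "add", "-b", branch, code_url, path],
        ["git", "-C", workdir, "commit", "-m", "Add zuul code as a submodule"],
    ]


if __name__ == "__main__":
    for argv in deploy_repo_commands():
        print(shlex.join(argv))
```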


The list of requirements is in the task description:

Checking on contint1001, the embedded python modules are:

$ /usr/share/python/zuul/bin/pip list --local
APScheduler (3.0.6)
futures (3.0.5)
gear (0.7.0)
gitdb2 (2.0.5)
GitPython (2.1.11)
lockfile (0.12.2)
pip (1.5.6)
python-daemon (2.0.6)
setuptools (5.5.1)
smmap2 (2.0.5)
statsd (2.1.2)
tzlocal (1.2.2)
zuul (2.5.1-wmf11)

+ other dependencies of the Debian package (defined in debian/control).

The first issue I see is that we don't have a python2-build-* image in the registry.

Eek. I guess we can try with Python 3. Our Zuul version might support it, though I haven't verified that :-( At least it has six as a dependency, which indicates that some of the code has been worked toward supporting Python 3.

Change 577846 had a related patch set uploaded (by 20after4; owner: 20after4):
[integration/zuul/deploy@master] Scap3 deploy repo for zuul

https://gerrit.wikimedia.org/r/577846

Change 579290 had a related patch set uploaded (by Hashar; owner: Hashar):
[operations/docker-images/production-images@master] python-build: add gcc etc + libssl-dev

https://gerrit.wikimedia.org/r/579290

Change 579290 abandoned by Hashar:
python-build: add gcc etc + libssl-dev

Reason:
From a discussion with Giuseppe, it seems we should instead have an intermediate build container in which we inject the build dependencies and build wheels from that.

An example is operations/software/homer/deploy:

https://gerrit.wikimedia.org/r/plugins/gitiles/operations/software/homer/deploy/+/master/Dockerfile.build

https://gerrit.wikimedia.org/r/579290
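
The "build wheels in an intermediate container, install them on the target" idea mentioned in that reason boils down to two pip invocations. The sketch below only assembles those command lines. It does not reproduce the homer Dockerfile.build, and the file name `requirements.txt`, the directory name `wheels` and both function names are assumptions.

```python
import shlex


def wheel_build_command(requirements="requirements.txt", wheel_dir="wheels"):
    """Command run inside the build container, where compilers are available."""
    return ["pip", "wheel", "--wheel-dir", wheel_dir, "-r", requirements]


def offline_install_command(requirements="requirements.txt", wheel_dir="wheels"):
    """Command run on the target, installing only from the prebuilt wheels."""
    return ["pip", "install", "--no-index", "--find-links", wheel_dir,
            "-r", requirements]


if __name__ == "__main__":
    print(shlex.join(wheel_build_command()))
    print(shlex.join(offline_install_command()))
```

With `--no-index`, the install step never contacts PyPI, so the target host needs neither network access nor a compiler toolchain.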

Change 579587 had a related patch set uploaded (by Hashar; owner: Hashar):
[operations/puppet@production] zuul: provision the scap repository

https://gerrit.wikimedia.org/r/579587

Change 580128 had a related patch set uploaded (by Hashar; owner: Hashar):
[operations/docker-images/production-images@master] Add an image for python2 app based on Buster

https://gerrit.wikimedia.org/r/580128

Change 582895 had a related patch set uploaded (by Hashar; owner: Hashar):
[labs/private@master] Add fake ssh key pair for zuul scap deployment

https://gerrit.wikimedia.org/r/582895

Change 582895 merged by Hashar:
[labs/private@master] Add fake ssh key pair for zuul scap deployment

https://gerrit.wikimedia.org/r/582895

Change 580128 merged by Giuseppe Lavagetto:
[operations/docker-images/production-images@master] Add an image for python2 app based on Buster

https://gerrit.wikimedia.org/r/580128

Change 577846 merged by Jforrester:
[integration/zuul/deploy@master] Scap3 deploy repo for zuul

https://gerrit.wikimedia.org/r/577846

Change 489012 abandoned by Hashar:
zuul: Convert to using scap

Reason:
We are going to deploy Zuul using an intermediate repo: integration/zuul/deploy :)

https://gerrit.wikimedia.org/r/489012

I say the conversion is done. The switch will be handled in the parent task T224591: Migrate contint* hosts to Buster.

Change 579587 merged by Dzahn:
[operations/puppet@production] zuul: provision the scap repository

https://gerrit.wikimedia.org/r/579587

Change 587780 had a related patch set uploaded (by Hashar; owner: Hashar):
[integration/zuul/deploy@master] scap: drop git_rev: master

https://gerrit.wikimedia.org/r/587780

Change 587780 merged by 20after4:
[integration/zuul/deploy@master] scap: drop git_rev: master

https://gerrit.wikimedia.org/r/587780