[ci,operations-puppet] upgrade to tox 4 in order to detect changed requirement files
Open, Medium, Public

Description

When running CI jobs for the operations-puppet repository, tox comes pre-installed and has run once inside the docker image:

https://gerrit.wikimedia.org/r/plugins/gitiles/integration/config/+/refs/heads/master/dockerfiles/operations-puppet/Dockerfile.template

When tox then runs again for the tests, any modifications you made to files included with -r/path/to/file are not picked up.

This seems to be due to a bug in tox <4:

https://github.com/tox-dev/tox/issues/149

This task tracks the upgrade to tox >= 4, to be done whenever possible.

Note that we are installing tox from the Debian repos, so we are tied to the versions they ship.

As a workaround, you can change the path included with -r (e.g. from a relative to an absolute path), or add the extra dependency directly at the tox level (this duplicates the entry, but forces tox to upgrade).
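
To illustrate why a change inside a requirements file can go unnoticed, here is a minimal sketch. It assumes (this is not taken from the tox source) that an environment is treated as up to date as long as the list of dependency strings is unchanged. The function names are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def deps_fingerprint(deps):
    """Hash only the dependency strings, e.g. '-rrequirements.txt'."""
    return hashlib.sha256("\n".join(deps).encode()).hexdigest()


def deps_fingerprint_with_contents(deps, base_dir):
    """Hash the dependency strings plus the contents of every -r file."""
    digest = hashlib.sha256()
    for dep in deps:
        digest.update(dep.encode())
        if dep.startswith("-r"):
            req_file = Path(base_dir) / dep[2:].strip()
            digest.update(req_file.read_bytes())
    return digest.hexdigest()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        req = Path(tmp) / "requirements.txt"
        deps = ["-rrequirements.txt"]

        req.write_text("flake8\n")
        naive_before = deps_fingerprint(deps)
        full_before = deps_fingerprint_with_contents(deps, tmp)

        # Edit the requirements file without touching the deps list.
        req.write_text("flake8\npytest\n")
        naive_after = deps_fingerprint(deps)
        full_after = deps_fingerprint_with_contents(deps, tmp)

        print("string-only fingerprint changed:", naive_before != naive_after)
        print("content-aware fingerprint changed:", full_before != full_after)
```

Under that assumption, rewriting the -r path changes the dependency string itself, which would explain why the workaround forces a reinstall.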

NOTE: Below are failures reported when running with tox==4.8.0.

py2-pep8 && nagios_common

py2-pep8: skipped because could not find python interpreter with spec(s): python2.7
py2-pep8: SKIP ⚠ in 0.12 seconds
nagios_common: skipped because could not find python interpreter with spec(s): python2.7
  nagios_common: SKIP (0.12 seconds)
  evaluation failed :( (0.25 seconds)

python2.7 is in the container:

$ podman run --rm -it --entrypoint=python2.7 docker-registry.wikimedia.org/releng/operations-puppet:0.9.0 --version
Python 2.7.16

I guess tox 4.8.0 no longer recognizes python2.7, so we can't use it anymore. I am also not sure whether anything in operations/puppet still relies on Python 2.7.

py2-pep8 is fed the files that have a python2 shebang, which yields only two scripts:

git grep -n -P '^#!.*python2'
modules/admin/files/home/ori/.binned/py:1:#!/usr/bin/env python2
modules/pybal/files/check_pybal_ipvs_diff.py:1:#!/usr/bin/env python2
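
For reference, a rough Python equivalent of the search above. It is only a sketch: unlike git grep it looks at the first line of each file only, and it walks the filesystem rather than the list of tracked files. The function names and the demo file names are made up.

```python
import re
import tempfile
from pathlib import Path

# Same pattern as the git grep above, applied to the first line only.
PY2_SHEBANG = re.compile(r"^#!.*python2")


def has_python2_shebang(path):
    """Return True if the first line of the file is a python2 shebang."""
    try:
        with open(path, "r", errors="replace") as handle:
            return bool(PY2_SHEBANG.match(handle.readline()))
    except OSError:
        return False


def find_python2_scripts(root):
    """Yield files under root whose first line is a python2 shebang."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and has_python2_shebang(path):
            yield path


if __name__ == "__main__":
    # Demo on a throwaway directory with made-up file names.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "legacy.py").write_text("#!/usr/bin/env python2\nprint 'hi'\n")
        (Path(tmp) / "modern.py").write_text("#!/usr/bin/env python3\nprint('hi')\n")
        for script in find_python2_scripts(tmp):
            print(script.name)
```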

tslua

tslua: failed with /bin/sh (resolves to /bin/sh) is not allowed, use allowlist_externals to allow it
tslua: FAIL ✖ in 0.02 seconds

My guess is that /bin/sh was previously always in the allow list.
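
A small sketch of the kind of configuration the error message asks for, together with a simplified check. The tox.ini fragment is hypothetical (the real tslua section may look different), and the check only covers commands given as absolute paths; tox's own resolution logic is more involved.

```python
import configparser

# Hypothetical fragment: the real tslua section in operations/puppet may differ.
TOX_INI = """
[testenv:tslua]
allowlist_externals = /bin/sh
commands = /bin/sh -c 'echo running lua tests'
"""


def unlisted_externals(ini_text, env):
    """Return absolute-path commands of `env` missing from allowlist_externals."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    section = parser[f"testenv:{env}"]
    # Accept comma, space or newline separated allow lists.
    allowed = set(section.get("allowlist_externals", "").replace(",", " ").split())
    programs = {
        line.split()[0]
        for line in section.get("commands", "").splitlines()
        if line.strip()
    }
    return sorted(p for p in programs if p.startswith("/") and p not in allowed)


if __name__ == "__main__":
    print(unlisted_externals(TOX_INI, "tslua"))
```

Dropping the allowlist_externals line from the fragment makes the function report /bin/sh, mirroring the failure above.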

wmcs-replica_cnf_api_service

That one passes, but the suite writes to stderr, which tox now renders on a new line and colors in red. That might be confusing:

Before (tox 3.7.0):

tox3.7.0_puppet_wmcs-replica_cnf_api_service.png (260×947 px, 59 KB)

After (tox 4.8.0):

tox4.8.0_puppet_wmcs-replica_cnf_api_service.png (260×947 px, 44 KB)
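
One way to avoid the red output is to fold stderr into stdout when starting the noisy process. The sketch below shows the idea with Python's subprocess module. The child command is a stand-in for a mock service, and this is not necessarily how the test suite itself is wired.

```python
import subprocess
import sys

# Stand-in for a mock service that logs to stderr (hypothetical command).
NOISY_CHILD = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('mock service log\\n')",
]


def run_merged(cmd):
    """Run cmd with stderr folded into stdout and return the combined text."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # send the child's stderr to the stdout pipe
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print(run_merged(NOISY_CHILD), end="")
```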

Event Timeline

dcaro triaged this task as Medium priority. Aug 29 2023, 12:52 PM
dcaro created this task.
dcaro added a subscriber: hashar.

Change 953560 had a related patch set uploaded (by Hashar; author: Hashar):

[integration/config@master] dockerfiles: operations-puppet: upgrade tox to 4.4.10

https://gerrit.wikimedia.org/r/953560

When running the currently used image, all tests pass with tox 3.7.0:

$ docker run --rm -it --workdir=/srv/workspace/puppet --entrypoint=tox docker-registry.wikimedia.org/releng/operations-puppet:0.8.12
 commit-message: commands succeeded
  admin: commands succeeded
  adminschema: commands succeeded
  py2-pep8: commands succeeded
  py3-pep8: commands succeeded
  mtail: commands succeeded
  nagios_common: commands succeeded
  grafana: commands succeeded
  sonofgridengine: commands succeeded
  tslua: commands succeeded
  smart_data_dump: commands succeeded
  alerts: commands succeeded
  openstack_puppetenc: commands succeeded
  wmcs: commands succeeded
  wmcs-replica_cnf_api_service: commands succeeded
  congratulations :)

But when I build the new image with tox 4.8.0 there are more than a few failures. I have amended the task description to list them all.

Change 954261 had a related patch set uploaded (by Hashar; author: Hashar):

[operations/puppet@production] tox: allow /bin/sh for tslua environment

https://gerrit.wikimedia.org/r/954261

Change 954264 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] nagios_common: run tests with Python 3

https://gerrit.wikimedia.org/r/954264

For python2, it looks like the issue comes from virtualenv v20.22.0 (2023-04-19):

Features - 20.22.0

Drop support for creating Python <=3.6 (including 2) interpreters.

If I downgrade virtualenv to 20.21.0 (<20.22.0), tox can find python2.7 again, so the version can be pinned in the container image ( https://gerrit.wikimedia.org/r/c/integration/config/+/953560/4..5/dockerfiles/operations-puppet/Dockerfile.template ).
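
The version constraint can be expressed as a small check. This is a sketch with hypothetical helper names. It only handles plain X.Y.Z version strings, and the cut-off comes from the virtualenv changelog entry quoted above.

```python
# First virtualenv release without Python 2 support, per the quoted changelog.
FIRST_WITHOUT_PY2 = (20, 22, 0)


def parse_version(text):
    """Turn a plain 'X.Y.Z' version string into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def can_create_python2_env(virtualenv_version):
    """True if this virtualenv release predates the Python 2 removal."""
    return parse_version(virtualenv_version) < FIRST_WITHOUT_PY2


if __name__ == "__main__":
    for version in ("20.21.0", "20.22.0"):
        print(version, can_create_python2_env(version))
```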

Change 954272 had a related patch set uploaded (by Hashar; author: David Caro):

[operations/puppet@production] wmcs-replica-cnf: redirects mock services stderr to stdout

https://gerrit.wikimedia.org/r/954272

Change 954272 merged by David Caro:

[operations/puppet@production] wmcs-replica-cnf: redirects mock services stderr to stdout

https://gerrit.wikimedia.org/r/954272

Change 954261 merged by David Caro:

[operations/puppet@production] tox: allow /bin/sh for tslua environment

https://gerrit.wikimedia.org/r/954261

Change 953560 merged by jenkins-bot:

[integration/config@master] dockerfiles: operations-puppet: upgrade tox to 4.8.0

https://gerrit.wikimedia.org/r/953560

Change 954284 had a related patch set uploaded (by Hashar; author: Hashar):

[integration/config@master] jjb: update puppet test job to use tox 4

https://gerrit.wikimedia.org/r/954284

Mentioned in SAL (#wikimedia-operations) [2023-09-01T12:44:29Z] <hashar> Updated CI Job operations-puppet-tests-buster-docker to use tox 4.8.0 # T345152

hashar claimed this task.

I have switched the job to the new image which uses virtualenv==20.21.0 and tox==4.8.0. Any upgrades past those versions would require dropping support for python 2.7 and we are not there yet.

Change 954284 merged by jenkins-bot:

[integration/config@master] jjb: update puppet test job to use tox 4

https://gerrit.wikimedia.org/r/954284

Mentioned in SAL (#wikimedia-operations) [2023-09-01T12:54:34Z] <hashar> Build /releng/operations-puppet:0.9.0 image and now updated the CI Job operations-puppet-tests-buster-docker to use tox 4.8.0 # T345152

Change 954264 merged by Majavah:

[operations/puppet@production] nagios_common: run tests with Python 3

https://gerrit.wikimedia.org/r/954264

Change 954297 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] taskgen: update for tox 4 syntax

https://gerrit.wikimedia.org/r/954297

I have rolled the image back to 0.8.12. This needs more testing and coordination with changes to rake_modules/taskgen.rb.

Mentioned in SAL (#wikimedia-releng) [2023-09-01T15:08:54Z] <jbond> Updated CI Job operations-puppet-tests-buster-docker to use 0.8.12 (tox 3.*) # T345152

This had some fallout, such as tox -e py2-pep8 being invoked with positional arguments (the list of Python files considered to be python2) (fixed by https://gerrit.wikimedia.org/r/c/operations/puppet/+/954297/1/rake_modules/taskgen.rb ).

@jbond kindly rolled back the Jenkins job update with https://gerrit.wikimedia.org/r/c/integration/config/+/954236/

We can try again on Monday morning :)

hashar renamed this task from [ci,operations-puppet] tox does not detect changes inside requirement files to [ci,operations-puppet] upgrade to tox 4 in order to detect changed requirement files. Sep 1 2023, 3:13 PM

I think we also need to consider how we can support local testing. Currently, the latest version of tox in bookworm is 3.28.0.

operations/puppet does not have a [tox:jenkins] section in its tox.ini and is thus not blocked by T345607.

I have filed a task to migrate CI / repos etc to tox v4: T345695