Blubber should support python/tox
Closed, Resolved · Public

Description

Blubber currently only supports building NodeJS/NPM Dockerfiles, but should have support for python/tox.

Event Timeline

Restricted Application added a subscriber: Aklapper. · Feb 5 2018, 6:14 PM
dduvall claimed this task. · Feb 7 2018, 6:38 PM
dduvall triaged this task as Medium priority.
dduvall moved this task from Backlog to Doing on the Release Pipeline (Blubber) board.
dduvall moved this task from Backlog to In-progress on the Release-Engineering-Team (Kanban) board.

I'm not exactly sure how we can implement the same cache efficiency for Python project dependencies as we did for Node projects. Blubber would have to either:

  1. Support parsing of tox.ini or pip requirement/constraint files to determine which files contain requirements (they are highly configurable and potentially indirect, e.g. using -r other.txt within a requirements file)
  2. Provide additional config fields that tell Blubber which files contain requirements, and thus how to construct the optimized COPY instructions
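
To make option 1 a bit more concrete, here is a minimal Python sketch of following -r / -c references from one requirements file and turning the result into COPY lines. It is not Blubber code (Blubber is not written in Python). The function names collect_requirement_files and copy_instructions are hypothetical, and the line matching is a rough approximation of pip's requirements file format rather than a full parser (no line continuations, no environment variable expansion, no URLs).

```python
import re
from pathlib import Path

# Rough match for "-r other.txt", "--requirement other.txt",
# "-c constraints.txt" and "--constraint=constraints.txt".
_INCLUDE_RE = re.compile(r"^(?:-r|--requirement|-c|--constraint)(?:\s+|=)?(\S+)")


def collect_requirement_files(entry, seen=None):
    """Return `entry` plus every file it references, in discovery order."""
    seen = [] if seen is None else seen
    path = Path(entry).resolve()
    # Skip files already visited (guards against include cycles) or missing.
    if path in seen or not path.is_file():
        return seen
    seen.append(path)
    for raw in path.read_text().splitlines():
        # Drop trailing " # comment" text before matching.
        line = raw.split(" #", 1)[0].strip()
        match = _INCLUDE_RE.match(line)
        if match:
            # Nested references are resolved relative to the including file.
            collect_requirement_files(path.parent / match.group(1), seen)
    return seen


def copy_instructions(files, context):
    """Build one COPY line per file, using paths relative to the build context."""
    root = Path(context).resolve()
    lines = []
    for path in files:
        rel = path.relative_to(root).as_posix()
        lines.append(f"COPY {rel} {rel}")
    return lines
```

Copying only these files before the dependency install step is what would let Docker reuse the install layer until one of them changes. Option 2 would skip the parsing and take the file list straight from config.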

@thcipriani, @hashar any thoughts on this?

For an example of what I mean, look at scap's Dockerfile.ci.

random thoughts / brain dump

tox defines several virtual envs and the commands to be executed within them. The environments are run serially. Dependencies are installed via pip, based on the deps statement of each virtualenv, or sometimes via python setup.py, in which case the dependencies are defined inside the setup.py file. Most of the time we use -r requirements.txt.

tox invokes pip and passes it whatever is in deps; pip then downloads the tarball. At least for packages that require binary compilation, pip generates a wheel (a binary package). That way, the next time the same dependency is requested, pip does not have to download it from PyPI and does not have to recompile a previously compiled module. They are all saved under:

  • XDG_CACHE_HOME/pip/http
  • XDG_CACHE_HOME/pip/wheels
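
As a small illustration of where those two directories end up, here is a Python sketch that resolves them from the environment. It assumes the variable defined by the XDG Base Directory spec (XDG_CACHE_HOME) and the usual Linux fallback of ~/.cache when it is unset. The helper name pip_cache_dirs is made up for this example, and the http / wheels subdirectory names are simply the ones listed above.

```python
import os
from pathlib import Path


def pip_cache_dirs(environ=None):
    """Return the pip cache subdirectories for a given environment mapping."""
    environ = os.environ if environ is None else environ
    # Assumption: fall back to ~/.cache when XDG_CACHE_HOME is unset or empty.
    base = environ.get("XDG_CACHE_HOME") or os.path.join(
        os.path.expanduser("~"), ".cache"
    )
    root = Path(base) / "pip"
    return {"http": root / "http", "wheels": root / "wheels"}
```

For example, pip_cache_dirs({"XDG_CACHE_HOME": "/cache"}) points at /cache/pip/http and /cache/pip/wheels, which are the directories one would save and restore between CI runs.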

Side note: as far as I understand, the cache keeps growing. For bundler we went with bundle install --clean, which automagically garbage collects installed gems that are no longer in the Gemfile.

The virtual envs are generated in ./.tox. When tox is run again, it would check whether deps has changed and if so rebuild the environment. However:

  • tox is not smart enough to detect a change in a file referenced by -r. In other words: if we were to save the tox virtualenv and someone later changed requirements.txt, tox would not notice.
  • the dependencies are typically unpinned (e.g. flake8) with no lockfile (pip freeze), so the versions that get installed depend on the state of PyPI at install time.
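
One possible workaround for the first point, done outside tox since tox does not do this itself, would be to fingerprint the requirements files and throw the saved .tox away when the fingerprint changes. A minimal Python sketch, with hypothetical function names and a hypothetical stamp file:

```python
import hashlib
from pathlib import Path


def requirements_fingerprint(paths):
    """Hash the names and contents of the given requirements files."""
    digest = hashlib.sha256()
    for path in sorted(Path(p) for p in paths):
        digest.update(path.as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def venv_is_stale(paths, stamp_file):
    """Return True (and refresh the stamp) when the fingerprint has changed."""
    stamp = Path(stamp_file)
    current = requirements_fingerprint(paths)
    if stamp.is_file() and stamp.read_text().strip() == current:
        return False
    stamp.write_text(current + "\n")
    return True
```

A CI wrapper could call venv_is_stale before tox and remove .tox (or pass --recreate) when it returns True. This does nothing for the second point: with unpinned dependencies the files can stay identical while PyPI moves on.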

The way I did the cache on Nodepool and Docker is to just save XDG_CACHE_HOME and start running tox from a fresh checkout (i.e. without a .tox). There is the overhead of reinstalling everything, but given that wheels are cached, that is reasonably fast. One drawback: the cache keeps expanding :-(

Twist: one can create all the environments without running the test commands by using: tox --notest. Potentially that could be used as a baseline for caching, but then it would not be updated when running tox later :(

tox not picking up changes in requirements.txt is tracked upstream as https://github.com/tox-dev/tox/issues/149