
Blubber should support python/tox
Closed, Resolved (Public)

Description

Blubber currently only supports building Node.js/npm Dockerfiles, but it should also support Python/tox.

Revisions and Commits

rGBLBR Blubber
Restricted Differential Revision

Event Timeline

dduvall triaged this task as Medium priority.
dduvall moved this task from Backlog to Doing on the Release Pipeline (Blubber) board.

I'm not exactly sure how we can implement the same cache efficiency for Python project dependencies as we did for Node projects. Blubber would have to either:

  1. Support parsing of tox.ini or pip requirement/constraint files to determine which files contain requirements (these are highly configurable and can be indirect, e.g. -r other.txt within a requirements file)
  2. Provide additional config fields that tell Blubber which files contain requirements, and thus how to construct the optimized COPY instructions
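
To make option 1 concrete, here is a minimal Python sketch of following -r / -c references from one requirements file to the others it pulls in. The function name collect_requirement_files is hypothetical, it is not part of Blubber, and it only handles the common spellings (-r file, --requirement file, --requirement=file and the constraint equivalents), not everything pip accepts. It assumes nested paths are resolved relative to the file that references them.

```python
import re
from pathlib import Path

# Common spellings only: "-r file", "--requirement file", "--requirement=file",
# plus the -c / --constraint equivalents. This is an illustrative subset.
_INCLUDE = re.compile(r"^(?:-r|--requirement|-c|--constraint)(?:\s+|=)(\S+)")


def collect_requirement_files(entry, seen=None):
    """Return the entry file plus every file it references, in discovery order."""
    seen = [] if seen is None else seen
    path = Path(entry).resolve()
    # Skip files already visited (guards against include cycles) or missing.
    if path in seen or not path.is_file():
        return seen
    seen.append(path)
    for raw in path.read_text().splitlines():
        # Drop trailing " # comment" text before matching.
        line = raw.split(" #", 1)[0].strip()
        match = _INCLUDE.match(line)
        if match:
            # Assumption: nested paths are relative to the referencing file.
            collect_requirement_files(path.parent / match.group(1), seen)
    return seen
```

The resulting list is roughly what option 2 would ask the project to spell out by hand in its config.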

@thcipriani, @hashar any thoughts on this?

random thoughts / brain dump

tox defines several virtual envs and the commands to be executed within them. Each such configuration is run serially. The dependencies are installed via pip based on the deps statement of each virtualenv, or sometimes via python setup.py, in which case the dependencies are defined inside the setup.py file. Most of the time we use -r requirements.txt.

tox invokes pip and passes it whatever is in deps; pip then downloads the tarball. At least for packages that require binary compilation, pip generates a wheel (a binary package). This way, the next time the same dependency is requested, pip does not have to download it from PyPI and does not have to recompile a previously compiled module. They are all saved under:

  • XDG_CACHE_HOME/pip/http
  • XDG_CACHE_HOME/pip/wheels
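
As a rough illustration of where those two directories end up, the sketch below resolves them from the environment. It assumes a Linux layout where the XDG cache variable is XDG_CACHE_HOME with a fallback of ~/.cache. The helper name pip_cache_dirs is hypothetical, and the http and wheels subdirectory names are simply taken from the list above.

```python
import os
from pathlib import Path


def pip_cache_dirs(environ=None):
    """Return the pip cache subdirectories, assuming an XDG-style Linux layout."""
    environ = os.environ if environ is None else environ
    # Assumption: fall back to ~/.cache when XDG_CACHE_HOME is unset or empty.
    base = environ.get("XDG_CACHE_HOME") or str(Path.home() / ".cache")
    root = Path(base) / "pip"
    return {"http": root / "http", "wheels": root / "wheels"}
```

Passing an explicit mapping, e.g. pip_cache_dirs({"XDG_CACHE_HOME": "/cache"}), makes it easy to point a build at a shared cache volume.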

Side note: the cache keeps growing, as far as I understand. For bundler we went with bundle install --clean, which automagically garbage collects installed gems that are no longer in the Gemfile.

The virtual envs are generated in ./.tox. When tox is run again, it would check whether deps has changed and if so rebuild the environment. However:

  • tox is not smart enough to detect a change in a file referenced by -r. In other words: if we were to save the tox virtualenv, tox would not notice when someone changes requirements.txt.
  • the dependencies are typically unbound (e.g. flake8) with no lockfiles (pip freeze), so the versions that get installed depend on the state of PyPI.
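
One way to work around the first point, sketched here as an assumption rather than anything tox or Blubber provides, is to fingerprint the requirement files and discard a saved .tox when the fingerprint changes. The function names and the stamp file are hypothetical. This does nothing for the second point: unpinned versions can still drift on PyPI without any file changing.

```python
import hashlib
from pathlib import Path


def requirements_fingerprint(paths):
    """Hash the names and contents of the given requirement files."""
    digest = hashlib.sha256()
    for path in sorted(Path(p) for p in paths):
        digest.update(path.name.encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def needs_rebuild(paths, stamp_file):
    """True when the stamp is missing or differs from the current fingerprint."""
    stamp = Path(stamp_file)
    if not stamp.is_file():
        return True
    return stamp.read_text().strip() != requirements_fingerprint(paths)


def record_fingerprint(paths, stamp_file):
    """Store the current fingerprint after the environment has been rebuilt."""
    Path(stamp_file).write_text(requirements_fingerprint(paths))
```

A wrapper would call needs_rebuild before reusing a cached .tox, and record_fingerprint after a fresh tox run.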

The way I did the cache on Nodepool and Docker is to just save XDG_CACHE_HOME and start running tox from a fresh checkout (i.e. without a .tox). There is the overhead of reinstalling everything, but given that wheels are cached, that is reasonably fast. One drawback: the cache keeps expanding :-(

Twist: one can create all the environments without running the test commands by using: tox --notest. Potentially that could be used as a baseline for caching, but then it would not be updated when running tox later :(

dduvall added a revision: Restricted Differential Revision. Feb 14 2018, 6:50 PM