Blubber currently only supports building NodeJS/NPM Dockerfiles, but it should also support Python/tox.
I'm not exactly sure how we can implement the same cache efficiency for Python project dependencies as we did for Node projects. Blubber would have to either:
- Support parsing of tox.ini or pip requirements/constraints files to determine which files contain requirements (these are highly configurable and can be indirect, e.g. via -r other.txt inside a requirements file)
- Provide additional config fields to help Blubber know which files contain requirements and thus how to construct the optimized COPY instructions
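To give an idea of what the first option involves, here is a minimal sketch of following -r / -c references from one requirements file to the others. The function name collect_requirement_files is hypothetical and nothing here is Blubber code. It only handles the common "-r file" and "--requirement=file" spellings; line continuations, environment variables and URLs are ignored.

```python
import re
from pathlib import Path

# Matches "-r other.txt", "--requirement other.txt", "-c constraints.txt"
# and the "--constraint=constraints.txt" form. Other spellings are not handled.
_INCLUDE_RE = re.compile(r"^\s*(?:-r|--requirement|-c|--constraint)(?:\s+|=)(\S+)")


def collect_requirement_files(entry, seen=None):
    """Return entry plus every file it references via -r/-c, depth-first.

    Hypothetical helper: the result is the list of files that would need
    to be copied into the image before dependencies are installed.
    """
    seen = [] if seen is None else seen
    path = Path(entry).resolve()
    # Skip files already visited (guards against cycles) and missing files.
    if path in seen or not path.is_file():
        return seen
    seen.append(path)
    for line in path.read_text().splitlines():
        match = _INCLUDE_RE.match(line)
        if match:
            # Nested references are taken relative to the including file.
            collect_requirement_files(path.parent / match.group(1), seen)
    return seen
```

Calling collect_requirement_files("requirements.txt") in a checkout would list requirements.txt followed by whatever it pulls in. The second option (explicit config fields) avoids all of this parsing at the cost of the user maintaining the list by hand.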
random thoughts / brain dump
tox defines several virtualenvs and the commands to be executed within them. Each such configuration is run serially. The dependencies are installed via pip based on the deps statement of each virtualenv, or sometimes via python setup.py, in which case the dependencies are defined inside setup.py. Most of the time we use -r requirements.txt.
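As an illustration of the deps statements, this sketch reads them out of a tox.ini with the standard library configparser. The function name tox_deps is made up for the example, and it deliberately ignores tox substitutions such as {toxinidir} and factor-conditional lines, so treat it as a rough approximation of what tox itself does.

```python
import configparser


def tox_deps(ini_text):
    """Map each testenv section of a tox.ini to its list of deps lines.

    Hypothetical helper: tox substitutions and conditional factors are
    returned verbatim rather than expanded.
    """
    # Disable interpolation so "%" characters in tox.ini are left untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    deps = {}
    for section in parser.sections():
        if section == "testenv" or section.startswith("testenv:"):
            raw = parser.get(section, "deps", fallback="")
            deps[section] = [line.strip() for line in raw.splitlines() if line.strip()]
    return deps


example = """\
[tox]
envlist = py3,lint

[testenv]
deps =
    -r requirements.txt
    pytest
commands = pytest

[testenv:lint]
deps = flake8
commands = flake8
"""

for env, lines in tox_deps(example).items():
    print(env, lines)
```

Any entry starting with -r points at a file that has to be present before pip runs, which is where the indirection mentioned above comes from.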
tox invokes pip and passes it whatever is in deps; pip then downloads the tarball. At least for packages that require binary compilation, pip generates a wheel (a binary package). This way, the next time the same dependency is requested, pip does not have to download it from PyPI and does not have to recompile a previously compiled module. They are all saved under:
Side note: the cache keeps growing, as far as I understand. For bundler we went with bundle install --clean, which automagically garbage collects installed gems that are no longer in the Gemfile.
The virtualenvs are generated in ./.tox. When tox is run again, it checks whether deps has changed and, if so, rebuilds the environment. However:
- tox is not smart enough to detect a change in a file referenced by -r. Put another way: if we were to save the tox virtualenv, tox would not notice when someone changes requirements.txt.
- the dependencies are typically unpinned (e.g. flake8) with no lockfile (pip freeze), so the versions that get installed depend on the state of PyPI at install time.
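One possible workaround for the first point, sketched under the assumption that we know which files carry requirements: hash their contents and use the digest as a cache key, so a saved .tox is thrown away whenever any of them changes. The name deps_fingerprint is hypothetical. This does nothing for the second point, since unpinned versions can drift on PyPI without any file changing.

```python
import hashlib
from pathlib import Path


def deps_fingerprint(paths):
    """Return a SHA-256 hex digest over the names and contents of paths.

    Hypothetical helper: the digest changes whenever a listed file changes,
    so it can serve as a key for a cached virtualenv.
    """
    digest = hashlib.sha256()
    # Sort so the result does not depend on the order the caller used.
    for path in sorted(Path(p) for p in paths):
        digest.update(path.name.encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()
```

For example, deps_fingerprint(["tox.ini", "requirements.txt"]) could be compared with a value stored next to the cached environment, rebuilding on mismatch.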
The way I did the cache on Nodepool and Docker is to just save XDG_CACHE_HOME and start running tox from a fresh checkout (i.e. without a .tox). There is the overhead of reinstalling everything, but given that wheels are cached it is reasonably fast. One drawback: the cache keeps expanding :-(
Twist: one can create all the environments without running the test commands by using tox --notest. Potentially that could be used as a baseline for caching, but the baseline would then not be updated when tox is run later :(