Evaluate angry-caching-proxy as a package manager cache
Closed, Resolved, Public

Description

The node project https://www.npmjs.com/package/angry-caching-proxy claims to provide a transparent proxy for rubygems/pip/npm. By caching the tarballs, it would give us some level of caching for T112560: [tracking] Disposable VMs need a cache for package managers.

Event Timeline

hashar raised the priority of this task from to Needs Triage.
hashar updated the task description.

I have created angry-caching-proxy in the integration project for that.

Applied puppet class role::labs::lvm::srv, then:

  1. apt-get install nodejs nodejs-legacy npm
  2. adduser --system --home /var/lib/angry-caching-proxy --disabled-login angry-caching-proxy
  3. mkdir -p /srv/angry-caching-proxy/cache
  4. chown -R angry-caching-proxy /srv/angry-caching-proxy

hashar$ sudo su --login --shell /bin/bash angry-caching-proxy
angry-caching-proxy$ npm install angry-caching-proxy
angry-caching-proxy$ node_modules/.bin/angry-caching-proxy --port 8080 --directory /srv/angry-caching-proxy/cache --triggers npm --triggers pypi --triggers rubygems

It can also be configured via /etc/angry-caching-proxy/config.json. On startup it prints its effective configuration:

CONFIG { directory: '/srv/angry-caching-proxy/cache',
  customTriggers: '/etc/angry-caching-proxy/triggers.js',
  port: 8080,
  workers: 2,
  triggers: [ 'npm', 'pypi', 'rubygems' ],
  _: [],
  p: 8080,
  d: '/srv/angry-caching-proxy/cache',
  t: [ 'npm', 'pypi', 'rubygems' ],
  '$0': 'node ./node_modules/.bin/angry-caching-proxy',
  triggerFns: 
   [ [Function: isNodeModuleRequest],
     [Function: isPypiRequest],
     [Function: isRubyGemRequest] ] }
Started PID 11995
Proxy and cache view is at http://localhost:8080
Started PID 11996
Proxy and cache view is at http://localhost:8080
Using cache directory /srv/angry-caching-proxy/cache
Using cache directory /srv/angry-caching-proxy/cache

So, in theory, all a client needs now is http_proxy=http://10.68.19.184:8080.

Example with npm:

$ http_proxy=http://10.68.19.184:8080 npm install  --registry http://registry.npmjs.org/ jshint 
jshint@2.8.0 node_modules/jshint
├── strip-json-comments@1.0.4
├── exit@0.1.2
├── console-browserify@1.1.0 (date-now@0.1.4)
├── shelljs@0.3.0
├── minimatch@2.0.10 (brace-expansion@1.1.0)
├── cli@0.6.6 (glob@3.2.11)
├── htmlparser2@3.8.3 (domelementtype@1.3.0, entities@1.0.0, domhandler@2.3.0, readable-stream@1.1.13, domutils@1.5.1)
└── lodash@3.7.0

Server traces:

angry-caching-proxy@angry-caching-proxy:~$ node_modules/.bin/angry-caching-proxy --help
Start Argry Caching Proxy.

Usage: node ./node_modules/.bin/angry-caching-proxy

Options:
  --port           Port to listen                                                                  
  --directory, -d  Directory where to write cached files                                           
  --triggers       Triggers to activate. Can be defined multiple times.                            
  --workers        How many node.js processes to use as workes. Defaults to machine cpu core count.

angry-caching-proxy@angry-caching-proxy:~$ node_modules/.bin/angry-caching-proxy --port 8080 --directory /srv/angry-caching-proxy/cache --triggers npm --triggers pypi --triggers rubygems
CONFIG { directory: '/srv/angry-caching-proxy/cache',
  customTriggers: '/etc/angry-caching-proxy/triggers.js',
  port: 8080,
  workers: 2,
  triggers: [ 'npm', 'pypi', 'rubygems' ],
  _: [],
  p: 8080,
  d: '/srv/angry-caching-proxy/cache',
  t: [ 'npm', 'pypi', 'rubygems' ],
  '$0': 'node ./node_modules/.bin/angry-caching-proxy',
  triggerFns: 
   [ [Function: isNodeModuleRequest],
     [Function: isPypiRequest],
     [Function: isRubyGemRequest] ] }
Started PID 11995
Proxy and cache view is at http://localhost:8080
Started PID 11996
Proxy and cache view is at http://localhost:8080
Using cache directory /srv/angry-caching-proxy/cache
Using cache directory /srv/angry-caching-proxy/cache

Proxying GET http://registry.npmjs.org/jshint
Cache miss GET http://registry.npmjs.org/jshint/-/jshint-2.8.0.tgz
Cache CREATED in 60 ms for GET http://registry.npmjs.org/jshint/-/jshint-2.8.0.tgz a356343d3ecd2c6c64f68b85435e433f178718ad
Proxying GET http://registry.npmjs.org/cli
Proxying GET http://registry.npmjs.org/console-browserify
Proxying GET http://registry.npmjs.org/minimatch
Proxying GET http://registry.npmjs.org/exit
Proxying GET http://registry.npmjs.org/htmlparser2
Proxying GET http://registry.npmjs.org/shelljs
Proxying GET http://registry.npmjs.org/strip-json-comments
Proxying GET http://registry.npmjs.org/lodash
Cache miss GET http://registry.npmjs.org/console-browserify/-/console-browserify-1.1.0.tgz
Cache CREATED in 13 ms for GET http://registry.npmjs.org/console-browserify/-/console-browserify-1.1.0.tgz e9229d6990d19cee030f09b2c3026eed1ae12929
Cache miss GET http://registry.npmjs.org/cli/-/cli-0.6.6.tgz
Cache miss GET http://registry.npmjs.org/shelljs/-/shelljs-0.3.0.tgz
Cache miss GET http://registry.npmjs.org/htmlparser2/-/htmlparser2-3.8.3.tgz
Cache CREATED in 15 ms for GET http://registry.npmjs.org/cli/-/cli-0.6.6.tgz a23644bcea7994d8b402813df6d811b64ef1a18d
Cache miss GET http://registry.npmjs.org/exit/-/exit-0.1.2.tgz
Cache CREATED in 19 ms for GET http://registry.npmjs.org/shelljs/-/shelljs-0.3.0.tgz ae7990b0e43cd1e5a5bd96e08d4f99b1fb93ca2c
Cache miss GET http://registry.npmjs.org/minimatch/-/minimatch-2.0.10.tgz
Cache miss GET http://registry.npmjs.org/strip-json-comments/-/strip-json-comments-1.0.4.tgz
Cache miss GET http://registry.npmjs.org/lodash/-/lodash-3.7.0.tgz
Cache CREATED in 32 ms for GET http://registry.npmjs.org/htmlparser2/-/htmlparser2-3.8.3.tgz c1dabac327d9468a7c8e3c9357ebabed9f59f3f7
Cache CREATED in 30 ms for GET http://registry.npmjs.org/exit/-/exit-0.1.2.tgz 2e2d8f2767ad87d82c15829b0d7225633ec88d7b
Cache CREATED in 26 ms for GET http://registry.npmjs.org/strip-json-comments/-/strip-json-comments-1.0.4.tgz 2b303d58afdb68e05924f5a02759adeab3be0a7c
Cache CREATED in 31 ms for GET http://registry.npmjs.org/minimatch/-/minimatch-2.0.10.tgz 18f83dee9fb8212ee765a7ef71e39323ec19d8bb
Cache CREATED in 25 ms for GET http://registry.npmjs.org/lodash/-/lodash-3.7.0.tgz f7b56a9d40c572e5d0c25631c80690ad401013a2
Proxying GET http://registry.npmjs.org/date-now
Cache miss GET http://registry.npmjs.org/date-now/-/date-now-0.1.4.tgz
Cache CREATED in 20 ms for GET http://registry.npmjs.org/date-now/-/date-now-0.1.4.tgz 48838c4589ac57b3406e8e998dd7bd48cacbe713
Proxying GET http://registry.npmjs.org/brace-expansion
Cache miss GET http://registry.npmjs.org/brace-expansion/-/brace-expansion-1.1.0.tgz
Cache CREATED in 15 ms for GET http://registry.npmjs.org/brace-expansion/-/brace-expansion-1.1.0.tgz d32eac7c400fdfefccded42245e67f620062bcd7
Proxying GET http://registry.npmjs.org/glob
Cache miss GET http://registry.npmjs.org/glob/-/glob-3.2.11.tgz
Cache CREATED in 15 ms for GET http://registry.npmjs.org/glob/-/glob-3.2.11.tgz 1fc68b9cc88f631dd93f0690083519fb36cf9395
Proxying GET http://registry.npmjs.org/balanced-match
Proxying GET http://registry.npmjs.org/concat-map
Cache miss GET http://registry.npmjs.org/balanced-match/-/balanced-match-0.2.0.tgz
Cache miss GET http://registry.npmjs.org/concat-map/-/concat-map-0.0.1.tgz
Cache CREATED in 20 ms for GET http://registry.npmjs.org/concat-map/-/concat-map-0.0.1.tgz 554d6ca06bbbda67ee8df2c3d50ccf42a947b1dd
Cache CREATED in 24 ms for GET http://registry.npmjs.org/balanced-match/-/balanced-match-0.2.0.tgz 0bfd3a7e9a0ee6fc61fe612150c60f625800f8b6
Proxying GET http://registry.npmjs.org/inherits
Cache miss GET http://registry.npmjs.org/minimatch/-/minimatch-0.3.0.tgz
Cache CREATED in 14 ms for GET http://registry.npmjs.org/minimatch/-/minimatch-0.3.0.tgz 5b07b1508fe05e5b32ade718ec7d0985174e8a83
Cache miss GET http://registry.npmjs.org/inherits/-/inherits-2.0.1.tgz
Cache CREATED in 15 ms for GET http://registry.npmjs.org/inherits/-/inherits-2.0.1.tgz 15f765bc89f63b4b62bc57f9189ad49a05984b19
Proxying GET http://registry.npmjs.org/domhandler
Proxying GET http://registry.npmjs.org/domelementtype
Proxying GET http://registry.npmjs.org/domutils
Proxying GET http://registry.npmjs.org/entities
Proxying GET http://registry.npmjs.org/readable-stream
Cache miss GET http://registry.npmjs.org/domhandler/-/domhandler-2.3.0.tgz
Cache CREATED in 23 ms for GET http://registry.npmjs.org/domhandler/-/domhandler-2.3.0.tgz 85429560190f2a9801b327f50de0961d0ac2978f
Cache miss GET http://registry.npmjs.org/domutils/-/domutils-1.5.1.tgz
Cache miss GET http://registry.npmjs.org/domelementtype/-/domelementtype-1.3.0.tgz
Cache miss GET http://registry.npmjs.org/readable-stream/-/readable-stream-1.1.13.tgz
Cache miss GET http://registry.npmjs.org/entities/-/entities-1.0.0.tgz
Cache CREATED in 20 ms for GET http://registry.npmjs.org/domelementtype/-/domelementtype-1.3.0.tgz 4145c0aede76a971c6864a5905ad312677abec84
Cache CREATED in 24 ms for GET http://registry.npmjs.org/domutils/-/domutils-1.5.1.tgz 6a8e3896fec6a278d09e65b3b182c1b1b48aba72
Cache CREATED in 20 ms for GET http://registry.npmjs.org/entities/-/entities-1.0.0.tgz bf66c0f64ff735e66f6628aa87b16a96d0373a10
Cache CREATED in 30 ms for GET http://registry.npmjs.org/readable-stream/-/readable-stream-1.1.13.tgz 771e41c15c5667339e63979b996fc3fa877f1fa5
Proxying GET http://registry.npmjs.org/lru-cache
Proxying GET http://registry.npmjs.org/sigmund
Cache miss GET http://registry.npmjs.org/lru-cache/-/lru-cache-2.7.0.tgz
Cache miss GET http://registry.npmjs.org/sigmund/-/sigmund-1.0.1.tgz
Cache CREATED in 20 ms for GET http://registry.npmjs.org/lru-cache/-/lru-cache-2.7.0.tgz 27a2f986c4612863d0dae4f69832156586be1c7b
Cache CREATED in 18 ms for GET http://registry.npmjs.org/sigmund/-/sigmund-1.0.1.tgz 68e16624dd48006e6b3983ca584cb66cabb94dcc
Proxying GET http://registry.npmjs.org/dom-serializer
Cache miss GET http://registry.npmjs.org/dom-serializer/-/dom-serializer-0.1.0.tgz
Cache CREATED in 14 ms for GET http://registry.npmjs.org/dom-serializer/-/dom-serializer-0.1.0.tgz 8b15057ec6a5c85413357325862adf033812b738
Proxying GET http://registry.npmjs.org/string_decoder
Proxying GET http://registry.npmjs.org/core-util-is
Proxying GET http://registry.npmjs.org/isarray
Cache miss GET http://registry.npmjs.org/core-util-is/-/core-util-is-1.0.1.tgz
Cache miss GET http://registry.npmjs.org/string_decoder/-/string_decoder-0.10.31.tgz
Cache miss GET http://registry.npmjs.org/isarray/-/isarray-0.0.1.tgz
Cache CREATED in 13 ms for GET http://registry.npmjs.org/core-util-is/-/core-util-is-1.0.1.tgz 6501a73028510ce93950b4f0699618ed2523b7c6
Cache miss GET http://registry.npmjs.org/domelementtype/-/domelementtype-1.1.3.tgz
Cache CREATED in 27 ms for GET http://registry.npmjs.org/string_decoder/-/string_decoder-0.10.31.tgz 26c5df6ef3f0559e95288db3392622d59539e535
Cache CREATED in 23 ms for GET http://registry.npmjs.org/isarray/-/isarray-0.0.1.tgz d027c0743835cc5ce67ad6cf40d4d67da4c96006
Cache miss GET http://registry.npmjs.org/entities/-/entities-1.1.1.tgz
Cache CREATED in 20 ms for GET http://registry.npmjs.org/domelementtype/-/domelementtype-1.1.3.tgz 944ab26dafe9cc83ff327ec999b276327f84c9d0
Cache CREATED in 19 ms for GET http://registry.npmjs.org/entities/-/entities-1.1.1.tgz 4c1aabd4bbad60ad666de9b0d6c9df8417524bbe

On a second run, the server only logs the registry metadata requests, with no tarball requests:

Proxying GET http://registry.npmjs.org/jshint
Proxying GET http://registry.npmjs.org/cli
Proxying GET http://registry.npmjs.org/console-browserify
Proxying GET http://registry.npmjs.org/htmlparser2
Proxying GET http://registry.npmjs.org/exit
Proxying GET http://registry.npmjs.org/minimatch
Proxying GET http://registry.npmjs.org/shelljs
Proxying GET http://registry.npmjs.org/strip-json-comments
Proxying GET http://registry.npmjs.org/lodash
Proxying GET http://registry.npmjs.org/date-now
Proxying GET http://registry.npmjs.org/brace-expansion
Proxying GET http://registry.npmjs.org/glob
Proxying GET http://registry.npmjs.org/balanced-match
Proxying GET http://registry.npmjs.org/concat-map
Proxying GET http://registry.npmjs.org/inherits
Proxying GET http://registry.npmjs.org/domhandler
Proxying GET http://registry.npmjs.org/entities
Proxying GET http://registry.npmjs.org/domutils
Proxying GET http://registry.npmjs.org/domelementtype
Proxying GET http://registry.npmjs.org/readable-stream
Proxying GET http://registry.npmjs.org/lru-cache
Proxying GET http://registry.npmjs.org/sigmund
Proxying GET http://registry.npmjs.org/dom-serializer
Proxying GET http://registry.npmjs.org/core-util-is
Proxying GET http://registry.npmjs.org/isarray
Proxying GET http://registry.npmjs.org/string_decoder

The same works with pip, which needs the plain HTTP version of the index, hence:

$ http_proxy=http://10.68.19.184:8080  pip install --index-url http://pypi.python.org/simple PyYAML
Downloading/unpacking PyYAML
  http://pypi.python.org/simple/PyYAML/ uses an insecure transport scheme (http). Consider using https if pypi.python.org has it available
  Downloading PyYAML-3.11.tar.gz (248kB): 248kB downloaded
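
Both client invocations in this task follow the same pattern: set http_proxy and point the tool at the plain HTTP index. A minimal sketch of wrapping that in Python, for instance for a CI job. The helper names build_install and install are made up for illustration; the proxy address and index URLs are the ones used in this task.

```python
import os
import subprocess

PROXY = "http://10.68.19.184:8080"  # proxy address used in this task

# Plain HTTP index URLs, as in the commands in this task, so that the
# requests can be intercepted by the caching proxy.
COMMANDS = {
    "npm": ["npm", "install", "--registry", "http://registry.npmjs.org/"],
    "pip": ["pip", "install", "--index-url", "http://pypi.python.org/simple"],
}


def build_install(tool, package, proxy=PROXY):
    """Return (argv, env) for installing `package` through the proxy."""
    argv = COMMANDS[tool] + [package]
    # Copy the current environment and add http_proxy on top of it.
    env = dict(os.environ, http_proxy=proxy)
    return argv, env


def install(tool, package):
    """Run the install command and return its exit code."""
    argv, env = build_install(tool, package)
    return subprocess.call(argv, env=env)
```

Calling install("pip", "PyYAML") would then be equivalent to the pip command line shown above.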

Server side:

Proxying GET http://pypi.python.org/simple/PyYAML/
Cache miss GET http://pypi.python.org/packages/source/P/PyYAML/PyYAML-3.11.tar.gz
Cache CREATED in 112 ms for GET http://pypi.python.org/packages/source/P/PyYAML/PyYAML-3.11.tar.gz cc81d2dd5b06597dc278d0d0a2266a9b8817740a

Second attempt:

Proxying GET http://pypi.python.org/simple/PyYAML/
Cache hit for GET http://pypi.python.org/packages/source/P/PyYAML/PyYAML-3.11.tar.gz cc81d2dd5b06597dc278d0d0a2266a9b8817740a
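
The server log lines seen so far come in four shapes: "Proxying GET", "Cache miss GET", "Cache CREATED in N ms for GET" and "Cache hit for GET". A rough sketch for tallying them from saved server output, useful to get a hit ratio once real jobs use the proxy. The regular expressions are derived only from the traces pasted in this task, so other line shapes the proxy may emit are simply ignored; the function name summarize is illustrative.

```python
import re
from collections import Counter

# Line shapes copied from the server traces in this task.
PATTERNS = {
    "proxied": re.compile(r"^Proxying GET (\S+)$"),
    "miss": re.compile(r"^Cache miss GET (\S+)$"),
    "created": re.compile(r"^Cache CREATED in (\d+) ms for GET (\S+) ([0-9a-f]{40})$"),
    "hit": re.compile(r"^Cache hit for GET (\S+) ([0-9a-f]{40})$"),
}


def summarize(lines):
    """Count how many log lines match each known shape."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        for kind, pattern in PATTERNS.items():
            if pattern.match(line):
                counts[kind] += 1
                break
    return counts


# The two PyYAML runs from this task, concatenated.
SAMPLE = """\
Proxying GET http://pypi.python.org/simple/PyYAML/
Cache miss GET http://pypi.python.org/packages/source/P/PyYAML/PyYAML-3.11.tar.gz
Cache CREATED in 112 ms for GET http://pypi.python.org/packages/source/P/PyYAML/PyYAML-3.11.tar.gz cc81d2dd5b06597dc278d0d0a2266a9b8817740a
Proxying GET http://pypi.python.org/simple/PyYAML/
Cache hit for GET http://pypi.python.org/packages/source/P/PyYAML/PyYAML-3.11.tar.gz cc81d2dd5b06597dc278d0d0a2266a9b8817740a
"""

if __name__ == "__main__":
    print(dict(summarize(SAMPLE.splitlines())))
```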

The cache layout is pretty simple. Under /srv/angry-caching-proxy/cache, each cached download is stored as two files:

  • <sha1> : the actual file
  • <sha1>.json : angry cache metadata

An example for PyYAML:

{
    "sha1": "cc81d2dd5b06597dc278d0d0a2266a9b8817740a",
    "method": "GET",
    "url": "http://pypi.python.org/packages/source/P/PyYAML/PyYAML-3.11.tar.gz",
    "created": "2015-09-14T19:53:12.676Z",
    "responseHeaders": {
        "last-modified": "Thu, 14 May 2015 18:56:19 GMT",
        "etag": "\"f50e08ef0fe55178479d3a618efe21db\"",
        "content-type": "application/octet-stream",
        "server": "AmazonS3",
        "via": "1.1 varnish",
        "cache-control": "max-age=31557600, public",
        "content-length": "248685",
        "accept-ranges": "bytes",
        "date": "Mon, 14 Sep 2015 19:53:12 GMT",
        "age": "1648180",
        "connection": "keep-alive",
        "x-served-by": "cache-sea1922-SEA, cache-jfk1031-JFK",
        "x-cache": "HIT, HIT",
        "x-cache-hits": "1, 2977",
        "x-timer": "S1442260392.665612,VS0,VE0"
    },
    "requestHeaders": {
        "host": "pypi.python.org",
        "accept-encoding": "gzip, deflate",
        "accept": "*/*",
        "user-agent": "pip/1.5.6 CPython/2.7.6 Linux/3.13.0-29-generic"
    }
}
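
Since the metadata is plain JSON, the cache content can be listed with a few lines of Python. This is only a sketch: it assumes the fields visible in the PyYAML example ("sha1", "url", "created") are present in every metadata file, and that the payload sits next to it under the same name without the .json suffix. The function name list_cache_entries is made up for illustration.

```python
import json
from pathlib import Path


def list_cache_entries(cache_dir):
    """Yield (sha1, url, created, size_in_bytes) for each metadata file."""
    for meta_path in sorted(Path(cache_dir).glob("*.json")):
        with meta_path.open() as fh:
            meta = json.load(fh)
        # "<sha1>.json" -> "<sha1>", the file holding the actual download.
        payload = meta_path.with_suffix("")
        size = payload.stat().st_size if payload.exists() else None
        yield meta.get("sha1"), meta.get("url"), meta.get("created"), size
```

On the proxy instance, iterating over list_cache_entries("/srv/angry-caching-proxy/cache") would give one tuple per cached tarball.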

I haven't checked, but gem probably comes with an HTTPS index nowadays.

So the software works out of the box with no configuration needed, and it is straightforward to use.

If we could find a way to use HTTPS that would be even better.

Maybe we can do that with Varnish/Nginx so we get support from ops? The VMs would use the HTTP index and we would rewrite URLs to point to HTTPS instead.
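
To make the rewriting idea concrete, here is a sketch of just the URL rule in Python. It is not a Varnish or Nginx configuration and not something angry-caching-proxy provides; the host list only contains the two indexes seen in the traces of this task, and the function name to_https is made up for illustration.

```python
from urllib.parse import urlsplit

# Index hosts seen in this task's traces; anything else is left untouched.
REWRITE_HOSTS = {"registry.npmjs.org", "pypi.python.org"}


def to_https(url):
    """Return the HTTPS form of `url` for known index hosts, else `url` as is."""
    parts = urlsplit(url)
    if parts.scheme == "http" and parts.hostname in REWRITE_HOSTS:
        return parts._replace(scheme="https").geturl()
    return url
```

The VMs would keep requesting http:// URLs, and whatever sits in front of the upstream would apply a rule like this before fetching.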

hashar claimed this task.

Evaluation done for now. See my comments in the task.