I took a shot at using Squid as a man-in-the-middle proxy to cache HTTPS requests made by package managers. Following the CI weekly check-in on 2015-10-20, this needs to be written up.
Overview
The idea is to have package managers use a central proxy that handles both HTTP and HTTPS.
Normally, when an HTTPS request goes through a proxy, the client sends a CONNECT so that the proxy opens a connection to the server and merely relays bytes between the remote server and the client. The traffic is encrypted end to end, so the proxy cannot cache it.
With Squid 3.3-3.4, we can use a feature known as SSL Bump Server First http://wiki.squid-cache.org/Features/BumpSslServerFirst . The connection is terminated by Squid, which queries the remote server, generates a certificate on the fly signed with a local CA, and serves that back to the client.
Since Squid acts as a man in the middle, it sees the decrypted requests and responses and can cache them properly.
Teaser with curl
I gave it a try on an instance named pmcache.integration.eqiad.wmflabs. Example:
curl --verbose --cacert /etc/ssl/localcerts/integration.crt \
  --proxy https://pmcache.integration.eqiad.wmflabs:8081/ \
  https://www.wikipedia.org/
The client connects as usual with a CONNECT:
* Connected to pmcache.integration.eqiad.wmflabs (10.68.22.133) port 8081 (#0)
* Establish HTTP proxy tunnel to www.wikipedia.org:443
> CONNECT www.wikipedia.org:443 HTTP/1.1
> Host: www.wikipedia.org:443
> User-Agent: curl/7.38.0
> Proxy-Connection: Keep-Alive
>
< HTTP/1.1 200 Connection established
<
* Proxy replied OK to CONNECT request
It then switches to SSL, and curl shows the server certificate, which has been signed by our custom certificate:
* SSL connection using TLSv1.2 / AES256-GCM-SHA384
* Server certificate:
*   subject: C=US; ST=California; L=San Francisco; O=Wikimedia Foundation, Inc.; CN=*.wikipedia.org
*   start date: 2015-06-23 18:37:07 GMT
*   expire date: 2017-02-19 12:00:00 GMT
*   subjectAltName: www.wikipedia.org matched
vvvvvvvvvvvvv MAN IN THE MIDDLE vvvvvvvvvvvvvvvvvvvvvvvv
*   issuer: C=US; ST=California; L=San Francisco; O=Wikimedia Foundation Inc.; OU=Release Engineering; CN=pmcache.integration.eqiad.wmflabs
^^^^^^^^^^^^ MAN IN THE MIDDLE ^^^^^^^^^^^^^^^^^^^
*   SSL certificate verify ok.
* SSLv2, Unknown (23):
The rest proceeds as usual:
> GET / HTTP/1.1
> Host: www.wikipedia.org
* SSLv2, Unknown (23):
{ [data not shown]
< HTTP/1.1 200 OK
...
< X-Cache: HIT from pmcache
< X-Cache-Lookup: HIT from pmcache:8080
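To check from a script, rather than by eye, that a response was bumped and served from the cache, the verbose curl trace can be filtered. This is only a minimal sketch that assumes the trace format shown above; the function names issuer_cn and cache_status are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers: both read a `curl --verbose` trace on stdin.
# curl writes that trace to stderr, so capture it with something like:
#   curl --silent --verbose --output /dev/null ... 2>&1 | cache_status

# Print the CN of the certificate issuer (the proxy's FQDN when bumped).
issuer_cn() {
    sed -n 's/^\*[[:space:]]*issuer:.*CN=\([^;,]*\).*$/\1/p' | head -n 1
}

# Print HIT or MISS as reported by Squid's X-Cache response header.
cache_status() {
    sed -n 's/^< X-Cache: \([A-Z]*\) from .*$/\1/p' | head -n 1
}
```

If issuer_cn prints the proxy host name instead of a public CA, the connection went through the bump; cache_status then tells whether Squid answered from its cache.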
Dirty hands
The instance runs Debian Jessie. Due to a licensing issue, Squid cannot legally be linked against OpenSSL in Debian, hence the package ships without any SSL support. Squid 3.5 (not in Jessie) might support GnuTLS.
So I have rebuilt the Jessie package with support for OpenSSL:
apt-get install -y build-essential fakeroot libssl-dev openssl devscripts
apt-get source -y squid3
apt-get build-dep -y squid3
Edit debian/rules and add to DEB_CONFIGURE_EXTRA_FLAGS:
--enable-ssl \
--with-open-ssl="/etc/ssl/openssl.cnf"
Rebuild and install:
debuild -us -uc
dpkg -i squid3-common_3.4.8-6+deb8u1_all.deb \
  squid3_3.4.8-6+deb8u1_amd64.deb \
  squidclient_3.4.8-6+deb8u1_amd64.deb
SSL cert
Something like:
echo -e "US\nCalifornia\nSan Francisco\nWikimedia Foundation Inc.\nRelease Engineering\n`hostname --fqdn`\n\n" \
  | openssl req -x509 -nodes -days 3650 -newkey rsa:2048 -keyout integration.key -out integration.crt
I copied both the .key and the .crt to /etc/ssl/localcerts, but maybe they should be copied to /etc/ssl/certs/ to be looked up automatically.
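Piping answers into the interactive prompts is fragile, since it depends on the order of the questions. As a sketch of an alternative, openssl req accepts the subject on the command line via -subj. The helper names make_ca and show_cert below are hypothetical, and the subject fields simply mirror the answers used above.

```shell
#!/bin/sh
# Sketch: generate the self-signed certificate without answering prompts.
# make_ca is a hypothetical helper; $1 is the CN, $2 the output directory.
make_ca() {
    cn=$1
    dir=$2
    openssl req -x509 -nodes -days 3650 -newkey rsa:2048 \
        -subj "/C=US/ST=California/L=San Francisco/O=Wikimedia Foundation Inc./OU=Release Engineering/CN=$cn" \
        -keyout "$dir/integration.key" -out "$dir/integration.crt" 2>/dev/null
    # The private key should not be world readable.
    chmod 600 "$dir/integration.key"
}

# Print subject and expiry date of a certificate, to double check the result.
show_cert() {
    openssl x509 -noout -subject -enddate -in "$1"
}
```

For example: make_ca "$(hostname --fqdn)" /etc/ssl/localcerts followed by show_cert /etc/ssl/localcerts/integration.crt.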
Squid Configuration
Archived at P2211:
# SSL Bumping
# -----------
# Allow bumping of the connection. Establish a secure connection with the
# server first, then establish a secure connection with the client, using a
# mimicked server certificate. Works with both CONNECT requests and intercepted
# SSL connections.
#
# http://www.squid-cache.org/Versions/v3/3.4/cfgman/ssl_bump.html
ssl_bump server-first all

debug_options ALL,1

http_port 8080
http_port 8081 ssl-bump cert=/etc/ssl/localcerts/integration.crt key=/etc/ssl/localcerts/integration.key generate-host-certificates=on

always_direct allow all

# LOGGING
# -------
access_log /var/log/squid3/access.log squid
cache_log /var/log/squid3/cache.log
logfile_rotate 5
log_mime_hdrs on
strip_query_terms off # Log query terms

# CLIENT
# ------
request_header_max_size 8 KB
request_body_max_size 8 KB
reply_body_max_size 200 MB all
read_ahead_gap 1024 KB
quick_abort_min 0 KB
quick_abort_max 0 KB
quick_abort_pct 100

# MEMORY CACHE
# ------------
cache_mem 1 GB
maximum_object_size_in_memory 4096 KB
memory_replacement_policy heap GDSF # Greedy-Dual Size Frequency

# DISK CACHE
# ----------
# Least Frequently Used with Dynamic Aging
cache_replacement_policy heap LFUDA
maximum_object_size 32 MB
cache_dir aufs /srv/squid3/cache 10000 16 256

# 'Hide' ourself
via off
forwarded_for off
follow_x_forwarded_for deny all

# ACCESS LISTS
# ------------
acl SSL_ports port 443
acl Safe_ports port 21 80 443
acl method_purge method PURGE
acl method_connect method CONNECT

# RULES
# -----
http_access allow manager localhost
http_access deny manager
http_access allow method_purge localhost
http_access deny method_purge
http_access deny !Safe_ports
http_access deny method_connect !SSL_ports
#http_access deny !method_connect SSL_ports
http_access deny to_localhost
http_reply_access allow all
icp_access deny all
You end up with a regular proxy on port 8080 and one that does SSL bumping on port 8081.
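To double check which port does what after editing the configuration, the http_port lines can be listed with a few lines of awk. This is just a sketch: list_ports is a hypothetical helper, and it only looks at http_port directives of the form used above.

```shell
#!/bin/sh
# Hypothetical helper: list each http_port of a squid.conf and tell whether
# the ssl-bump option is set on it. $1 is the path to the configuration.
list_ports() {
    awk '$1 == "http_port" {
        mode = "plain"
        for (i = 3; i <= NF; i++)
            if ($i == "ssl-bump") mode = "ssl-bump"
        print $2, mode
    }' "$1"
}
```

Run against the configuration above, it should report 8080 as plain and 8081 as ssl-bump. Squid can also check the syntax of the whole file itself with squid3 -k parse.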
Usage with package managers
More or less in common:
- http_proxy=. (a bogus value) to avoid hitting plain HTTP
- https_proxy=https://`hostname --fqdn`:8081
  - Can probably just point to an http:// URL instead.
- Get them to trust our CA via the public certificate /etc/ssl/localcerts/integration.crt
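The common settings listed above could be gathered in one place. The sketch below is hedged: pmcache_env is a made-up name, the http:// scheme for https_proxy follows the remark above, and npm_config_cafile and PIP_CERT are assumed to behave like the --cafile and --cert command line options; none of this has been tested on the instance.

```shell
#!/bin/sh
# Sketch: print the environment for a shell that should go through the proxy.
# pmcache_env is a hypothetical name; $1 is the proxy host, $2 the CA file.
pmcache_env() {
    host=$1
    ca=${2:-/etc/ssl/localcerts/integration.crt}
    cat <<EOF
export http_proxy=.
export https_proxy=http://$host:8081
export npm_config_cafile=$ca
export PIP_CERT=$ca
export SSL_CERT_FILE=$ca
EOF
}
```

It would be used as eval "$(pmcache_env "$(hostname --fqdn)")" before running the package manager.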
To verify what hits the cache, the best way is to tail /var/log/squid3/access.log.
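Besides tailing the log, a quick count of hits and misses gives an overview after a test run. A minimal sketch, assuming each log line carries a result field of the form TCP_SOMETHING/status as in the excerpts in this write-up; cache_summary is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: count Squid result codes (TCP_MISS, TCP_MEM_HIT, ...)
# in access.log lines read from stdin or from the files given as arguments.
cache_summary() {
    awk '{
        # The result code is the first field that looks like TCP_xxx/NNN.
        for (i = 1; i <= NF; i++)
            if ($i ~ /^TCP_[A-Z_]+\/[0-9]+$/) {
                split($i, part, "/")
                count[part[1]]++
                break
            }
    }
    END { for (code in count) print count[code], code }' "$@" | sort -k2
}
```

For example: cache_summary /var/log/squid3/access.log prints one line per result code, prefixed with the number of requests.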
npm
rm -fR ~/.npm
http_proxy=. https_proxy=http://`hostname --fqdn`:8081 npm --verbose --cafile /etc/ssl/localcerts/integration.crt install grunt-cli
Squid:
TCP_MISS/200 22154 GET https://registry.npmjs.org/grunt-cli
TCP_MEM_HIT/200 5152 GET https://registry.npmjs.org/grunt-cli/-/grunt-cli-0.1.13.tgz
^^^ served from Squid memory
Gotcha: with npm, https_proxy needs an http:// URL.
pip
rm -fR ~/.cache/pip
https_proxy=https://`hostname --fqdn`:8081 pip install --cert /etc/ssl/localcerts/integration.crt --target . PyYAML
Squid:
TCP_MISS/200 1872 GET https://pypi.python.org/simple/pyyaml/
TCP_MEM_HIT/200 249337 GET https://pypi.python.org/packages/source/P/PyYAML/PyYAML-3.11.tar.gz
^^^ served from Squid memory
bundler
I tested it with mediawiki/selenium.git, but could not get it to work with a cert in localcerts, such as:
SSL_CERT_FILE=/etc/ssl/localcerts/integration.crt http_proxy=. https_proxy=https://`hostname --fqdn`:8081 bundle install --verbose --path vendor/bundle
Network error while fetching https://rubygems.org/quick/Marshal.4.8/rubocop-0.29.1.gemspec.rz
With curl:
http_proxy=. https_proxy=https://`hostname --fqdn`:8081 curl --verbose --cacert /etc/ssl/localcerts/integration.crt https://rubygems.org/quick/Marshal.4.8/rubocop-0.29.1.gemspec.rz
< Location: https://rubygems.global.ssl.fastly.net/quick/Marshal.4.8/rubocop-0.29.1.gemspec.rz
If we follow the redirect (curl -L), that shows a hit from the Squid cache.
I have tried various combinations, but I am hitting a wall here.
composer
Not covered.