
[scap] Compile HHVM bytecode cache as deployment step
Closed, Declined · Public

Description

HHVM uses a persistent SQLite database as a bytecode cache. This cache will be dynamically maintained by the HHVM server process, but it can also be pre-populated to bootstrap a newly deployed codebase [0]. Pre-compiling the cache during scap and distributing that cache to the HHVM worker nodes should provide a significant performance increase.

[0]: http://hhvm.com/blog/4061/go-faster
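For illustration only, a minimal sketch of what such a pre-compile step might look like as a Python helper in scap. The `hhvm --hphp -t hhbc` invocation, its flags, and all paths are assumptions based on the HipHop compiler interface of that era, not a confirmed recipe:

```
#!/usr/bin/env python
"""Hypothetical scap stage that pre-compiles the HHVM bytecode cache.

The hhvm --hphp invocation and the paths below are assumptions for
illustration; the real flags would have to be checked against the HHVM
version running on the MediaWiki app servers.
"""
import os
import subprocess


def precompile_hhbc(source_dir, output_dir, file_list):
    """Compile PHP sources under source_dir into an hhvm.hhbc SQLite repo."""
    # Assumption: --hphp puts hhvm into ahead-of-time compiler mode and
    # -t hhbc asks it to emit a bytecode repo rather than execute code.
    cmd = [
        'hhvm',
        '--hphp',
        '-t', 'hhbc',
        '--input-list', file_list,   # newline-separated list of .php files
        '--output-dir', output_dir,  # where hhvm.hhbc will be written
    ]
    subprocess.check_call(cmd, cwd=source_dir)
    return os.path.join(output_dir, 'hhvm.hhbc')


if __name__ == '__main__':
    repo = precompile_hhbc(
        '/srv/mediawiki-staging',               # staging copy on tin (assumed)
        '/srv/mediawiki-staging/hhbc',          # per-deploy output dir (assumed)
        '/srv/mediawiki-staging/php-files.list',
    )
    print('bytecode cache built at %s' % repo)
```

The resulting repo file would then still have to be shipped to the app servers alongside the code; the sync problem is discussed in the comments below.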


Version: wmf-deployment
Severity: enhancement

Details

Reference
bz64272

Event Timeline

bzimport raised the priority of this task to Medium. Nov 22 2014, 3:11 AM
bzimport added a project: Deployments.
bzimport set Reference to bz64272.
bzimport added a subscriber: Unknown Object (MLST).
bd808 added a subscriber: ori.

I did some work previously during the HHVM project to make a proof of concept for the compilation step. I think @ori has looked at this issue some as well.

As I recall there were a couple of issues that would need to be worked out in addition to the basic cache file generation:

  1. The deployment staging host (tin) would need to have HHVM available and use the exact same version of HHVM as the MW app servers in the cluster that would consume the hhbc cache. HHVM versions the storage schema in the hhbc data so that older and newer runtimes can point at the same cache file without clobbering each other. (A sketch of such a version guard follows this list.)
  2. The SQLite data files are binary blobs and as such will not rsync well. Scap could either invent yet another binary->text->binary processing step, as we did with the l10n cache files, or we could figure out a better way to sync blobs. I spent a tiny amount of time looking at pure-Python BitTorrent components as a possible solution to the "better blob transfer" approach. Herd and Horde both looked promising, but I didn't go all the way to a POC implementation with either.
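For the first point, here is a minimal sketch of the kind of version guard scap could run before shipping the cache. The helper names and the ssh-based host handling are invented for illustration; `hhvm --version` itself is a real command:

```
import subprocess


def hhvm_version(host=None):
    """Return the first line of `hhvm --version`, locally or over ssh."""
    cmd = ['hhvm', '--version']
    if host is not None:
        cmd = ['ssh', host] + cmd
    out = subprocess.check_output(cmd)
    return out.splitlines()[0].strip()


def check_hhvm_versions(app_servers):
    """Abort the deploy if any app server runs a different HHVM build."""
    expected = hhvm_version()  # version on the staging host (tin)
    mismatched = [h for h in app_servers if hhvm_version(h) != expected]
    if mismatched:
        raise RuntimeError(
            'HHVM version mismatch, refusing to ship hhbc cache: %s'
            % ', '.join(mismatched))
```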

I did some work previously during the HHVM project to make a proof of concept for the compilation step.

Prior POC code now published at https://github.com/bd808/bug-67168

@thcipriani and I did some testing of herd/horde, and it works fairly well. It's just a simple wrapper around bittornado, and it uses ssh to coordinate the whole swarm. It's definitely workable.

There is also this: https://github.com/TMG-nl/p2ptracker

The tracker uses the knowledge we have about our network topology to build two-tier swarms of bittorrent clients. It will return peers in a global swarm to the first two clients in a rack requesting tracker information. Any additional clients from the SAME rack requesting peer information from the tracker will only get peers in the same rack, thus building a second-tier swarm that spans a single rack.

This setup was chosen because the uplink bandwidth in a rack is a critical resource for us. If many clients in a rack start downloading pieces from other peers randomly distributed across our network, they may saturate the rack uplink, causing serious starvation issues and failing requests to production services.

By limiting the number of peers in a rack that participate in the global bittorrent swarm to 2 and capping each bittorrent client to ~100 Mbit/s, we can guarantee that bittorrent traffic uses at most about 20% of the rack uplink (2 peers × ~100 Mbit/s ≈ 200 Mbit/s on a 1 Gbit/s uplink).
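A rough sketch of that peer-selection rule, just to make the two-tier idea concrete. The data structures and the rack-lookup callable are invented for illustration and do not mirror p2ptracker's actual code:

```
import collections

# Invented bookkeeping: which peers in each rack were already handed the
# global swarm (p2ptracker's real implementation may differ).
GLOBAL_PEERS_PER_RACK = 2
_global_peers_by_rack = collections.defaultdict(set)


def select_peers(peer, rack_of, all_peers):
    """Return the peer list the tracker would announce to `peer`.

    The first GLOBAL_PEERS_PER_RACK peers per rack get the global swarm;
    everyone else in that rack only sees rack-local peers, so cross-rack
    traffic is limited to the two "gateway" clients.
    """
    rack = rack_of(peer)
    gateways = _global_peers_by_rack[rack]
    if peer in gateways or len(gateways) < GLOBAL_PEERS_PER_RACK:
        gateways.add(peer)
        return [p for p in all_peers if p != peer]            # global swarm
    return [p for p in all_peers
            if rack_of(p) == rack and p != peer]              # rack-local swarm
```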

I'm not convinced that this would be an overall win. Last week we had problems with a depleted bytecode cache; after pruning the cache I checked monitoring for effects of the fresh cache warmup, and it wasn't noticeable on Grafana's server board. Maybe it's more visible to real performance tracing, but I guess the cache warmup period is fairly brief. OTOH this adds a lot of complexity and potential new error sources.