Swift appears to be the main cause of sub-par thumbnail performance for images that fall out of varnish cache. In addition to this, we currently store 3 replicas of each Swift file, which is a considerable waste of storage.
Ideally a replacement for thumbnail storage should have these properties:
- Its number one quality should be the speed of pulling a file. Serving a file to varnish as fast as possible is all that matters.
- It doesn't matter if it loses content, since we can always re-generate thumbnails. No replication required.
Unlike what we've said before on that topic, I don't think that items should expire. The performance issue we see most often, and the one Swift is responsible for, concerns files that are infrequently accessed. If we applied an expiry to our thumbnail store, it would render it useless: by the time a thumbnail falls out of varnish, it would also be gone from thumbnail storage. The whole point of the thumbnail store should be to keep copies of infrequently accessed thumbnails that would be costly to regenerate (it could actually be smart and not store things that never fall out of varnish because they're accessed a ton).
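To make the intended behaviour concrete, here is a minimal sketch of the lookup path described above. Everything in it is hypothetical: `ThumbnailStore`, `serve_thumbnail`, `hot_threshold` and `recent_requests` are names made up for illustration, and a plain dict stands in for whatever backend would really be used. It only shows the policy: no expiry, no replication, and optionally skipping thumbnails that are hot enough to stay in varnish anyway.

```python
from typing import Callable, Dict, Optional


class ThumbnailStore:
    """Hypothetical single-copy, non-expiring thumbnail store (in-memory stand-in)."""

    def __init__(self, hot_threshold: int = 1000) -> None:
        self._blobs: Dict[str, bytes] = {}
        # Assumed cutoff: thumbnails requested at least this often are expected
        # to stay in varnish, so storing them here would be wasted space.
        self.hot_threshold = hot_threshold

    def get(self, key: str) -> Optional[bytes]:
        # A miss is normal: content may be lost at any time and is regenerated.
        return self._blobs.get(key)

    def put(self, key: str, data: bytes, recent_requests: int) -> bool:
        if recent_requests >= self.hot_threshold:
            return False  # hot item, leave it to varnish
        self._blobs[key] = data  # stored without any expiry
        return True


def serve_thumbnail(
    key: str,
    store: ThumbnailStore,
    render: Callable[[str], bytes],
    recent_requests: int = 0,
) -> bytes:
    """Return the thumbnail for `key`, regenerating and storing it on a miss."""
    cached = store.get(key)
    if cached is not None:
        return cached
    data = render(key)  # costly path: pull the original and scale it
    store.put(key, data, recent_requests)
    return data
```

How the store would learn the request rate of a thumbnail is left open here; the `recent_requests` argument is just a placeholder for that signal.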
Another naive view could be that we might as well always render thumbnails on the fly and have varnish be our only caching layer. We could do that, but the original still has to be pulled from Swift (we need replication to avoid losing originals), which can be pretty slow for large originals. In that situation too, it seems like more time is usually spent pulling the original from Swift and shipping it across the network than actually doing the image processing to generate the thumbnail (this would have to be verified, but I recall surprisingly poor performance for pulling large files from Swift).
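One way to verify that claim would be to time the two stages separately. The sketch below is only an assumption about how such a measurement could look: `time_thumbnail_stages` is a made-up helper, and the two callables passed in are placeholders for the real Swift fetch and the real scaling step, neither of which is shown here.

```python
import time
from typing import Callable, Dict


def time_thumbnail_stages(
    fetch_original: Callable[[], bytes],
    render_thumbnail: Callable[[bytes], bytes],
) -> Dict[str, float]:
    """Time the fetch of the original and the thumbnail rendering separately."""
    start = time.perf_counter()
    original = fetch_original()  # placeholder for pulling the original from Swift
    fetched = time.perf_counter()
    thumbnail = render_thumbnail(original)  # placeholder for the image processing
    rendered = time.perf_counter()
    return {
        "fetch_seconds": fetched - start,
        "render_seconds": rendered - fetched,
        "original_bytes": float(len(original)),
        "thumbnail_bytes": float(len(thumbnail)),
    }
```

Running this over a sample of large originals and comparing `fetch_seconds` with `render_seconds` would show whether the transfer really dominates.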