Our thumbnail generation currently prioritizes speed, so it doesn't achieve the best possible compression ratios, particularly for PNG.
Tools like pngcrush try many different ways of compressing a PNG and keep the smallest result. We can't afford to run such a tool during the original thumbnail generation, but a separate background service could recompress thumbnails after the fact.
Ideally such a service would focus on the most heavily trafficked thumbnails. It could work as follows:
- Listen to the Varnish logs to identify PNGs with high hit counts. Ideally there should be a way to tell that one has already been optimized (header on the cached object?)
- When one is found, make sure it isn't processed more than once (in-process memory? memcache?)
- Add it to a queue (Kafka?) to smooth out the load, since this is a low-priority service that shouldn't cause spikes
- Consume requests from the queue and map each thumbnail URL to its Swift object
- Download object from Swift
- Run pngcrush on it
- Save result in Swift, with extra header showing that it's been optimized
- Purge thumbnail in Varnish (only there, not in Swift)
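
The steps above can be sketched roughly as follows. This is a minimal illustration, not a proposed implementation: the header name `X-Object-Meta-Png-Optimized`, the hit threshold, and the helper names (`HotPngDetector`, `optimize_one`, `drain`, `DictStorage`) are all assumptions made for the example. The Varnish log reader, the queue backend, the URL-to-Swift mapping, the Swift client and the Varnish purge are left as injected callables, with an in-memory queue and a dict standing in for Kafka and Swift. Only the `pngcrush -brute infile outfile` invocation reflects an existing tool.

```python
import subprocess
import tempfile
from collections import Counter
from pathlib import Path
from queue import Queue

# Hypothetical marker header; the real name would need to be agreed on.
OPTIMIZED_HEADER = "X-Object-Meta-Png-Optimized"


class HotPngDetector:
    """Counts hits per URL and reports each hot, unoptimized PNG exactly once."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.hits = Counter()
        self.seen = set()  # in-process dedup; memcache could replace this

    def observe(self, url, headers):
        """Return True the first time an unoptimized PNG reaches the threshold."""
        if not url.lower().endswith(".png"):
            return False
        if OPTIMIZED_HEADER in headers or url in self.seen:
            return False
        self.hits[url] += 1
        if self.hits[url] < self.threshold:
            return False
        self.seen.add(url)
        del self.hits[url]  # no need to keep counting once queued
        return True


def crush_png(data):
    """Run pngcrush on PNG bytes and return whichever version is smaller."""
    with tempfile.TemporaryDirectory() as tmp:
        src, dst = Path(tmp, "in.png"), Path(tmp, "out.png")
        src.write_bytes(data)
        subprocess.run(
            ["pngcrush", "-brute", str(src), str(dst)],
            check=True,
            capture_output=True,
        )
        crushed = dst.read_bytes()
    return crushed if len(crushed) < len(data) else data


class DictStorage:
    """In-memory stand-in for Swift: object name -> (bytes, metadata)."""

    def __init__(self):
        self.objects = {}

    def get(self, name):
        return self.objects[name]

    def put(self, name, data, meta):
        self.objects[name] = (data, meta)


def optimize_one(url, to_object_name, storage, crush, purge):
    """Download, crush, re-upload with the marker header, then purge the cache."""
    name = to_object_name(url)
    data, meta = storage.get(name)
    if OPTIMIZED_HEADER in meta:
        return False  # already optimized, nothing to do
    storage.put(name, crush(data), {**meta, OPTIMIZED_HEADER: "1"})
    purge(url)  # purge the cache only; the stored object stays in place
    return True


def drain(queue, **deps):
    """Process everything currently queued; return how many objects changed."""
    done = 0
    while not queue.empty():
        done += optimize_one(queue.get(), **deps)
    return done
```

In use, a log reader would call `observe()` for every cache hit and put the URL on the queue whenever it returns `True`. A separate low-priority worker would then call `drain()` with `crush=crush_png` and real Swift and Varnish clients in place of the stand-ins.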