Proton fetches article HTML from RESTBase and converts it to PDF. Unlike our typical APIs, a single request is fairly resource-intensive: well-developed articles take tens of seconds to render, and abnormally huge pages will just time out and abort. Even a small spike in requests for the same page (e.g. some URL gets shared on social media) will be pretty taxing. We should make sure we understand and document how Varnish behaves during such a spike (do requests get cached? do they get coalesced?), and fine-tune that behavior if needed.
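To make "coalesced" concrete, here is a minimal Python sketch of the idea: concurrent requests for the same key wait on a single in-flight render instead of each starting their own. This is only an illustration and not a description of how Varnish implements or configures it; `SingleFlight`, `render_pdf` and the sample title are hypothetical names invented for the example, and the sleep stands in for a slow Proton render.

```python
import threading
import time


class _Call:
    """State shared between the leader and the waiters for one key."""

    def __init__(self):
        self.done = threading.Event()
        self.result = None
        self.error = None


class SingleFlight:
    """Coalesce concurrent calls for the same key into one underlying call."""

    def __init__(self):
        self._lock = threading.Lock()
        self._in_flight = {}  # key -> _Call

    def do(self, key, fn):
        with self._lock:
            call = self._in_flight.get(key)
            leader = call is None
            if leader:
                call = _Call()
                self._in_flight[key] = call
        if leader:
            # Only the first caller for this key performs the expensive work.
            try:
                call.result = fn()
            except Exception as exc:
                call.error = exc
            finally:
                with self._lock:
                    del self._in_flight[key]
                call.done.set()
        else:
            # Everyone else waits for the leader's result.
            call.done.wait()
        if call.error is not None:
            raise call.error
        return call.result


def demo(clients=5):
    """Simulate a small spike: several clients ask for the same PDF at once."""
    flight = SingleFlight()
    render_calls = []
    results = []
    barrier = threading.Barrier(clients)

    def render_pdf():
        # Hypothetical stand-in for a slow render.
        render_calls.append(1)
        time.sleep(0.5)
        return b"%PDF-placeholder"

    def client():
        barrier.wait()  # release all clients at the same moment
        results.append(flight.do("Some_article", render_pdf))

    threads = [threading.Thread(target=client) for _ in range(clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(render_calls), results


if __name__ == "__main__":
    calls, results = demo()
    print(f"{len(results)} clients served by {calls} render(s)")
```

Without coalescing, each of those clients would trigger its own multi-second render; the question for this task is whether the caching layer gives us the equivalent behavior for Proton URLs.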
In particular, do we want PDF responses (which will be several megabytes) to be cached? Normally, requests go through 2-3 layers of Varnish and get cached in all layers. We don't expect much traffic for any single URL, and latency in the single-second range or below does not really matter, so caching in every layer would be a waste of space; maybe we should only cache them in backend Varnish (disk is cheaper than memory), or not at all.
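As a rough way to compare the options, the space cost can be estimated as (distinct PDFs held in cache) x (average PDF size) x (number of layers that store a copy). The Python sketch below just does that arithmetic. All inputs are placeholders to be replaced with measured values, the function name is hypothetical, and it ignores eviction and any replication across cache hosts.

```python
def cache_footprint_mb(distinct_pdfs, avg_pdf_mb, layers_caching):
    """Approximate space used if each caching layer keeps one copy per PDF.

    All arguments are assumptions supplied by the caller, not measurements.
    """
    return distinct_pdfs * avg_pdf_mb * layers_caching


if __name__ == "__main__":
    # Placeholder inputs, purely for illustration.
    pdfs, size_mb = 1000, 3.0
    for label, layers in [("all layers", 3), ("backend only", 1), ("no caching", 0)]:
        print(f"{label}: {cache_footprint_mb(pdfs, size_mb, layers)} MB")
```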