As a user of Superset, I want faster dashboard rendering and fewer timeouts so that I can view reports quickly.
The identified solution is to use Presto's built-in Alluxio SDK as a separate, local cache for HDFS files on each Presto worker node.
An earlier iteration of this plan was attempted in 2021, when we intended to use a distributed Alluxio cache service. That attempt failed because we were unable to connect Alluxio to a Kerberised Hive metastore.
This version of the plan differs from the previous attempt in that Alluxio is only ever used locally on each Presto worker node, via a jar file shipped with Presto itself.
The caches are unaware of each other, and the only client of each cache is the Presto server running on the same machine.
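As a rough illustration of what this looks like in practice, the local cache is switched on through the Hive catalog properties on each worker. The fragment below is a minimal sketch based on the Alluxio SDK cache properties documented for PrestoDB; the file path, cache size, and cache directory are placeholder assumptions, not values taken from our deployment, and the property names should be checked against the Presto version we run.

```properties
# etc/catalog/hive.properties on each Presto worker (path is an assumption)

# Prefer sending splits for the same file to the same worker,
# so that each node-local cache gets reused.
hive.node-selection-strategy=SOFT_AFFINITY

# Enable the embedded Alluxio SDK cache inside the Presto worker process.
cache.enabled=true
cache.type=ALLUXIO

# Placeholder values: size and directory depend on the local disk available.
cache.alluxio.max-cache-size=500GB
cache.base-directory=/tmp/alluxio-cache
```

Because the cache runs inside the Presto worker process, no separate Alluxio service is deployed in this sketch, which is how the plan avoids the connection between Alluxio and the Kerberised Hive metastore that blocked the 2021 attempt.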