Tool Name: qrank
Type of quota increase requested: CPU
Reason: QRank processes pageviews and Wikimedia dumps to compute a ranking of Wikidata entities. For a quick intro, see the README; for details and background, see the Technical Design Document. The build pipeline is written in a compiled language (Go) and has been optimized for multi-core machines. On Digital Ocean, the build pipeline finishes within a few hours. On Toolforge/Kubernetes, however, the same task currently takes almost three days. This is partly due to NFS throttling, but according to my profiling, CPU currently seems to be a bigger bottleneck than read throughput.
Amount of quota requested: 8 CPUs would be ideal; only one pod will be needed. If that’s too much, the tool can also live with fewer resources; it will adapt to whatever is available. On Digital Ocean, I’ve used 2 GiB of RAM per CPU core, but the system can also work with less memory if necessary. (The system relies heavily on external sorting, making use of temporary files when running out of RAM.) In the worst case, I can also live with the current default quota, but then the freshness of the rankings will suffer.