Creating this ticket as a parent for several tasks that share the goal of improving the developer experience on the stat hosts:
T373337 Create dashboards for stat servers
T373046 Create alerts for high resource utilization on the stat servers
T372941 Review I/O setup on stat1008
T372416 Implement cgroups for users' JupyterHub environments in order to mitigate resource contention on the stat servers
To be clear, this is scoped to SRE work: alerting, metrics, dashboards, performance, etc.