In the parent task we discussed high-load alerts for the Toolforge k8s cluster. One of these is the percentage of memory requests vs. total allocatable memory: going over 100% means new pods can't be scheduled.
I audited tools' memory usage relative to their requests, per k8s namespace, here: https://w.wiki/Jzk9 with this query:
(
  label_replace(
    sum by(container_label_io_kubernetes_pod_namespace) (
      container_memory_working_set_bytes{container_label_io_cri_containerd_kind!="sandbox"}
    ),
    "namespace", "$1", "container_label_io_kubernetes_pod_namespace", "(.*)"
  )
)
/ on(namespace)
sum by(namespace) (
  kube_pod_container_resource_requests{resource="memory"}
)
* 100

Today's results are at P89886. I then did a cumulative frequency distribution graph for the data:
Of immediate note: 50% of namespaces/tools use less than 20% of their memory requests, i.e. we could reduce their requests by 4-5x.
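For anyone wanting to reproduce the cumulative distribution from the query results, here is a minimal sketch. It assumes the per-namespace usage percentages have already been exported as a plain list of numbers; the function names and the sample values are made up for illustration and are not the P89886 data.

```python
from bisect import bisect_left


def fraction_below(values, threshold):
    """Return the fraction of values strictly below threshold."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # bisect_left gives the count of elements strictly less than threshold.
    return bisect_left(ordered, threshold) / len(ordered)


def cumulative_distribution(values, thresholds):
    """Return (threshold, fraction below threshold) pairs."""
    return [(t, fraction_below(values, t)) for t in thresholds]


# Made-up usage percentages (usage / requests * 100), one per namespace.
sample = [5.0, 12.0, 18.0, 35.0, 60.0, 110.0]

for threshold, fraction in cumulative_distribution(sample, [20, 50, 100]):
    print(f"< {threshold}%: {fraction:.0%} of namespaces")
```

Reading the 50% point off such a distribution is what gives the "half the namespaces use less than 20%" observation.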
Overview
The audit of which tools are running with default values (P89954) suggests that most webservice tools do not override the defaults. I have decided to focus on those first as the lowest-hanging fruit. The plan is to go from 256MB requests to 128MB first, then assess the situation both in overall cluster behaviour and for individual tools. Assuming all goes well, we'll move to requesting 64MB by default, while leaving limits untouched.
For reference: over the last 7d (Apr 17-24), about 2200 tool containers never went above a 64MB working set size (container_memory_working_set_bytes), while about 1300 went over 64MB, about 700 over 128MB and about 340 over 256MB.
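The threshold counts above can be recomputed from per-container peaks with something like the sketch below. It assumes the 7-day maximum of container_memory_working_set_bytes has already been fetched into a mapping of container name to bytes, and that "MB" is read as MiB; the container names and values are hypothetical.

```python
MIB = 1024 * 1024  # assumption: "MB" in the text is treated as MiB


def count_over(peaks_bytes, thresholds_mib):
    """Map each threshold (in MiB) to the number of containers whose peak exceeded it."""
    return {
        t: sum(1 for peak in peaks_bytes.values() if peak > t * MIB)
        for t in thresholds_mib
    }


# Hypothetical peak working set sizes over the observation window.
peaks = {
    "tool-a/webservice": 40 * MIB,
    "tool-b/webservice": 90 * MIB,
    "tool-c/webservice": 200 * MIB,
    "tool-d/webservice": 300 * MIB,
}

counts = count_over(peaks, [64, 128, 256])
never_above_64 = len(peaks) - counts[64]
print(counts, never_above_64)
```

The counts are cumulative: a container that went over 256 is also counted as over 128 and over 64, which matches how the figures above are stated.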
Deployment
I was thinking of the following deployment plan:
- New package is available
- Install in toolsbeta, launch a webservice and verify memory requests
- Deploy to tools
- Restart a sample/chosen webservice with defaults, verify it comes back with adjusted memory request
- Roll-restart webservices using default requests listed in P89954 (i.e. at the bottom), using the script in P91435 with --cutoff-timestamp set to the start of this work
- Monitor jobs for restarts/OOM kills (kube_pod_container_status_last_terminated_reason{reason="OOMKilled"}) and kubelet pod evictions (increase(kubelet_evictions[5m])) via this dashboard: https://grafana.wmcloud.org/d/fir9gwd/filippo-global-tools-stats
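For the "verify memory requests" steps, a small helper can compare the request string reported by the cluster against the expected default, regardless of which unit suffix is used. This is only a sketch: it handles the common Kubernetes memory suffixes (Ki/Mi/Gi/Ti and k/M/G/T) and plain byte counts, not the full quantity grammar, and the expected value "128Mi" is an assumption about what the new package will set.

```python
import re

# Binary (power-of-two) and decimal suffixes used by Kubernetes memory quantities.
_MULTIPLIERS = {
    "Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4,
    "k": 1000, "M": 1000**2, "G": 1000**3, "T": 1000**4,
}
_QUANTITY = re.compile(r"^(\d+(?:\.\d+)?)(Ki|Mi|Gi|Ti|k|M|G|T)?$")


def memory_quantity_to_bytes(quantity):
    """Convert a memory quantity such as "128Mi" to a number of bytes."""
    match = _QUANTITY.match(quantity.strip())
    if match is None:
        raise ValueError(f"unsupported memory quantity: {quantity!r}")
    number, suffix = match.groups()
    # A missing suffix means the value is already in bytes.
    return int(float(number) * _MULTIPLIERS.get(suffix, 1))


def request_matches(actual, expected):
    """True if two quantity strings denote the same number of bytes."""
    return memory_quantity_to_bytes(actual) == memory_quantity_to_bytes(expected)


# Assumed new default; the real value depends on the deployed package.
EXPECTED_REQUEST = "128Mi"
print(request_matches("131072Ki", EXPECTED_REQUEST))
```

Comparing in bytes avoids false alarms when the API reports the same request in a different unit.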



