The following warning alert is overdue (it has been firing for more than 7 days):
Kubelet exec_sync operations on ml-serve1001.eqiad.wmnet take 1.133s in p99:
https://alerts.wikimedia.org/?q=alertname%3DKubeletOperationalLatency&q=team%3Dsre&q=%40receiver%3Ddefault
Could you please fix the underlying cause or adjust the alert? Please also tag the alert with your team name if not already done.