While working on T277741 I noticed that linkrecommendation pods where switching a lot from fully ready to partially ready (2/3 containers only ready). Looking at events and pods I see the following
$kubectl describe pod/linkrecommendation-production-54968bf99-r5sqp
Warning Unhealthy 83s (x120 over 30m) kubelet, kubernetes1008.eqiad.wmnet Readiness probe failed: Get http://10.64.68.207:8000/healthz: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
So for some reason /healthz failes every now and then. Still not clear why, filing this task to investigate more later on.