On 2019-07-24 fsero reported timeouts to us from the codfw pods.
They all appeared to be triggered by the automatic healthcheck requests.
We hoped that deploying new images to codfw would fix the problem. It did not, and we saw intermittent timeouts from all 4 codfw pods in the hour after deployment.
The content of the errors wasn't very helpful, because the request key of the AxiosError is empty when the request fails at the network level (T228885).
Now that we have more detailed logging we can see the failed requests. The majority are still from codfw, and all of those are to the SpecialEntityData endpoint.
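
As an illustration of the kind of logging that helps here, the sketch below builds a log entry from the error's config key rather than its request key, since the request key was the one that came back empty. It is not the code we deployed: describeFailedRequest, RequestErrorLike and FailedRequestLog are hypothetical names, the interface is a local stand-in for the fields an AxiosError carries (so the sketch runs without axios), and the URL and values in the example are placeholders. It also assumes the config key is still populated when the request fails at the network level.

```typescript
// Local stand-in for the subset of AxiosError fields used below (assumption:
// `config` is still populated when `request` is empty).
interface RequestErrorLike {
  message: string;
  code?: string;
  config?: { method?: string; baseURL?: string; url?: string; timeout?: number };
  request?: unknown;
  response?: { status: number };
}

// Hypothetical shape of one log entry.
interface FailedRequestLog {
  message: string;
  code: string | null;
  method: string | null;
  url: string | null;
  timeoutMs: number | null;
  status: number | null;
  networkLevel: boolean;
}

// Hypothetical helper: describe a failed request using `config`, not `request`.
function describeFailedRequest(err: RequestErrorLike): FailedRequestLog {
  const cfg = err.config ?? {};
  const url = cfg.url === undefined ? null : (cfg.baseURL ?? "") + cfg.url;
  return {
    message: err.message,
    code: err.code ?? null,
    method: cfg.method ? cfg.method.toUpperCase() : null,
    url,
    timeoutMs: cfg.timeout ?? null,
    status: err.response ? err.response.status : null,
    // No response object means the failure happened before any HTTP status arrived.
    networkLevel: err.response === undefined,
  };
}

// Example with made-up values; the host and path are placeholders.
const entry = describeFailedRequest({
  message: "timeout of 3000ms exceeded",
  code: "ECONNABORTED",
  config: { method: "get", baseURL: "https://example.org", url: "/entity", timeout: 3000 },
});
console.log(JSON.stringify(entry));
```

Logging the method, full URL and timeout from the config means a network-level failure still tells us which endpoint was being called, which is what lets us attribute the failures to a single endpoint.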