We have observed a low rate of memcached errors in production, e.g. 61 errors over 3 hours (roughly 20 per hour), which may or may not be worth investigating. While the rate is *very* low, we could dig a little further in case there is an underlying problem, or some mcrouter behaviour we should be aware of or mitigate.
Zooming in a bit, we get:
6 Aug 2024 - 08:33-08:48
Rack: B5
Host: parse2006
Pod: mcrouter-main-lcqjh
Container: mcrouter-main
Notes: Resource-wise the pod looks fine, but the host itself is experiencing TCP retransmits. https://grafana.wikimedia.org/goto/iDpzTPrIg?orgId=1
I will keep adding data here and check whether a pattern emerges.
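
To cross-check the TCP retransmits noted above directly on a host, a rough sketch along these lines could help. It assumes a Linux host exposing the standard `OutSegs` and `RetransSegs` counters in `/proc/net/snmp`. The function names and the 10-second sampling interval are illustrative choices, not part of any existing tooling.

```python
import time


def parse_tcp_counters(snmp_text):
    """Return the TCP counters from the contents of /proc/net/snmp as a dict."""
    # The file holds pairs of lines per protocol: a header line with the
    # counter names, then a line with the values, both prefixed with "Tcp:".
    tcp_lines = [line.split() for line in snmp_text.splitlines()
                 if line.startswith("Tcp:")]
    if len(tcp_lines) < 2:
        raise ValueError("no Tcp: header/value pair found")
    names, values = tcp_lines[0][1:], tcp_lines[1][1:]
    return {name: int(value) for name, value in zip(names, values)}


def retransmit_ratio(before, after):
    """Fraction of segments sent between two samples that were retransmits."""
    sent = after["OutSegs"] - before["OutSegs"]
    retrans = after["RetransSegs"] - before["RetransSegs"]
    return retrans / sent if sent > 0 else 0.0


def read_counters(path="/proc/net/snmp"):
    with open(path) as f:
        return parse_tcp_counters(f.read())


if __name__ == "__main__":
    interval = 10  # seconds; arbitrary sampling window (assumption)
    first = read_counters()
    time.sleep(interval)
    second = read_counters()
    print(f"retransmit ratio over {interval}s: "
          f"{retransmit_ratio(first, second):.4%}")
```

Running this on the affected host and on a healthy host in the same rack during an error window would give a simple comparison point alongside the Grafana dashboard.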



