MediaWiki communicates with the Memcached backends via a proxy (Mcrouter) that runs locally on each app server; the proxies do not coordinate with one another. Mcrouter sometimes interprets spurious errors in Memcached responses as a sign that the backend server (or its network connection) is unhealthy. Sometimes that assumption is correct. Sometimes it's not.
In any event, when this happens the individual backend is effectively depooled (in "TKO" mode) for a short time for requests from that particular MW server.
A single logical Memcached key from MW translates to multiple real keys (the secondary "sister" keys store metadata, interim values, etc.), and these keys can route to different shards. As a result, when a single server is down, a much larger proportion of logical gets is effectively nulled out, because a read is affected whenever any of its keys lives on that server.
To fix that, we'll change it so that the value key and sister keys route to the same Memcached shard.
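
As a rough illustration of why co-locating the keys helps, here is a minimal sketch. It is not the actual MediaWiki or Mcrouter code. The shard names, the sister-key suffixes, the "|#|" delimiter, and the modulo hashing are all assumptions made for the example. It compares hashing the full real key against hashing only the shared base key.

```python
import hashlib

# Hypothetical pool of Memcached shards (names are illustrative only).
SHARDS = [f"mc{i}" for i in range(8)]

# Hypothetical sister-key suffixes: one value key plus two metadata keys.
SISTER_TYPES = ("v", "t", "m")

# Assumed delimiter between the shared base key and the sister suffix.
DELIM = "|#|"


def shard_for(routing_key: str) -> str:
    """Pick a shard from a stable hash of the routing key (simple modulo)."""
    digest = hashlib.md5(routing_key.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]


def real_keys(base_key: str) -> list[str]:
    """Expand one logical key into its value key and sister keys."""
    return [f"{base_key}{DELIM}{t}" for t in SISTER_TYPES]


def route_full_key(real_key: str) -> str:
    """Current behaviour in this sketch: every real key hashes on its own."""
    return shard_for(real_key)


def route_base_key(real_key: str) -> str:
    """Proposed behaviour in this sketch: hash only the part before DELIM."""
    base, _, _ = real_key.partition(DELIM)
    return shard_for(base)


def affected_fraction(route, down_shard: str, n_logical_keys: int = 1000) -> float:
    """Fraction of logical keys with at least one real key on the down shard."""
    hit = 0
    for i in range(n_logical_keys):
        if any(route(k) == down_shard for k in real_keys(f"page:{i}")):
            hit += 1
    return hit / n_logical_keys


if __name__ == "__main__":
    down = SHARDS[0]
    print("full-key routing:", affected_fraction(route_full_key, down))
    print("base-key routing:", affected_fraction(route_base_key, down))
```

With base-key routing, a logical key is affected only when its single shard is the one that is down, so the affected share stays close to one shard's share of the pool. With full-key routing, each extra sister key is another chance of landing on the unavailable shard.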