During an incident on 2020-06-08, we observed the gutter pool taking over for a memcached server under stress. However, we also saw traffic flapping between that server and the gutter pool server whenever the server recovered and was then hit again. The flapping disrupts the service, and we should investigate how to minimize it.
In the 2020-06-08 incident, Giuseppe stopped the flapping by firewalling the affected memcached server, mc1029, until the incident was over.
Memcache performance: https://grafana.wikimedia.org/d/000000316/memcache?orgId=1&from=1591588800000&to=1591592399000
Mcrouter: https://grafana.wikimedia.org/d/000000549/mcrouter?orgId=1&from=1591588800000&to=1591592399000