As a follow-up to T322360, it was discovered that the etcd/conf* hosts were under very high load, with configuration refreshes being requested more often than they should be.
This was only discovered eventually, because the hosts ran out of disk space due to access log spam. Ideally we would have caught this earlier by checking for abnormal throughput, e.g. a request rate surpassing a limit, either globally or per host/service, and alerting on it, so that if it happens again it is detected (almost) immediately.
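To make the proposal concrete, here is a minimal sketch of such a check in Python. It is only an illustration of the idea, not a proposed implementation: the function name `check_throughput`, the `Alert` type, the limits, and the host/service names are all hypothetical, and in practice this would more likely be expressed as an alerting rule in the existing monitoring stack.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple, Tuple


class Alert(NamedTuple):
    scope: str   # "global", "host" or "service"
    name: str    # host or service name, "*" for the global scope
    rate: float  # observed requests per second


def check_throughput(
    events: Iterable[Tuple[str, str]],
    window_seconds: float,
    global_limit: float,
    host_limit: float,
    service_limit: float,
) -> List[Alert]:
    """Return an alert for every scope whose request rate exceeds its limit.

    `events` is an iterable of (host, service) pairs, one per request seen
    during the observation window. All limits are in requests per second
    and are hypothetical values chosen by the operator.
    """
    events = list(events)
    alerts = []

    # Global check: all requests in the window, regardless of origin.
    global_rate = len(events) / window_seconds
    if global_rate > global_limit:
        alerts.append(Alert("global", "*", global_rate))

    # Per-host and per-service checks use the same logic on different keys.
    host_counts = Counter(host for host, _service in events)
    service_counts = Counter(service for _host, service in events)
    for scope, counts, limit in (
        ("host", host_counts, host_limit),
        ("service", service_counts, service_limit),
    ):
        for name, count in sorted(counts.items()):
            rate = count / window_seconds
            if rate > limit:
                alerts.append(Alert(scope, name, rate))

    return alerts


# Hypothetical example: one noisy host and one quiet host over 10 seconds.
sample = [("host-a", "svc-x")] * 50 + [("host-b", "svc-x")] * 5
for alert in check_throughput(sample, 10, global_limit=10,
                              host_limit=2, service_limit=10):
    print(alert)
```

With these made-up numbers only `host-a` crosses its per-host limit, while the global and per-service rates stay below theirs. This is the case where a per-host limit catches a problem that a global limit alone would miss.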