labsdb1001 is at 25K IOPS on non-SSD disks, with 100% disk utilization: https://grafana.wikimedia.org/dashboard/file/server-board.json?refresh=1m&orgId=1&from=1501923755489&to=1502357983184&var-server=labsdb1001&var-network=eth0
I recommend switching some preferred shards to labsdb1003 (slightly less loaded: https://grafana.wikimedia.org/dashboard/file/server-board.json?refresh=1m&orgId=1&from=1501923755489&to=1502357983184&var-server=labsdb1003&var-network=eth0 ) or starting to use labsdb1009/10/11. Not even replication can keep up with the load.