
Open the ceph throttle a bit for tools-k8s-etcd server
Closed, ResolvedPublic

Description

tools-k8s-etcd servers seem to ride at pretty high iowait (see https://grafana-labs.wikimedia.org/d/7fTGpvsWz/toolforge-vm-details?var-VM=tools-k8s-etcd-4 for an example). That graph smooths out some spikes, but it shows that iowait never drops close to zero the way it should.

These servers are also showing poor performance in general, and they are core to the functioning of Toolforge, so this could be related to the service-backend failures we see now and then. Please increase the IOPS available to tools-k8s-etcd-4 through -6.

Event Timeline

Bstorm triaged this task as High priority. Nov 2 2020, 11:06 PM
Bstorm created this task.

Those VMs are now using the flavor 'g2.cores1.ram2.disk20.4xiops', which carries these extra specs:

quota:disk_read_iops_sec='20000', quota:disk_total_bytes_sec='800000000', quota:disk_write_iops_sec='2000'
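For reference, a hedged sketch of how per-flavor I/O limits like these are typically applied with the OpenStack CLI (flavor name and quota values are the ones quoted above; this assumes admin credentials, and is not necessarily the exact commands run here). Note that running VMs generally pick up new flavor limits only after a resize, migration, or hard reboot:

```shell
# Sketch: set virtio disk I/O throttles as flavor extra specs (values from this task)
openstack flavor set g2.cores1.ram2.disk20.4xiops \
  --property quota:disk_read_iops_sec=20000 \
  --property quota:disk_write_iops_sec=2000 \
  --property quota:disk_total_bytes_sec=800000000

# Inspect the resulting extra specs
openstack flavor show g2.cores1.ram2.disk20.4xiops -c properties
```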
Andrew added a subscriber: Andrew.

iowait has improved, but there's still quite a bit.


We may need to lift the limits further. However, I recall seeing non-zero iowait even when these VMs were completely unthrottled. For now, Kubernetes API responses are once again under a second, so I'll take it.
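To double-check the dashboard numbers on the host itself, iowait can be computed directly from two samples of the "cpu" line in /proc/stat (field 5 after the label is iowait jiffies). A minimal sketch; `iowait_pct` is a hypothetical helper, not an existing tool:

```shell
# Compute iowait% between two /proc/stat "cpu" line samples.
# Usage: iowait_pct "<cpu line at t0>" "<cpu line at t1>"
iowait_pct() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    split(a, s1); n = split(b, s2)
    total = 0
    for (i = 2; i <= n; i++) total += s2[i] - s1[i]  # field 1 is the "cpu" label
    dio = s2[6] - s1[6]                              # field 6 is iowait
    printf "%.1f\n", 100 * dio / total
  }'
}

# Example with synthetic samples (deltas: 1000 total jiffies, 200 of them iowait):
iowait_pct 'cpu 100 0 100 700 100 0 0' 'cpu 200 0 200 1300 300 0 0'   # prints 20.0
```

In practice you would capture the first sample, `sleep` a few seconds, capture the second, and pass both lines in.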