Improve the network throughput for clouddb-services
Closed, Resolved (Public)

Description

I noticed that the traffic shaping config used in the labstore puppet class applies to all VMs, even if they are not on NFS.

This is a problem for clouddb-services, where Toolsdb (which has massive network needs) and smaller DB services run. These VMs have no need for NFS, but they are still rate-limited as if they used it.

Investigate either:

  • Removing the tc limits on the clouddb-services project altogether (probably more complex and possibly not desirable--though I'm not sure the tc limits were ever intended to be universally applied anyway)
  • Simply adding a special case for the limits for clouddb servers.

Event Timeline

Bstorm created this task.

Change 612958 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] clouddb: Uncap the network for the clouddb-services project

https://gerrit.wikimedia.org/r/612958

Change 612958 merged by Bstorm:
[operations/puppet@production] clouddb: Uncap the network for the clouddb-services project

https://gerrit.wikimedia.org/r/612958

Mentioned in SAL (#wikimedia-cloud) [2020-07-16T17:56:44Z] <bstorm> Significantly lifted traffic shaping limits on clouddb1001/toolsdb to improve network performance T257884

Note: the deployment of the tc shaping was very weird. Whether it is applied seems to depend on whether the nfsclient class ever tried to put NFS on a VM. The same goes for the puppet setting, which doesn't always change anything in the script. However, if the script (or tc directly) has ever been run on a VM, you will need to use it to make further changes.
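
Since shaping may or may not be present on a given VM, it can help to check before changing anything. The following is a minimal sketch, not the labstore script itself: it parses the output of `tc qdisc show` and flags any queueing discipline that is not a common kernel default. The helper names, the `DEFAULT_KINDS` set, and the interface name `eth0` are assumptions made for illustration. The qdisc kind that the shaping script actually installs is not stated in this task.

```python
import subprocess

# Assumption: qdisc kinds commonly present on an interface that has had
# no explicit shaping applied. Adjust for the distribution in use.
DEFAULT_KINDS = {"noqueue", "pfifo_fast", "fq_codel", "mq"}


def parse_qdiscs(tc_output):
    """Return (kind, handle) for each 'qdisc ...' line of `tc qdisc show` output."""
    found = []
    for line in tc_output.splitlines():
        fields = line.split()
        # Each qdisc line starts with: qdisc <kind> <handle> ...
        if len(fields) >= 3 and fields[0] == "qdisc":
            found.append((fields[1], fields[2]))
    return found


def has_custom_shaping(tc_output):
    """Heuristic: True if any listed qdisc is not one of the default kinds."""
    return any(kind not in DEFAULT_KINDS for kind, _ in parse_qdiscs(tc_output))


def show_qdiscs(dev):
    """Run `tc qdisc show dev <dev>` and return its text output."""
    result = subprocess.run(
        ["tc", "qdisc", "show", "dev", dev],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # "eth0" is a placeholder interface name.
    output = show_qdiscs("eth0")
    print(parse_qdiscs(output))
    print("custom shaping present:", has_custom_shaping(output))
```

This is only a heuristic: a VM where the script ran and was later reverted by hand may look unshaped here even though further changes should still go through the script.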