Right now, when a user connects to a webservice running in Toolforge Kubernetes, this happens:
client --> tools front proxy --> haproxy --> random k8s worker node --> ingress pod on a random worker node --> tool webservice on a random worker node
There is extra overhead in the haproxy --> k8s worker node --> ingress pod step: because haproxy doesn't know which node the ingress pod is running on, we use a NodePort service and let the ingress listen on every node of the cluster.
As of this writing, we have about 55 k8s worker nodes and only 3 ingress pods. The chances that haproxy hits a node with a running ingress pod are pretty low (roughly 3 in 55), so most requests need another internal Kubernetes hop to reach a node that actually runs an ingress pod.
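For illustration, a minimal sketch of the kind of NodePort Service used to expose an nginx-ingress controller is shown below; the names, namespace, and port numbers are assumptions, not the actual Toolforge manifests. The key point is that a NodePort opens the same port on every worker node, which is why haproxy can (and currently does) target any of them.

```yaml
# Hypothetical NodePort Service exposing the ingress controller.
# All names and ports are illustrative.
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx
  namespace: ingress-nginx
spec:
  type: NodePort            # opens the port on *every* node in the cluster
  selector:
    app.kubernetes.io/name: ingress-nginx
  ports:
    - name: http
      port: 80
      targetPort: 80
      nodePort: 30080       # haproxy can send traffic to <any-node>:30080
```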
One simple way to solve this is to create ingress-specific nodes, nodes on which we only run nginx-ingress (plus related monitoring), and configure haproxy to redirect to those nodes only, instead of to all worker nodes. As a side effect, the nginx-ingress pods would be under less memory pressure (they use at least 1 GB of memory each, and growing). A sketch of how the pods could be pinned to such nodes follows.
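A minimal sketch of that pinning, assuming a dedicated node label and an optional taint that are not part of the current setup; node names, labels, and the image reference are all illustrative. The dedicated nodes would be labelled (and tainted), and the ingress Deployment would carry a matching nodeSelector and toleration, so only those nodes ever run ingress pods and haproxy can be pointed at them alone.

```yaml
# Hypothetical setup commands for a dedicated ingress node:
#   kubectl label node tools-k8s-ingress-1 node-role.kubernetes.io/ingress=""
#   kubectl taint node tools-k8s-ingress-1 ingress=dedicated:NoSchedule
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ingress-nginx
  namespace: ingress-nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app.kubernetes.io/name: ingress-nginx
  template:
    metadata:
      labels:
        app.kubernetes.io/name: ingress-nginx
    spec:
      nodeSelector:
        node-role.kubernetes.io/ingress: ""   # schedule only on the dedicated nodes
      tolerations:
        - key: ingress
          value: dedicated
          effect: NoSchedule                  # allow running on the tainted nodes
      containers:
        - name: controller
          image: registry.k8s.io/ingress-nginx/controller:v1.1.0  # version illustrative
          resources:
            requests:
              memory: 1Gi                     # ingress pods use at least ~1 GB
```

With this in place, haproxy's backend would list only the ingress-specific nodes (and their NodePort), removing the extra internal forward for requests that land on a node without an ingress pod.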
Docs: https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Networking_and_ingress