Toolforge: k8s: ingress: consider creating ingress-specific nodes
Closed, Declined · Public

Assigned To

Authored By

	aborrero
	Apr 14 2020, 1:25 PM

Description

Right now, when a user connects to a webservice running in Toolforge Kubernetes, this happens:

client --> tools front proxy --> haproxy --> random k8s worker node --> ingress pod on a random worker node --> tool webservice on a random worker node

There is extra overhead in the haproxy --> k8s worker node --> ingress pod step: because haproxy doesn't know which node the ingress pod is running on, we use a nodePort and let the ingress listen on every node of the cluster.

As of this writing, we have about 55 k8s worker nodes and only 3 ingress pods. The chance that haproxy hits a node with an ingress pod running is pretty low, so most requests need another internal Kubernetes forward to the correct node with a running ingress pod.
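The arithmetic behind this claim can be sketched as follows. This is a rough estimate only, assuming haproxy picks a worker node uniformly at random and that each of the 3 ingress pods runs on a distinct node; the function name is illustrative and not part of any Toolforge tooling.

```python
from fractions import Fraction


def direct_hit_probability(total_nodes: int, ingress_pods: int) -> Fraction:
    """Chance that a uniformly chosen node already hosts an ingress pod.

    Assumes at most one ingress pod per node (an assumption of this
    sketch, not a measured property of the cluster).
    """
    if total_nodes <= 0 or not 0 <= ingress_pods <= total_nodes:
        raise ValueError("need total_nodes > 0 and 0 <= ingress_pods <= total_nodes")
    return Fraction(ingress_pods, total_nodes)


# Numbers taken from the task description: ~55 worker nodes, 3 ingress pods.
p = direct_hit_probability(total_nodes=55, ingress_pods=3)
print(f"direct hit: {float(p):.1%}, extra forward needed: {float(1 - p):.1%}")
```

Under these assumptions only about 3 in 55 requests would reach an ingress pod without the extra internal forward.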

One simple way to solve this is to create ingress-specific nodes: nodes on which we only run nginx-ingress (plus related monitoring), with haproxy configured to send traffic to those nodes only instead of to all worker nodes. As a side effect, the nginx-ingress pods would be under less memory pressure (they use at least 1 GB of memory, and growing).
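As an illustration of the haproxy side of this idea, here is a minimal sketch that renders haproxy `server` lines for ingress nodes only. It is not how the puppet code generates the configuration: the `Node` type, the `role` values, the addresses, and the port number are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    address: str
    role: str  # "worker" or "ingress"; hypothetical labels for this sketch


def haproxy_backend_lines(nodes: list[Node], node_port: int) -> list[str]:
    """Render haproxy `server` lines for the ingress-dedicated nodes only."""
    return [
        f"server {node.name} {node.address}:{node_port} check"
        for node in nodes
        if node.role == "ingress"
    ]


# Hypothetical inventory; addresses come from the documentation range 192.0.2.0/24.
nodes = [
    Node("tools-k8s-worker-1", "192.0.2.10", "worker"),
    Node("tools-k8s-ingress-1", "192.0.2.21", "ingress"),
    Node("tools-k8s-ingress-2", "192.0.2.22", "ingress"),
]
for line in haproxy_backend_lines(nodes, node_port=30000):  # port is made up
    print(line)
```

Worker nodes are simply left out of the backend list, so haproxy never sends traffic to a node that cannot host an ingress pod.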

NOTE: we haven't detected any performance impact from the current setup; this is just a possible improvement. We actually decided against this solution when we originally developed the ingress, but it may be worth revisiting now that usage is growing.

Docs: https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Networking_and_ingress

Details

Other Assignee: rook

	Subject	Repo	Branch	Lines +/-
	toolforge: k8s: run nginx-ingress on ingress dedicated nodes	operations/puppet	production	+18 -2
	kubeadm: rename hiera key for ingress nodes	operations/puppet	production	+4 -4


Related Objects

Mentioned In: T195217: Simplify ingress methods for PAWS

Event Timeline

aborrero created this task. Apr 14 2020, 1:25 PM

Restricted Application added a subscriber: Aklapper. Apr 14 2020, 1:25 PM

aborrero triaged this task as Lowest priority. Apr 14 2020, 1:26 PM

bd808 moved this task from Inbox to Needs discussion on the cloud-services-team (Kanban) board. Apr 15 2020, 10:52 PM

At the 2020-04-29 WMCS meeting we decided this is something interesting to explore, along with using OpenStack server groups to ensure ingress nodes aren't on the same hypervisor.
We also agreed this is a low-priority change, so we won't work on this in the short term.

aborrero mentioned this in T195217: Simplify ingress methods for PAWS. Jun 2 2020, 12:51 PM

Change 604665 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] kubeadm: rename hiera key for ingress nodes

https://gerrit.wikimedia.org/r/604665

gerritbot added a project: Patch-For-Review. Jun 11 2020, 11:20 AM

Change 604665 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] kubeadm: rename hiera key for ingress nodes

https://gerrit.wikimedia.org/r/604665

Maintenance_bot removed a project: Patch-For-Review. Jun 11 2020, 12:11 PM

Raising priority: @Andrew mentioned that with the new domains for VMs we should try creating new k8s nodes and see how that works. This seems like the right test.

Mentioned in SAL (#wikimedia-cloud) [2020-09-09T10:38:56Z] <arturo> created server group tools-ingress with soft anti affinity policy (T250172)

Mentioned in SAL (#wikimedia-cloud) [2020-09-09T10:42:01Z] <arturo> created VMs tools-k8s-ingress-1 and tools-k8s-ingress-2 in the tools-ingress server group (T250172)

Mentioned in SAL (#wikimedia-cloud) [2020-09-09T10:50:26Z] <arturo> created puppet prefix tools-k8s-ingress (T250172)

Mentioned in SAL (#wikimedia-cloud) [2020-09-09T11:12:21Z] <arturo> new ingress nodes added to the cluster, and tainted/labeled per the docs https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Kubernetes/Deploying#ingress_nodes (T250172)

Change 626133 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] toolforge: k8s: run nginx-ingress on ingress dedicated nodes

https://gerrit.wikimedia.org/r/626133

gerritbot added a project: Patch-For-Review. Sep 9 2020, 11:22 AM

Hmm, I will do toolsbeta first, just in case.

Mentioned in SAL (#wikimedia-cloud) [2020-09-09T11:24:37Z] <arturo> created new puppet prefix toolsbeta-test-k8s-ingress (T250172)

Mentioned in SAL (#wikimedia-cloud) [2020-09-09T11:25:50Z] <arturo> created new server group toolsbeta-k8s-ingress (T250172)

Mentioned in SAL (#wikimedia-cloud) [2020-09-09T11:27:56Z] <arturo> created 2 VMs: toolsbeta-test-k8s-ingress-1 and toolsbeta-test-k8s-ingress-2 (T250172)

Mentioned in SAL (#wikimedia-cloud) [2020-09-10T08:59:49Z] <arturo> added toolsbeta-test-k8s-ingress-1 (and -2) to the k8s cluster (T250172)

Mentioned in SAL (#wikimedia-cloud) [2020-09-10T09:00:56Z] <arturo> tainted/labeled toolsbeta-test-k8s-ingress-1 (and -2) in the k8s cluster (T250172)

Mentioned in SAL (#wikimedia-cloud) [2020-09-10T09:15:12Z] <arturo> livehacking puppetmaster with https://gerrit.wikimedia.org/r/c/operations/puppet/+/626133 (T250172)

Change 626133 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] toolforge: k8s: run nginx-ingress on ingress dedicated nodes

https://gerrit.wikimedia.org/r/626133

Mentioned in SAL (#wikimedia-cloud) [2020-09-10T10:22:13Z] <arturo> enabling ingress dedicated worker nodes in the k8s cluster (T250172)

done!

Maintenance_bot removed a project: Patch-For-Review. Sep 10 2020, 11:10 AM

I'm not sure that this works in the way that is expected. If I'm understanding correctly, what is hoped is that:
client --> tools front proxy --> haproxy --> random k8s worker node --> ingress pod on a random worker node --> tool webservice on a random worker node
will become:
client --> tools front proxy --> haproxy --> random k8s ingress node/pod on that node --> tool webservice on a random worker node
thus cutting out a network hop. But it is my understanding that we end up with:
client --> tools front proxy --> haproxy --> random k8s ingress node --> ingress pod on a random (1/3 of the time the same) ingress node --> tool webservice on a random worker node

I was running a test that seems to confirm this, though if anyone wants to look at it with me that would be great.
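The "1/3 of the time" figure can be checked with a small simulation. This is only a sketch of the scenario described in the comment, not the test that was actually run: it assumes 3 ingress nodes with one ingress pod each, that haproxy picks the entry node uniformly, and that the nodePort forward picks the serving pod uniformly and independently of the entry node. The function name and request count are arbitrary.

```python
import random


def same_node_fraction(ingress_nodes: int, requests: int, seed: int = 0) -> float:
    """Estimate how often the entry node also hosts the serving ingress pod."""
    rng = random.Random(seed)
    same = 0
    for _ in range(requests):
        entry_node = rng.randrange(ingress_nodes)    # node haproxy connects to
        serving_node = rng.randrange(ingress_nodes)  # node of the pod chosen by the forward
        if entry_node == serving_node:
            same += 1
    return same / requests


fraction = same_node_fraction(ingress_nodes=3, requests=100_000)
print(f"no extra hop in about {fraction:.1%} of requests")
```

Under these assumptions the fraction converges to 1/n for n ingress nodes, so adding more ingress nodes makes the extra hop more likely, not less.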

I think the problem you are describing is 100% legit, especially nowadays, now that we have scaled up the number of ingress nodes/pods.

In summary:

- This optimization doesn't actually optimize anything anymore.
- During the last few k8s upgrades we did, it was mentioned that the extra handling that ingress nodes need is complex.

The only value I can think of right now is our ability to ensure that ingress pods always run on different cloudvirt hypervisors. I think we do soft-anti-affinity for k8s-worker VMs (because they outnumber cloudvirts) and hard anti-affinity for k8s-ingress.

I'm fine if we decide to drop/revert this setup and therefore simplify our upgrading process.

aborrero lowered the priority of this task from High to Medium. Oct 4 2021, 10:30 AM

aborrero updated Other Assignee, added: rook.

aborrero reassigned this task from aborrero to rook. Oct 4 2021, 1:21 PM

Cool, I'll dig in more on how we deploy the base VMs. But I believe you're right, and also believe that Brooke agrees, that the network traffic is enough to justify this setup. Thanks for the info.

fnegri edited projects, added cloud-services-team; removed cloud-services-team (Kanban). Jan 18 2023, 7:15 PM

fnegri moved this task from Kanban to Soon! on the cloud-services-team board.

If this is still desired, it should be re-opened. I'm closing it, though, since Magnum would invalidate this should Tools go in that direction.

rook closed this task as Declined. Jan 19 2023, 1:44 PM
