Proposal: simplify set up of a new load-balanced service on kubernetes
Closed, Declined (Public)

Assigned To

None

Authored By

	Joe
	Nov 22 2019, 9:42 AM

Description

To deploy a new service in production on kubernetes right now, there is a set of things that need to be done. Each step in the list below is marked as [SRE] or [service-owner].

Group A: deployment of the service on kubernetes

set up the appropriate set of values in helmfiles in deployment-charts [service-owner]
set up the user token/credentials and other private data in the private puppet repository [SRE]
set up the corresponding namespace on all clusters [SRE]
deploy the service to all clusters [service-owner]

Group B consists entirely of SRE actions.

Group B: Setting up LVS (load balancer)

Add the new service to every kube worker in conftool-data, discovery
Add the service IP to all the kube workers on loopback
Add the service to DNS records (both normal and discovery)
Fill in the service metadata to change the LVS configuration
Restart all the relevant LVS
Switch the monitoring of these endpoints to critical: true in puppet to add paging

While optimizing or eliminating any of these steps is nice, the biggest time-sink is without doubt the setup of LVS. It's long, complicated, failure-prone, and only a handful of SREs are confident with the process.

How can we make that process better? In the remainder of this task I'll describe a few possible approaches.

Set up an ingress

This means allowing service owners to set up kubernetes resources that are specifically tailored to load-balancing and routing traffic coming from outside the cluster. It provides a pretty simple interface to manage externally. Unlike the other proposals below, this is an L7 load balancer, meaning it understands TLS, virtual hosts, HTTP, etc.

Depending on where the implementation software is installed, we have the following two paths. Both are valid approaches.
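To make the L7 point concrete, here is a minimal sketch of the kind of decision an ingress makes and an L4 balancer cannot: choosing a backend from the HTTP Host header. The hostnames, namespaces, service names and ports are all hypothetical, and the sketch does not model any particular ingress implementation.

```python
from typing import Optional, Tuple

# Hypothetical routing table: virtual host -> (namespace, service, port).
ROUTES = {
    "service-a.example.org": ("service-a", "service-a-production", 8080),
    "service-b.example.org": ("service-b", "service-b-production", 8443),
}


def route_request(host_header: str) -> Optional[Tuple[str, str, int]]:
    """Pick a backend from the HTTP Host header, as an L7 proxy would.

    Returns None when no virtual host matches. IPv6 literals in the
    Host header are not handled in this sketch.
    """
    # Drop an optional ":port" suffix and normalize case before lookup.
    host = host_header.split(":", 1)[0].strip().lower()
    return ROUTES.get(host)
```

An L4 balancer such as LVS only sees IPs and ports, so each service needs its own service IP; with host-based routing, several services can share one.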

Inside kubernetes cluster

A request coming from the public would flow as follows:

client => LB (pybal) => kube worker => kube-proxy (not a real hop, but does DNAT) => ingress => pods

Outside the kubernetes cluster

client => LB (pybal) => ingress => pods

This would add one node to the proxying chain, and more moving parts. We would need to investigate Ingress solutions again, as we did in T170121 about 2.5 years ago; things have changed since then.

What would Group B actions look like

Add a cname to the ingress.
Add some monitoring/alerting

That's it. A simple puppet patch and a simple dns patch. For very large services, maybe add a per-namespace setup.

Pros/cons

pros:

Integrated with kubernetes
industry "standard"
the rest of the infrastructure is left unmodified
L7 functionality.

cons:

more moving parts
Ingress quality/stability at scale needs to be evaluated.
yet more complexity in our charts
Adding a potential SPOF (that is in some aspects addressable) as well as a potential chokepoint.
Essentially HTTP only. Specific implementations may support more protocols, but overall the Ingress resource wasn't designed for this.

Modify pybal to autoconfigure pools from k8s

We already have a pybal patch ensuring we can fetch which workers are active from k8s instead of etcd, but we could expand it further to read /all/ the data pybal needs from k8s, including pods. We would still need a consistent way to add IPs to the load balancers and the k8s nodes, but that can mostly be done with some additional improvements [citation needed].

The flow of requests would be client => LB (pybal) => pod (via kube-proxy)

What would Group B actions look like

Add the dns record for the new service
Add the realserver IP on the kubernetes workers and the load balancers
Let pybal add the configuration when the service is properly annotated.
Add monitoring/alerting.

Three relatively simple patches (one to DNS, two to puppet). Some coordination is needed.
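As a rough illustration of the "properly annotated" step, the sketch below derives load balancer pools from Service-like objects. It is not the existing pybal patch: the annotation key and the shape of the pool config are assumptions made for the example, and the input is plain dicts shaped like Kubernetes Service objects rather than a live API client.

```python
# Hypothetical annotation a service owner would set to opt in.
LB_ANNOTATION = "example.org/lvs-pool"


def pools_from_services(services, ready_nodes):
    """Build {pool name: pool config} from Service-like dicts.

    services: dicts shaped like Kubernetes Service objects.
    ready_nodes: hostnames of workers currently able to serve traffic.
    """
    pools = {}
    for svc in services:
        annotations = svc.get("metadata", {}).get("annotations") or {}
        pool_name = annotations.get(LB_ANNOTATION)
        if not pool_name:
            continue  # not annotated: leave this service alone
        ports = svc.get("spec", {}).get("ports", [])
        node_ports = [p["nodePort"] for p in ports if "nodePort" in p]
        if not node_ports:
            continue  # nothing reachable from outside the cluster
        pools[pool_name] = {
            "port": node_ports[0],
            "servers": sorted(ready_nodes),
        }
    return pools
```

Note that the service IP is deliberately absent from the output: as stated above, adding IPs to the load balancers and the workers would still be a separate step.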

Pros/cons

pros:

no change to our current setup
Known unknowns. Pybal is mostly 'boring'
LVS-DR

cons:

Still an invented-here solution
Service addition is not fully automated; IPs will still need to be added to the backends somehow.
Will need significant dev effort
Lack of L7 support

kube-proxy + bird

In this hypothesis, we'd have kube-proxy doing all the load-balancing, and announcing the LVS IPs via bird directly.

In this case, we'd have the simplest request flow:
client => kube-proxy => pod

In this hypothesis, we would configure a BGP daemon according to which IPs we have configured on k8s, and run it as a sidekick of kube-proxy. One complication is that calico relies on running bird on each kubernetes node, so we'd either have to set up kube-proxy+bird outside the cluster (mostly ending up resembling LVS), or figure out how to augment calico's bird configuration if we want to host it on the workers. This is essentially a variant of the pybal approach above.

What would Group B actions look like

Add the dns record for the new service
Add the realserver IP on the kubernetes workers (this can probably be automated by using annotations in the k8s api, but is it worth it?)
Add monitoring/alerting.

Three relatively simple patches (one to DNS, two to puppet).
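The "which IPs to serve" question can be sketched as a small filter: given the service IPs known to k8s and the range we are willing to advertise, compute the host prefixes a BGP daemon should announce. The function name and the idea of a single allowed range are assumptions for illustration; rendering the result into an actual bird configuration is left out.

```python
import ipaddress


def prefixes_to_announce(service_ips, allowed_range):
    """Return sorted host prefixes (/32 or /128) for IPs inside allowed_range.

    IPs outside the range, or of a different IP version, are skipped
    so that a misconfigured service cannot trigger an announcement.
    """
    network = ipaddress.ip_network(allowed_range)
    prefixes = set()
    for raw in service_ips:
        addr = ipaddress.ip_address(raw)
        if addr.version == network.version and addr in network:
            # One host route per service IP.
            prefixes.add(ipaddress.ip_network(f"{addr}/{addr.max_prefixlen}"))
    return [str(p) for p in sorted(prefixes)]
```

For example, `prefixes_to_announce(["10.2.2.20", "192.0.2.1"], "10.2.2.0/24")` keeps only the first address, as a /32.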

Pros/cons

pros:

fewest hops for a request
No additional moving parts besides bird
Overall the simplest configuration

cons:

Unknown cost of working on a solid bgp announcement system.
Might need additional configuration to know which IPs to serve
Lack of L7 support
No LVS-DR

Refactor all the setup of LVS across dns and puppet

It's probably possible to simplify the steps to set up a load-balanced service by rationalizing the puppet code around it (for example, synchronizing systems across various stages, or allowing a new service to be added in a single patch rather than in 3 different ones).

pros:
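One possible reading of "a single patch" is a single service definition from which every per-system fragment is derived. The sketch below only illustrates that idea: the field names, the example.org domains and the output shapes are hypothetical and do not correspond to the actual puppet, DNS or conftool formats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceDefinition:
    """Hypothetical single source of truth for a load-balanced service."""
    name: str
    ip: str
    port: int
    page: bool = False  # whether monitoring should page


def render_all(svc, workers):
    """Derive every per-system fragment from one definition."""
    return {
        # Both the normal and the discovery DNS records.
        "dns": [
            {"name": f"{svc.name}.svc.example.org", "type": "A", "address": svc.ip},
            {"name": f"{svc.name}.discovery.example.org", "type": "A", "address": svc.ip},
        ],
        # The service is added to every kube worker.
        "pool": {worker: [svc.name] for worker in sorted(workers)},
        "lvs": {"ip": svc.ip, "port": svc.port},
        "monitoring": {"critical": svc.page},
    }
```

With this shape, switching monitoring to paging would be a one-field change on the definition rather than a separate patch.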

no new technology would be introduced in production

cons:

No clear implementation idea
Might never achieve a fully streamlined solution

Details

Subject	Repo	Branch	Lines +/-
staging-codfw: Advertise service cluster IPs	operations/deployment-charts	master	+2 -0
staging-codfw: Enable masquarade_all	operations/puppet	production	+1 -0
Enable per flow ECMP for kubernetes/kubestage	operations/homer/public	master	+2 -0
Add kubernetes service IP ranges to prefix list	operations/homer/public	master	+32 -24


Related Objects

		Status	Subtype	Assigned	Task
		Declined		None	T238909 Proposal: simplify set up of a new load-balanced service on kubernetes
		Resolved		ayounsi	T328338 Calico and BFD
