
toolforge: new k8s: evaluate and test firewalling via calico
Open, Medium, Public

Description

We may want calico doing network firewalling rather than ferm due to ferm not playing well with other mechanisms modifying iptables rules in the system.

Details

Related Gerrit Patches:

Event Timeline

Bstorm triaged this task as Medium priority. (Dec 5 2019, 9:16 PM)
Bstorm added a comment. (Dec 5 2019, 9:27 PM)

So, fun problem. Our Puppet version no longer lets a file resource pull from remote sources. It will only accept a file resource source at file:// or puppet://, so we need to get calicoctl some other way, and at the same version as our calico. (https://gerrit.wikimedia.org/r/c/operations/puppet/+/554198)

One way is to use the remote container and deploy it to kubernetes and set aliases for root on the control plane hosts (sounds good and is consistent with what we've been doing). This is described well here: "Install calicoctl as a Kubernetes pod"
The other way is to build a deb. The build process is dockerized (https://github.com/projectcalico/calicoctl/) so we'd need to set up a modern version of docker on the builder host to do that. I'm not against this, but I'm leaning toward the k8s deploy.

Going the k8s deploy route, we could plop the yaml in puppet for the deployment and use puppet to add the alias.
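For illustration, the yaml for the pod route could look roughly like the sketch below. This is not the content of the actual patch: the namespace, the service account name and the image tag are assumptions (the tag is chosen to match a v3.8.0 cluster), and the RBAC objects the service account needs are left out.

```yaml
# Sketch only: run calicoctl as a long-lived pod using the Kubernetes datastore.
# The ServiceAccount "calicoctl" and its ClusterRole/ClusterRoleBinding are
# assumed to exist and are not shown here.
apiVersion: v1
kind: Pod
metadata:
  name: calicoctl
  namespace: kube-system        # assumption: same namespace as calico itself
spec:
  hostNetwork: true
  serviceAccountName: calicoctl # assumption: name of the service account
  containers:
  - name: calicoctl
    image: calico/ctl:v3.8.0    # assumption: tag kept in sync with the cluster version
    args:
    - version
    - --poll=1m                 # keeps the container running between exec calls
    env:
    - name: DATASTORE_TYPE
      value: kubernetes         # talk to the k8s API instead of etcd
```

The alias puppet adds for root would then wrap a kubectl exec into this pod, so that typing calicoctl on a control plane host runs the binary inside the container.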

Change 554969 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] toolforge-calico: Set up yaml and config to use calicoctl as a pod

https://gerrit.wikimedia.org/r/554969

@aborrero if you like the idea, there's a patch! This should also tie it in nicely to using puppet to manipulate the version of calico.
I checked that the OS-installed .bashrc does look for and include the .bash_aliases file, which makes this all very tidy.

Saw the patch, thanks!

Which kind of network policy do you have in mind? What would you drop, allow, etc?

Change 554969 merged by Bstorm:
[operations/puppet@production] toolforge-calico: Set up yaml and config to use calicoctl as a pod

https://gerrit.wikimedia.org/r/554969

root@toolsbeta-test-k8s-control-1:~# calicoctl version
Client Version:    v3.8.0
Git commit:        c84e9f21
Cluster Version:   v3.8.0
Cluster Type:      k8s,bgp,kdd

So that works. This should also allow some testing with network policy.

I hope to have some useful answer around what to actually DO with it today, lol.

There are several directions we can go with this. First, since we are using Calico, we are able to set global network policies that only apply to namespaces that are either labeled for tools or even begin with tool-. That allows us to be restrictive where it counts. We also left users with the ability to manipulate Kubernetes network policies.

This means it is possible to set a default-deny global policy on all tool namespaces for all traffic to pods. Then, when a user launches a webservice, they open up port 8000 to the ingress controllers. This way the pods themselves are heavily firewalled, but to users of webservice there would be little to no difference. The default state is a flat network where all pods can communicate with all pods, which is not the worst condition to be in, but we may want to add a bit more structure, with the appropriate tooling to make it less than horribly painful for our users. Note that as soon as a user sets a network policy, the pods it selects become isolated (the whole namespace, if it selects every pod). From then on, only traffic that some network policy allows gets through.
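A rough sketch of what that pair of policies might look like, to make the idea concrete. Every name and label here is hypothetical: the toolforge-tool namespace label, the tool-example namespace and the name: ingress-nginx label on the ingress controllers' namespace are placeholders, and it assumes the calico version in use supports namespaceSelector on global policies.

```yaml
# Sketch only: default-deny ingress for every pod in a tool namespace.
# A policy that selects pods but carries no Allow rules leaves their
# inbound traffic denied unless some other policy allows it.
apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: tool-default-deny-ingress   # hypothetical name
spec:
  order: 2000                       # evaluated after the users' own k8s policies
  namespaceSelector: has(toolforge-tool)   # hypothetical namespace label
  types:
  - Ingress
---
# Sketch only: what webservice could create in the tool's namespace to let
# the ingress controllers reach port 8000.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-webservice-from-ingress   # hypothetical name
  namespace: tool-example               # hypothetical tool namespace
spec:
  podSelector: {}                       # every pod in the namespace
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          name: ingress-nginx           # hypothetical label on the ingress namespace
    ports:
    - protocol: TCP
      port: 8000
```

The first document is a calico resource, so it would be applied with calicoctl. The second is a plain Kubernetes object that a tool account can manage with kubectl.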

If we define host endpoints for the nodes, the firewalling can also be managed through network policies, which would prevent random collisions with k8s since calico generally plays nice there, and the new cluster could use some firewalling. This should at least be straw-dogged out even if we don't end up doing much with GNPs.
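As a starting point for that, a host endpoint plus one global policy might look something like the following sketch. The node name, interface, IP and role label are all made up for the example. One caveat worth keeping in mind: once a host endpoint exists, calico denies traffic to and from it by default, apart from its failsafe ports, so the allow rules have to be in place before the endpoints go in.

```yaml
# Sketch only: declare a node's interface as a calico host endpoint.
apiVersion: projectcalico.org/v3
kind: HostEndpoint
metadata:
  name: example-worker-1-eth0   # hypothetical
  labels:
    role: k8s-worker            # hypothetical label used by the policy below
spec:
  node: example-worker-1        # hypothetical node name, must match the calico node
  interfaceName: eth0           # assumption: primary interface
  expectedIPs:
  - 192.0.2.10                  # documentation address, placeholder
---
# Sketch only: allow SSH to every host endpoint labeled as a worker.
apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: worker-allow-ssh        # hypothetical
spec:
  order: 100
  selector: role == 'k8s-worker'
  types:
  - Ingress
  ingress:
  - action: Allow
    protocol: TCP
    destination:
      ports:
      - 22
```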