
Implement a pod networking policy approach
Closed, ResolvedPublic

Description

We would like to establish a clear, concise and easy-to-find networking policy for pods. This is the task for describing our options and choosing what we implement.

We could go forward with either a whitelist or a blacklist approach. For the rest of this task (and unless we radically change our approach) we assume a whitelist approach, that is, we block everything that is not explicitly allowed.

Ideally this would

  • Disallow any outgoing connections from a pod the service owner does not explicitly allow
  • Disallow any incoming connection to a pod that the service owner does not explicitly allow

On a more pragmatic note, it probably makes sense to establish some defaults. For outgoing connections from the pods, those could be

  • statsd
  • graphite
  • logstash
  • url-downloader
  • our own API endpoints (action/REST api), with or without the caching layer involved.
  • EventBus

For incoming connections to the pods, those defaults could be

  • icinga
  • prometheus ?
  • LVS servers
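
As a rough illustration of the whitelist model described above, here is a minimal sketch in Python. It is not the real enforcement mechanism; it only shows the "default deny plus explicit allows" logic. The `PodPolicy` class, the short service labels (`api`, `eventbus`, `lvs`) and the `redis` example are hypothetical names chosen for illustration, and the default sets simply mirror the candidate lists proposed above.

```python
from dataclasses import dataclass

# Candidate defaults copied from the lists above. Using plain service labels
# instead of IPs/ports is a simplification made only for this sketch.
DEFAULT_EGRESS = frozenset(
    {"statsd", "graphite", "logstash", "url-downloader", "api", "eventbus"}
)
# "prometheus" was listed with a question mark above; included here tentatively.
DEFAULT_INGRESS = frozenset({"icinga", "prometheus", "lvs"})


@dataclass(frozen=True)
class PodPolicy:
    """Hypothetical per-service policy: shared defaults plus owner-approved extras."""

    extra_egress: frozenset = frozenset()
    extra_ingress: frozenset = frozenset()

    def allows_egress(self, destination: str) -> bool:
        # Whitelist semantics: anything not listed is denied.
        return destination in DEFAULT_EGRESS or destination in self.extra_egress

    def allows_ingress(self, source: str) -> bool:
        return source in DEFAULT_INGRESS or source in self.extra_ingress
```

A service owner who needs one extra outgoing destination would then declare only that one, for example `PodPolicy(extra_egress=frozenset({"redis"}))`, and everything else stays blocked.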

The policy should be easy to find from both the developers' and Ops' points of view, and should be reviewable. The operations/puppet repo sounds like a sane first attempt, and we can always re-evaluate later on.

Result: Outgoing connections are now filtered. Incoming connections are easy to filter on a per-namespace basis by running the following:

kubectl --kubeconfig kubeconfig.eqiad annotate ns <namespace> "net.beta.kubernetes.io/network-policy={\"ingress\":{\"isolation\":\"DefaultDeny\"}}"
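
The escaped quotes in that annotation value are easy to get wrong by hand. The following sketch, assuming Python 3.8+ and only the standard library, builds the same command line from a dictionary so the JSON and the shell quoting are generated rather than typed. The function name `annotate_command` and the namespace `example-ns` are hypothetical; the annotation key and value are the ones from the command above.

```python
import json
import shlex

ANNOTATION_KEY = "net.beta.kubernetes.io/network-policy"


def annotate_command(namespace: str, kubeconfig: str = "kubeconfig.eqiad") -> str:
    """Return a shell-quoted kubectl command that sets DefaultDeny ingress isolation."""
    # Compact separators reproduce the whitespace-free JSON used in the task.
    value = json.dumps(
        {"ingress": {"isolation": "DefaultDeny"}}, separators=(",", ":")
    )
    argv = [
        "kubectl", "--kubeconfig", kubeconfig,
        "annotate", "ns", namespace,
        f"{ANNOTATION_KEY}={value}",
    ]
    # shlex.join quotes each argument so the result can be pasted into a shell.
    return shlex.join(argv)


if __name__ == "__main__":
    # This only prints the command; it does not run kubectl.
    print(annotate_command("example-ns"))
```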

Details

Related Gerrit Patches:
operations/puppet : production : Ship the default egress policy
operations/calico-k8s-policy-controller : 0.6.0 : Support supplying a default egress policy
operations/calico-k8s-policy-controller : master : Support supplying a default egress policy
operations/puppet : production : Document in-datastore calico configuration

Event Timeline

Joe added a subscriber: Joe.
Volans added a subscriber: Volans. Jul 10 2017, 7:18 PM
mobrovac updated the task description.
mobrovac added a subscriber: mobrovac.
GWicke added a subscriber: GWicke.

The whitelisting & defaults described in the description generally make sense to me. This will be a big step forward from the status quo.

Pointers to previous discussion (and implicitly from there here):

Same. Having a default white-list that covers most-used cases makes sense too.

However, explicitly white-listing incoming connections might be tricky, as that means that if two services need to communicate, one needs to remember to white-list both of them. This might also prove tricky in the cases of MW and RB, which communicate with a lot of entities in the environment (I say that knowing that they will likely need special, custom firewall rules anyway).

bd808 added a subscriber: bd808. Jul 12 2017, 6:17 PM
mark added a subscriber: mark. Jul 18 2017, 2:00 PM

Change 376254 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] Document in-datastore calico configuration

https://gerrit.wikimedia.org/r/376254

Change 376254 merged by Alexandros Kosiaris:
[operations/puppet@production] Document in-datastore calico configuration

https://gerrit.wikimedia.org/r/376254

Change 377419 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/calico-k8s-policy-controller@master] Support supplying a default egress policy

https://gerrit.wikimedia.org/r/377419

My first approach had been to just document these things in the puppet repo without actually enforcing them, even partially. That being said, I've re-evaluated the situation and with the change above we can at least partially enforce them. By partially I mean that the calico policy controller is updated to read the default egress network policy from ConfigMaps and apply it.

From here things become a little more obscure: updates to the ConfigMap will make it to the policy controller and it will apply them (on container restarts in my understanding - needs testing), but a human has to update the ConfigMap whenever the config in the puppet repo is updated. I'll try and figure out how to automate that last step.
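
Until that step is automated, one low-tech safeguard is to detect when the ConfigMap has drifted from the copy in the puppet repo. The sketch below is only an assumption about how such a check could look, not something implemented in the patches here. It compares two policy texts by hash after stripping trailing whitespace; how the two texts are fetched (from a repo checkout and from the cluster) is deliberately left out, and both function names are hypothetical.

```python
import hashlib


def policy_digest(policy_text: str) -> str:
    """Hash a policy document, ignoring trailing whitespace and surrounding blank lines."""
    normalized = "\n".join(
        line.rstrip() for line in policy_text.strip().splitlines()
    )
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def configmap_is_stale(repo_policy: str, configmap_policy: str) -> bool:
    """Return True when the ConfigMap content no longer matches the repo copy."""
    return policy_digest(repo_policy) != policy_digest(configmap_policy)
```

A check like this only reports drift; a human (or, later, an authenticated API client) still has to push the update.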

Change 377421 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/calico-k8s-policy-controller@0.6.0] Support supplying a default egress policy

https://gerrit.wikimedia.org/r/377421

Change 377419 abandoned by Alexandros Kosiaris:
Support supplying a default egress policy

Reason:
Cherry-picked in 0.6.0

https://gerrit.wikimedia.org/r/377419

From here things become a little more obscure: updates to the ConfigMap will make it to the policy controller and it will apply them (on container restarts in my understanding - needs testing), but a human has to update the ConfigMap whenever the config in the puppet repo is updated. I'll try and figure out how to automate that last step.

After some consideration, that part will be skipped for now and worked on next quarter as part of the Kubernetes authn/authz work. Any automated approach should use the API, and that requires sane authentication and authorization to exist first.

Change 377470 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] Ship the default egress policy

https://gerrit.wikimedia.org/r/377470

This is practically done. I'll wait a bit for some review on the above commits before merging them. Those commits implement network policy enforcement as per the agreed default egress policy. As far as the ingress policy goes, I would rather leave it up to the service owner for now. They could allow anything, or set a default deny and allow only specific hosts to reach their service, depending on their needs.

I've skipped graphite and the cached API endpoints for now, pending re-evaluation (I actually expect the cached API to be asked for), but it's easy enough to add them if needed.

Overall I consider this resolved.

akosiaris updated the task description. Sep 12 2017, 2:38 PM

Change 377421 merged by Alexandros Kosiaris:
[operations/calico-k8s-policy-controller@0.6.0] Support supplying a default egress policy

https://gerrit.wikimedia.org/r/377421

Change 377470 merged by Alexandros Kosiaris:
[operations/puppet@production] Ship the default egress policy

https://gerrit.wikimedia.org/r/377470

akosiaris closed this task as Resolved. Sep 14 2017, 12:04 PM
akosiaris claimed this task.

Changes merged, resolving