
Implement authentication/authorization in Kubernetes clusters
Closed, Resolved, Public

Description

Statement

Starting with kubernetes version 1.6, kubernetes supports RBAC [1]. We are currently on version 1.7 in production and have been eagerly awaiting this in order to implement a sane authentication and authorization model for our production clusters.

This task is about figuring out the best possible way to implement authentication and authorization in kubernetes. It will probably also lead to answering a few other questions, like how we expect to organize our services, how we will use namespaces, etc.

Results

Introduction

In practice, Kubernetes supports 3 distinct ways of authenticating users:

  • Every request carries a bearer token in the headers, in the form Authorization: Bearer 31ada4fd-adec-460c-809a-9e56ceb75269
  • Client certificate authentication
  • Basic HTTP auth

For bearer tokens, it's up to the authenticating method to decide whether the token is valid and which attributes/groups the authorizer will base its permission decisions on. For client certificates, it's up to the CA, which encodes that information in the subject field of the certificate.

Authentication

Available authenticators

| Type | Plausibility |
| ----- | ----- |
| X509 Client Certs | NO |
| Static Token File | Yes |
| Bootstrap Tokens | NO |
| Static Password File | NO |
| Service Account Tokens | Yes |
| OpenID Connect Tokens | NO |
| Webhook Token Authentication | Yes |
| Authenticating Proxy | Yes |
| Keystone Password | Yes, labs only |
| Port 8080 | Only used internally to the cluster |

Let's examine them in a bit more detail

X509 Client certs

For this to be practical in any form, a CA needs to be used and managed by kubernetes itself. While kubernetes does have CSR signing capabilities, reimplementing a CA when we already have Puppet's, then redistributing different certificates and keys and using them to authenticate users, is an exercise in futility. The format is also cumbersome, as the authorization info passed to the authorizer is encoded in the cert subject in the form

/CN=user/O=group1/O=group2
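To illustrate how such a subject maps to authorization info, here is a minimal Python sketch that splits a subject string of the above form into a username (CN) and groups (O). The function name parse_subject is hypothetical, and it assumes the simple slash-separated rendering shown above; it is not a general DN parser.

```python
def parse_subject(subject):
    """Split '/CN=user/O=group1/O=group2' into (username, [groups])."""
    username = None
    groups = []
    for part in subject.strip("/").split("/"):
        key, _, value = part.partition("=")
        if key == "CN":
            username = value  # the common name becomes the username
        elif key == "O":
            groups.append(value)  # each organization entry becomes a group
    return username, groups


print(parse_subject("/CN=user/O=group1/O=group2"))
```

Changing a user's groups with this scheme means issuing a new certificate, which is part of what makes it cumbersome.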

Static token file

We currently have that. It's practically a CSV file of the form

token,user,uid,"group1,group2,group3"

uid is supposed to be more unique than the username and it must be a number. It closely resembles the POSIX uid notion but clients almost never see it.

Clients authenticate by embedding an HTTP header

Authorization: Bearer <token>

This is a valid way forward and has a few pros/cons:

  • the apiserver needs to be restarted on every update of the file (generally that's fine)
  • groups are encoded in an error prone way
  • we do have in database permission encoding
  • We need a puppet change to enable a user (both good and bad)
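As an illustration of the token file format and the bearer header described above, here is a minimal Python sketch of the lookup. It is an assumption about how such a check could be modelled, not the apiserver's actual code; load_tokens, authenticate and the sample row are hypothetical.

```python
import csv
import io


def load_tokens(text):
    """Map each token to its (user, uid, groups) from token-file CSV text."""
    tokens = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 3:
            continue  # skip blank or malformed lines
        token, user, uid = row[0], row[1], row[2]
        # The optional fourth column is a quoted, comma-separated group list.
        groups = row[3].split(",") if len(row) > 3 and row[3] else []
        tokens[token] = (user, uid, groups)
    return tokens


def authenticate(header, tokens):
    """Return the user info for an Authorization header value, or None."""
    scheme, _, token = header.partition(" ")
    if scheme != "Bearer":
        return None
    return tokens.get(token)


SAMPLE = 'token,user,uid,"group1,group2,group3"\n'
TOKENS = load_tokens(SAMPLE)
print(authenticate("Bearer token", TOKENS))
```

The quoting of the group column is where the error-prone part shows: forgetting the quotes silently turns group2 and group3 into extra CSV columns.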

Bootstrap tokens

Alpha feature meant for bootstrapping a cluster. It's not really meant to be used for anything else (doing so is actively discouraged) and does not support good authz info encoding. Enough said.

Static password file

This is basic auth; even the kubernetes docs discourage its use. The credentials are stored in a CSV file and the apiserver needs a restart every time the file is updated. The format is very close to that of the token file

password,user,uid,"group1,group2,group3"

Finally clients authenticate with

Authorization: Basic BASE64ENCODED(USER:PASSWORD).

Let's avoid this
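For reference, a minimal Python sketch of how the Basic header value is built and read back. The helper names are hypothetical. It also shows one reason to avoid this method: base64 is an encoding, not encryption, so anyone who sees the header can recover the password.

```python
import base64


def basic_auth_header(user, password):
    """Build the value of the Authorization header for HTTP basic auth."""
    credentials = f"{user}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def decode_basic_auth(header):
    """Recover (user, password) from a Basic header value."""
    encoded = header[len("Basic "):]
    user, _, password = base64.b64decode(encoded).decode("utf-8").partition(":")
    return user, password


header = basic_auth_header("user", "password")
print(header)
print(decode_basic_auth(header))
```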

Service Account Tokens

A service account is an automatically enabled authenticator that uses signed bearer tokens to verify requests. Service accounts are usually created automatically by the API server and associated with pods running in the cluster through the ServiceAccount Admission Controller to allow pods to use the kubernetes API. That being said, it is possible to reuse them for user accounts. An example is

kubectl create serviceaccount jenkins

The secret gets created automatically by the controller manager (which, in practice, signs the token with its private key) and needs to be retrieved from the API with

kubectl get secret jenkins-token-<random> -o yaml (and unbase64 it)

and then per above pass the following HTTP header

Authorization: Bearer <token>
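The "unbase64" step can be sketched in Python as follows. This assumes the secret was fetched with -o json instead of -o yaml (to avoid needing a YAML parser) and that the token sits under data.token; the function name and the stand-in secret, including its fake token value, are hypothetical.

```python
import base64
import json


def bearer_header_from_secret(secret_json):
    """Decode data.token from a Secret (as JSON text) into an Authorization header."""
    secret = json.loads(secret_json)
    # Values under "data" in a Secret are base64 encoded.
    token = base64.b64decode(secret["data"]["token"]).decode("utf-8")
    return {"Authorization": f"Bearer {token}"}


# Stand-in for the output of `kubectl get secret ... -o json`; the token is fake.
EXAMPLE_SECRET = json.dumps(
    {
        "kind": "Secret",
        "data": {"token": base64.b64encode(b"not-a-real-token").decode("ascii")},
    }
)
print(bearer_header_from_secret(EXAMPLE_SECRET))
```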

The authenticator passes the following information to the authorizer

username: system:serviceaccount:(NAMESPACE):(SERVICEACCOUNT)
groups: system:serviceaccounts, system:serviceaccounts:(NAMESPACE)

which can prove useful (more on this in the authorization part of the doc)
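A small Python sketch of how that identity is derived from the namespace and the service account name. The function name is hypothetical and the "default" namespace in the example is an assumption.

```python
def service_account_identity(namespace, name):
    """Return the (username, groups) the authorizer sees for a service account."""
    username = f"system:serviceaccount:{namespace}:{name}"
    groups = ["system:serviceaccounts", f"system:serviceaccounts:{namespace}"]
    return username, groups


print(service_account_identity("default", "jenkins"))
```

Because the namespace is part of both the username and one of the groups, authorization rules can target a single account or every service account in a namespace.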

OpenID Connect Tokens

There is little incentive for now to use this variant of OAuth2. We don't really have any identity providers: mediawiki only does OAuth 1.0, not OAuth2, per https://en.wikipedia.org/wiki/Special:Version, and https://www.mediawiki.org/wiki/Extension:OpenID and https://www.mediawiki.org/wiki/Extension:OAuth2_Client are effectively consumers, not identity providers.

The idea however is to get the id_token from your identity_provider and pass it as in the above cases in an Authorization header

Webhook token authentication

This is practically the API server calling a remote service (e.g. https://authn.example.com/authenticate) and sending it a TokenReview HTTP POST request. The remote service says yes/no and adds some extra information like groups, uid and extra fields for the authorizers.

This is a valid approach for us, but we need to implement it from scratch. It will also be a disjoint thing from the kubernetes cluster that we will need to maintain.
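To give an idea of what such a service would have to do, here is a minimal Python sketch of the request/response handling only, with no HTTP server around it. The token database, user names and function name are hypothetical; the TokenReview shape (spec.token in, status.authenticated and status.user out) follows the Kubernetes API, and the apiVersion is simply echoed back from the request.

```python
import json

# Hypothetical token database; a real service would query an identity store.
KNOWN_TOKENS = {
    "secret-token": {"username": "user1", "uid": "1001", "groups": ["group1", "group2"]},
}


def review_token(request_body):
    """Answer a TokenReview request body (JSON text) with a TokenReview response."""
    review = json.loads(request_body)
    token = review.get("spec", {}).get("token", "")
    user = KNOWN_TOKENS.get(token)
    status = {"authenticated": user is not None}
    if user is not None:
        status["user"] = user  # username, uid and groups go to the authorizer
    return json.dumps(
        {"apiVersion": review.get("apiVersion"), "kind": "TokenReview", "status": status}
    )


REQUEST = json.dumps(
    {
        "apiVersion": "authentication.k8s.io/v1beta1",
        "kind": "TokenReview",
        "spec": {"token": "secret-token"},
    }
)
print(review_token(REQUEST))
```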

Authenticating proxy

Get an nginx doing a reverse proxy to the apiserver, set a couple of headers and bam, you are authenticated. Groups seem to work fine, but extra authorization info is problematic, as the headers for it would need to be set dynamically, which is not the best thing to do in an nginx config. Plus, if we end up writing some code to do it, we pretty much have a version of the above webhook solution.
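A rough Python sketch of the headers such a proxy would add. The header names X-Remote-User and X-Remote-Group are the ones used in the Kubernetes documentation examples; the actual names are whatever the apiserver's --requestheader-username-headers and --requestheader-group-headers flags are set to, so treat them as assumptions. The function name is hypothetical.

```python
def proxy_headers(username, groups):
    """Build the identity headers a trusted proxy would add to a request."""
    headers = [("X-Remote-User", username)]
    # One header per group; HTTP allows repeating a header name.
    headers.extend(("X-Remote-Group", group) for group in groups)
    return headers


for name, value in proxy_headers("user1", ["group1", "group2"]):
    print(f"{name}: {value}")
```

A fixed user and group list like this is easy to express in a static proxy config; per-request extra fields are the part that needs code.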

Keystone password

We don't have a keystone service in production, maybe labs could use it. It's however still experimental.

Anonymous

Disabled by default and for good reason

Port 8080

Just talk to it and bypass all authn/authz. Used by the controller manager and the scheduler. It is bound to localhost only and is blocked by ferm. We won't be using it in production for anything other than the controllers.

Authorization

Authenticated requests get passed along to authorizers. These are 5:

AlwaysDeny

Not much to talk about, it's just for testing

AlwaysAllow

Not much to talk about, pretty evident what it does

ABAC

ABAC is a rather simple authorization mode. You specify a policy file with ONE JSON object per line. A restart of the apiserver is required for changes to take effect. Format is

{"kind": "Policy", "apiVersion": "abac.authorization.kubernetes.io/v1beta1", "spec": {"user": "<user>", "group": "<group>", "apiGroup": "<apigroup>", "resource": "<resource>", "namespace": "<namespace>", "nonResourcePath": "<nonResourcepath>", "readonly":<true|false>}}

The rules can be conceptualized in the form of a standard sentence

subject verb object

The verb is an HTTP verb (GET, POST, etc). It defines the action the subject wants to take on the object

The verdict is always implicit and it's allow. No deny exists. In the absence of a rule allowing an action, it is denied.

The subjects (who) of the action are:

  • <user> is the kubernetes user. Mandatory
  • <group> matches any of the groups assigned to the above user. Optional

The objects (on what) are:

  • <apigroup> matches an API group, like say "extensions". A * matches everything. Aside from the "core" group, which is at /api/v1, the others are under /apis/<name>, so extensions and metrics.k8s.io are valid names. IMHO limiting this does not really add much usefulness, since one is probably going to be doing that at the resource level anyway.
  • <namespace> matches the namespace, pretty self explanatory
  • <resource> is the resource itself. We are mostly talking about pods/replicasets/deployments etc.
  • <nonResourcePath> is about things that are not resources, for example /version or /apis. Stars (*) are allowed
  • readonly is a flag for allowing only read-only operations (GET, WATCH, LIST)
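To make the matching concrete, here is a simplified Python sketch of how a request could be evaluated against such a policy file. It is an illustration under assumptions, not the apiserver's implementation: it ignores nonResourcePath, treats * as the only wildcard, and the user alice and namespace projectA are made up.

```python
import json

READONLY_VERBS = {"get", "list", "watch"}


def subject_matches(spec, user, groups):
    """Every subject field the policy sets must match; at least one must be set."""
    matched = False
    if spec.get("user"):
        if spec["user"] not in ("*", user):
            return False
        matched = True
    if spec.get("group"):
        if spec["group"] != "*" and spec["group"] not in groups:
            return False
        matched = True
    return matched


def object_matches(spec, request):
    """Each object field must be '*' or equal to the request's value."""
    return all(
        spec.get(field) in ("*", request[field])
        for field in ("apiGroup", "namespace", "resource")
    )


def authorize(policy_text, request):
    """Allow if any policy line matches; with no match the request is denied."""
    for line in policy_text.splitlines():
        if not line.strip():
            continue
        spec = json.loads(line)["spec"]  # one JSON object per line
        if spec.get("readonly", False) and request["verb"] not in READONLY_VERBS:
            continue
        if subject_matches(spec, request["user"], request["groups"]) and object_matches(
            spec, request
        ):
            return True
    return False


POLICY = json.dumps(
    {
        "kind": "Policy",
        "apiVersion": "abac.authorization.kubernetes.io/v1beta1",
        "spec": {
            "user": "alice",
            "apiGroup": "*",
            "namespace": "projectA",
            "resource": "pods",
            "readonly": True,
        },
    }
)
REQUEST = {
    "user": "alice",
    "groups": [],
    "verb": "get",
    "apiGroup": "",
    "namespace": "projectA",
    "resource": "pods",
}
print(authorize(POLICY, REQUEST))
```

Note there is no deny branch anywhere: a request is rejected only because no line allowed it.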

RBAC

In essence, RBAC is a more complicated version of ABAC. While it can be used in pretty much the same way as ABAC, it is more flexible and is stored in the cluster, meaning no restart of the apiserver is required for changes to take effect

The rules can be conceptualized in the form of a standard sentence

subject verb object

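The verb is the API request verb (get, list, watch, create, update, patch, delete), which the apiserver derives from the HTTP method of the request. It defines the action the subject wants to take on the object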

The verdict is always implicit and it's allow. No deny exists. In the absence of a rule allowing an action, it is denied.

The subjects (who) of the action are:

  • <user> is the kubernetes user. Either this or <group> needs to be specified.
  • <group> matches any of the groups assigned to the above user. Either this or <user> needs to be specified

The objects (on what) are:

  • apiGroups: The core group is "", the rest are the ones listed under /apis/
  • resources: The resource itself. We are mostly talking about pods/replicasets/deployments etc. Subresources are allowed and sometimes useful (e.g. "pods/log")
  • resourceNames: In case we want to limit the above to specific instances. Optional
  • nonResourceURLs: URLs we want to allow access to. Not namespaced. Useful for things like health checks. Optional

The above are codified via 2 main concepts:

  • Role
  • RoleBinding

and 2 ancillary ones

  • ClusterRole
  • ClusterRoleBinding

The one difference between the 2 pairs is that the Cluster* counterparts are not namespaced (i.e. the Role has a namespace attribute and it is honored during all related actions)

The Role/ClusterRole have a rules attribute which follows the pattern

rules:
    - apiGroups: [<apigroup>,<apigroup>,...]
      resources: [<resource1>,<resource2>,...]
      verbs: [<verb1>, <verb2>,...]
      nonResourceURLs: [<url1>, <url2>,...]

Then the rules are bound to users/groups via RoleBinding (namespaced) and ClusterRoleBinding (non namespaced, i.e. cluster wide). These entities have a roleRef and a subjects attribute that bind users/groups to roles

subjects:
    - kind: User
      name: "user1"
      apiGroup: rbac.authorization.k8s.io
    - kind: User
      name: "user2"
      apiGroup: rbac.authorization.k8s.io
roleRef:
    kind: Role
    name: myrole
    apiGroup: rbac.authorization.k8s.io
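How a Role and a RoleBinding combine into a decision can be sketched in Python as follows. This is a simplified model under assumptions, not the apiserver's code: it only handles namespaced Role/RoleBinding with User and Group subjects, and skips ClusterRole, resourceNames and nonResourceURLs. The "default" namespace, the binding name and the rule contents are made up.

```python
ROLE = {
    "kind": "Role",
    "metadata": {"namespace": "default", "name": "myrole"},
    "rules": [
        {"apiGroups": [""], "resources": ["pods", "pods/log"], "verbs": ["get", "list"]}
    ],
}
BINDING = {
    "kind": "RoleBinding",
    "metadata": {"namespace": "default", "name": "mybinding"},
    "subjects": [
        {"kind": "User", "name": "user1", "apiGroup": "rbac.authorization.k8s.io"},
        {"kind": "User", "name": "user2", "apiGroup": "rbac.authorization.k8s.io"},
    ],
    "roleRef": {"kind": "Role", "name": "myrole", "apiGroup": "rbac.authorization.k8s.io"},
}


def listed(values, wanted):
    """True if the list contains the wanted value or the '*' wildcard."""
    return "*" in values or wanted in values


def rule_allows(rule, api_group, resource, verb):
    return (
        listed(rule.get("apiGroups", []), api_group)
        and listed(rule.get("resources", []), resource)
        and listed(rule.get("verbs", []), verb)
    )


def subject_bound(binding, user, groups):
    """True if the binding names the user directly or one of the user's groups."""
    for subject in binding["subjects"]:
        if subject["kind"] == "User" and subject["name"] == user:
            return True
        if subject["kind"] == "Group" and subject["name"] in groups:
            return True
    return False


def authorize(roles, bindings, user, groups, namespace, api_group, resource, verb):
    """Allow if a binding in the namespace ties the subject to a role with a matching rule."""
    for binding in bindings:
        if binding["metadata"]["namespace"] != namespace:
            continue
        if not subject_bound(binding, user, groups):
            continue
        for role in roles:
            same_role = (
                role["metadata"]["name"] == binding["roleRef"]["name"]
                and role["metadata"]["namespace"] == namespace
            )
            if same_role and any(
                rule_allows(rule, api_group, resource, verb) for rule in role["rules"]
            ):
                return True
    return False  # no allow rule matched, so the request is denied


print(authorize([ROLE], [BINDING], "user1", [], "default", "", "pods", "get"))
```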

Node

An authorization mode very specific to nodes. It is only meant to be used to authorize kubelets. It matches the system:nodes group with a username of system:node:<nodename>. This requires the NodeRestriction admission controller to be enabled. Having to authenticate every node with a discrete username makes many authentication modes impractical. The one left that makes sense is X509 client certs, but in our environment this is not really doable, given we use puppet certificates and not certificates managed by kubernetes.
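The identity check described above can be sketched in Python like this. It only models the username/group convention, not the rest of the Node authorizer, and the function name and example node name are hypothetical.

```python
NODE_GROUP = "system:nodes"
NODE_PREFIX = "system:node:"


def node_name(username, groups):
    """Return the node name if the identity looks like a kubelet, else None."""
    if NODE_GROUP not in groups or not username.startswith(NODE_PREFIX):
        return None
    name = username[len(NODE_PREFIX):]
    return name or None  # an empty node name is not a valid identity


print(node_name("system:node:node1.example.com", ["system:nodes"]))
```

Since the node name is embedded in the username, every kubelet needs its own credential, which is what rules out shared tokens here.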

[1] http://blog.kubernetes.io/2017/04/rbac-support-in-kubernetes.html

Event Timeline

Change 386754 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] k8s::controller: support service account token signing

https://gerrit.wikimedia.org/r/386754

Change 386755 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] Enable k8s::controller manager ServiceAccount signing

https://gerrit.wikimedia.org/r/386755

Change 386754 merged by Alexandros Kosiaris:
[operations/puppet@production] k8s::controller: support service account token signing

https://gerrit.wikimedia.org/r/386754

Change 386755 merged by Alexandros Kosiaris:
[operations/puppet@production] Enable k8s::controller manager ServiceAccount signing

https://gerrit.wikimedia.org/r/386755

Change 388122 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] kubernetes: Enable RBAC in production

https://gerrit.wikimedia.org/r/388122

Change 388122 merged by Alexandros Kosiaris:
[operations/puppet@production] kubernetes: Enable RBAC in production

https://gerrit.wikimedia.org/r/388122

Mentioned in SAL (#wikimedia-operations) [2017-11-03T15:49:55Z] <akosiaris> T177393 enable RBAC for kubernetes in production and staging

akosiaris triaged this task as Medium priority. Nov 6 2017, 9:09 AM
akosiaris updated the task description.

Authn wise, the current consensus is to keep going with the tokenauth authentication method until we have figured out our use cases more clearly, and possibly migrate to the webhook authentication method in the future.

Authz wise, we are moving to RBAC

akosiaris claimed this task.

Change 391804 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] Add k8s::kubeconfig define

https://gerrit.wikimedia.org/r/391804

Change 391805 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] Add parameter for kubelet's kubeconfig

https://gerrit.wikimedia.org/r/391805

Change 391806 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] Add kubeconfig parameter to k8s::proxy

https://gerrit.wikimedia.org/r/391806

Change 391804 merged by Alexandros Kosiaris:
[operations/puppet@production] Add k8s::kubeconfig define

https://gerrit.wikimedia.org/r/391804

Change 391805 merged by Alexandros Kosiaris:
[operations/puppet@production] Add parameter for kubelet's kubeconfig

https://gerrit.wikimedia.org/r/391805

Change 391806 merged by Alexandros Kosiaris:
[operations/puppet@production] Add kubeconfig parameter to k8s::proxy

https://gerrit.wikimedia.org/r/391806