Figure out and document how to call the Kubernetes API as your tool user from inside a pod
Open, Needs Triage, Public

Description

Anomiebot's status page is a good use case for being able to find out what pods are running in a tool's namespace from inside of a pod in the namespace.

The info at https://kubernetes.io/docs/tasks/run-application/access-api-from-pod/ leads to the same problem documented at https://stackoverflow.com/questions/48311683/how-can-i-use-curl-to-access-the-kubernetes-api-from-within-a-pod. The issue is that the default serviceaccount credentials mounted into the pod do not have RBAC access to the API.

We have the ability to set up a special service account for any given tool which allows read-only access to all tenant namespaces. This is used by the k8s-status tool and documented at https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Maintenance#wmcs-k8s-enable-cluster-monitor. One downside of this method is that it requires using a custom Deployment rather than just webservice start to attach the credentials to the pod.
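
For reference, the kind of custom Deployment this requires might look roughly like the sketch below; the names, namespace, and image are placeholders, and the serviceAccountName in particular is hypothetical (the real account is whatever wmcs-k8s-enable-cluster-monitor created for the tool):

$ kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: status-page
  namespace: tool-example
spec:
  replicas: 1
  selector:
    matchLabels:
      app: status-page
  template:
    metadata:
      labels:
        app: status-page
    spec:
      # Placeholder name: the read-only account created by wmcs-k8s-enable-cluster-monitor
      serviceAccountName: cluster-monitor
      containers:
      - name: status-page
        # Placeholder image; use whatever image the tool normally runs
        image: docker-registry.tools.wmflabs.org/toolforge-python39-sssd-web:latest
        command: ["python3", "-m", "http.server", "8000"]
EOF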

https://kubernetes.io/docs/reference/access-authn-authz/rbac/#service-account-permissions explains various ways that the default service account for a tool could be changed so that it can access the API.
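
As one illustration of those options, a namespace-scoped Role plus RoleBinding could grant just pod listing to the default service account; a rough sketch, with placeholder namespace and object names:

$ kubectl apply -f - <<'EOF'
# Role: allows reading pods within the tool's own namespace only
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: tool-example
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
# RoleBinding: attaches that Role to the namespace's default service account
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: default-pod-reader
  namespace: tool-example
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: pod-reader
subjects:
- kind: ServiceAccount
  name: default
  namespace: tool-example
EOF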

Also, because we mount $HOME into the pod, it should be possible to use the tool's x509 certificate credentials from $HOME/.toolskube to get an auth token.

  • Document how to use the credentials from $HOME/.toolskube
  • Document how an admin could grant read-only API access to the default service account for a tool
  • Document how to request that your tool's default service account be granted read-only API access

Event Timeline

> Document how an admin could grant read-only API access to the default service account for a tool
> Document how to request that your tool's default service account be granted read-only API access

Is any of that actually needed? I have read+write access to my tool's namespace from within a pod, though no one has granted any special access AFAIK.

> I have read+write access to my tool's namespace from within a pod, though no one has granted any special access AFAIK.

Are you doing that by authenticating using your tool's credentials from $HOME/.toolskube?

The steps about default service accounts would be designed to make it easier to get read-only access by using the service account's token that is automatically mounted at /var/run/secrets/kubernetes.io/serviceaccount/token in each Container.

Accessing the Kubernetes API from inside of a container using the default service account credentials:

$ APISERVER=https://kubernetes.default.svc
$ SERVICEACCOUNT=/var/run/secrets/kubernetes.io/serviceaccount
$ CA_CERT=${SERVICEACCOUNT}/ca.crt
$ TOKEN=$(cat ${SERVICEACCOUNT}/token)
$ NAMESPACE=$(cat ${SERVICEACCOUNT}/namespace)
$ curl --cacert $CA_CERT -H "Authorization: Bearer $TOKEN" "${APISERVER}"
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {

  },
  "status": "Failure",
  "message": "forbidden: User \"system:serviceaccount:tool-bd808-test:default\" cannot get path \"/\"",
  "reason": "Forbidden",
  "details": {

  },
  "code": 403
}

Because the tool's $HOME is mounted inside of our container, we can authenticate using the x509 certificate from $HOME/.toolskube.

$ APISERVER=https://kubernetes.default.svc
$ SERVICEACCOUNT=/var/run/secrets/kubernetes.io/serviceaccount
$ CA_CERT=${SERVICEACCOUNT}/ca.crt
$ NAMESPACE=$(cat ${SERVICEACCOUNT}/namespace)
$ CERT=$HOME/.toolskube/client.crt
$ KEY=$HOME/.toolskube/client.key
$ curl --silent --cacert $CA_CERT --key $KEY --cert $CERT "${APISERVER}/api/v1/namespaces/$NAMESPACE/pods/" |
  jq -r ".items[] | [.metadata.name, .status.phase] | @tsv"
bd808-test-77b666f66f-z87pw     Running
shell-1668200251        Running
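
The same listing should also work with kubectl instead of curl, assuming the container image ships kubectl and the tool's $HOME/.kube/config references that same certificate pair:

$ # List pod names and phases using the tool's own kubeconfig
$ kubectl get pods -o custom-columns=NAME:.metadata.name,PHASE:.status.phase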

A Toolforge admin can grant "view" rights to the default service account for a given tool:

$ kubectl sudo create rolebinding default-view \
      --clusterrole=view \
      --serviceaccount=tool-bd808-test:default \
      --namespace=tool-bd808-test
rolebinding.rbac.authorization.k8s.io/default-view created

With this rolebinding in place, the default service account can query for running pods:

$ APISERVER=https://kubernetes.default.svc
$ SERVICEACCOUNT=/var/run/secrets/kubernetes.io/serviceaccount
$ CA_CERT=${SERVICEACCOUNT}/ca.crt
$ TOKEN=$(cat ${SERVICEACCOUNT}/token)
$ NAMESPACE=$(cat ${SERVICEACCOUNT}/namespace)
$ curl --silent --cacert $CA_CERT -H "Authorization: Bearer $TOKEN" "${APISERVER}/api/v1/namespaces/$NAMESPACE/pods/" |
  jq -r ".items[] | [.metadata.name, .status.phase] | @tsv"
bd808-test-77b666f66f-z87pw     Running
shell-1668205058        Running

taavi@tools-sgebastion-11:~ $ k sudo desc clusterrole view
Name:         view
Labels:       kubernetes.io/bootstrapping=rbac-defaults
              rbac.authorization.k8s.io/aggregate-to-edit=true
Annotations:  rbac.authorization.kubernetes.io/autoupdate: true
PolicyRule:
  Resources                                    Non-Resource URLs  Resource Names  Verbs
  ---------                                    -----------------  --------------  -----
  bindings                                     []                 []              [get list watch]
  configmaps                                   []                 []              [get list watch]
  endpoints                                    []                 []              [get list watch]
  events                                       []                 []              [get list watch]
  limitranges                                  []                 []              [get list watch]
  namespaces/status                            []                 []              [get list watch]
  namespaces                                   []                 []              [get list watch]
  persistentvolumeclaims/status                []                 []              [get list watch]
  persistentvolumeclaims                       []                 []              [get list watch]
  pods/log                                     []                 []              [get list watch]
  pods/status                                  []                 []              [get list watch]
  pods                                         []                 []              [get list watch]
  replicationcontrollers/scale                 []                 []              [get list watch]
  replicationcontrollers/status                []                 []              [get list watch]
  replicationcontrollers                       []                 []              [get list watch]
  resourcequotas/status                        []                 []              [get list watch]
  resourcequotas                               []                 []              [get list watch]
  serviceaccounts                              []                 []              [get list watch]
  services/status                              []                 []              [get list watch]
  services                                     []                 []              [get list watch]
  controllerrevisions.apps                     []                 []              [get list watch]
  daemonsets.apps/status                       []                 []              [get list watch]
  daemonsets.apps                              []                 []              [get list watch]
  deployments.apps/scale                       []                 []              [get list watch]
  deployments.apps/status                      []                 []              [get list watch]
  deployments.apps                             []                 []              [get list watch]
  replicasets.apps/scale                       []                 []              [get list watch]
  replicasets.apps/status                      []                 []              [get list watch]
  replicasets.apps                             []                 []              [get list watch]
  statefulsets.apps/scale                      []                 []              [get list watch]
  statefulsets.apps/status                     []                 []              [get list watch]
  statefulsets.apps                            []                 []              [get list watch]
  horizontalpodautoscalers.autoscaling/status  []                 []              [get list watch]
  horizontalpodautoscalers.autoscaling         []                 []              [get list watch]
  cronjobs.batch/status                        []                 []              [get list watch]
  cronjobs.batch                               []                 []              [get list watch]
  jobs.batch/status                            []                 []              [get list watch]
  jobs.batch                                   []                 []              [get list watch]
  daemonsets.extensions/status                 []                 []              [get list watch]
  daemonsets.extensions                        []                 []              [get list watch]
  deployments.extensions/scale                 []                 []              [get list watch]
  deployments.extensions/status                []                 []              [get list watch]
  deployments.extensions                       []                 []              [get list watch]
  ingresses.extensions/status                  []                 []              [get list watch]
  ingresses.extensions                         []                 []              [get list watch]
  networkpolicies.extensions                   []                 []              [get list watch]
  replicasets.extensions/scale                 []                 []              [get list watch]
  replicasets.extensions/status                []                 []              [get list watch]
  replicasets.extensions                       []                 []              [get list watch]
  replicationcontrollers.extensions/scale      []                 []              [get list watch]
  nodes.metrics.k8s.io                         []                 []              [get list watch]
  pods.metrics.k8s.io                          []                 []              [get list watch]
  ingresses.networking.k8s.io/status           []                 []              [get list watch]
  ingresses.networking.k8s.io                  []                 []              [get list watch]
  networkpolicies.networking.k8s.io            []                 []              [get list watch]
  poddisruptionbudgets.policy/status           []                 []              [get list watch]
  poddisruptionbudgets.policy                  []                 []              [get list watch]

I don't see anything particularly sensitive on that list, except maybe pods/log. Most notably it doesn't grant access to secrets.

I think it should be OK to grant tool accounts this access to their own namespaces, but I'd like to have it managed via something that's kept in version control instead of adding ad hoc objects to the Kubernetes cluster.

> I think it should be OK to grant tool accounts this access to their own namespaces, but I'd like to have it managed via something that's kept in version control instead of adding ad hoc objects to the Kubernetes cluster.

This sounds like a reasonable idea, but I'm not aware of any tool account namespace objects that are currently handled this way. Today there are durable things in each tool's namespace like the quotas object that are created by maintain-kubeusers, and transient things created by webservice. This would be more like a maintain-kubeusers-managed thing than a webservice-managed thing.

The thing to store/track is a relatively simple RoleBinding object that looks something like:

$ kubectl sudo get rolebinding/default-view -n tool-bd808-test -o yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  creationTimestamp: "2022-11-11T22:16:16Z"
  name: default-view
  namespace: tool-bd808-test
  resourceVersion: "929429243"
  uid: 610b4cb1-1001-42ec-9ebc-363e412eb8f6
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: view
subjects:
- kind: ServiceAccount
  name: default
  namespace: tool-bd808-test
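
A version-controlled workflow for this could be as simple as keeping one such manifest per opted-in tool in a git repository and applying it; a hypothetical sketch (the repository layout and file name are made up):

$ # Hypothetical layout: one RoleBinding manifest per opted-in tool, tracked in git
$ kubectl sudo apply -f rolebindings/tool-bd808-test.yaml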

Note that there's no stability or availability assurance for any of the k8s APIs (raw k8s APIs). I understand they are way more powerful than the APIs/abstractions that we do maintain on top of them, but we can't offer any kind of assurance that your tools will not break, stop working or misbehave at any point (essentially, let there be dragons).

Provide something better that fits the requirements and I'll look at using it. Last I've heard there's nothing else at all available.

> Provide something better that fits the requirements and I'll look at using it. Last I've heard there's nothing else at all available.

tl;dr: we are working on it :)

For all your Toolforge-managed jobs and builds you can use those specific APIs directly (only direct calls for now, no CLIs supported yet; see T356377: [toolforge] simplify calling the different toolforge apis from within the containers). For webservices we don't yet have an API, though we are working on it (e.g. T352857: Toolforge next user stories - 2024 version; we will hopefully have more tasks created today).

For raw k8s stats/status, you can also use:
https://grafana-rw.wmcloud.org/d/TJuKfnt4z/kubernetes-namespace?orgId=1&var-cluster=prometheus-tools&var-namespace=tool-anomiebot

Note that I'm not saying you can't do it, or even that you should not do it, just making sure that you are aware of the tradeoffs you are making.