Toolforge k8s: kube-scheduler permissions errors
Closed, Resolved (Public)

Description

We discovered permission errors in the kube-scheduler logs on tools-k8s-control-4 today:

aborrero@tools-k8s-control-4:~$ sudo -i kubectl -n kube-system logs kube-scheduler-tools-k8s-control-4 --timestamps=true | grep E0410
2023-04-10T09:37:05.835937370Z E0410 09:37:05.835810       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.ReplicationController: failed to list *v1.ReplicationController: replicationcontrollers is forbidden: User "system:kube-scheduler" cannot list resource "replicationcontrollers" in API group "" at the cluster scope
2023-04-10T09:37:05.835979709Z E0410 09:37:05.835942       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.StorageClass: failed to list *v1.StorageClass: storageclasses.storage.k8s.io is forbidden: User "system:kube-scheduler" cannot list resource "storageclasses" in API group "storage.k8s.io" at the cluster scope
2023-04-10T09:37:05.836052589Z E0410 09:37:05.836010       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Service: failed to list *v1.Service: services is forbidden: User "system:kube-scheduler" cannot list resource "services" in API group "" at the cluster scope
2023-04-10T09:37:05.836129263Z E0410 09:37:05.836089       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.PersistentVolumeClaim: failed to list *v1.PersistentVolumeClaim: persistentvolumeclaims is forbidden: User "system:kube-scheduler" cannot list resource "persistentvolumeclaims" in API group "" at the cluster scope
2023-04-10T09:37:05.836182127Z E0410 09:37:05.836148       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Node: failed to list *v1.Node: nodes is forbidden: User "system:kube-scheduler" cannot list resource "nodes" in API group "" at the cluster scope
2023-04-10T09:37:05.836247538Z E0410 09:37:05.836210       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.CSIDriver: failed to list *v1.CSIDriver: csidrivers.storage.k8s.io is forbidden: User "system:kube-scheduler" cannot list resource "csidrivers" in API group "storage.k8s.io" at the cluster scope
2023-04-10T09:37:05.836286178Z E0410 09:37:05.836259       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.CSINode: failed to list *v1.CSINode: csinodes.storage.k8s.io is forbidden: User "system:kube-scheduler" cannot list resource "csinodes" in API group "storage.k8s.io" at the cluster scope
2023-04-10T09:37:05.836360037Z E0410 09:37:05.836306       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1beta1.CSIStorageCapacity: failed to list *v1beta1.CSIStorageCapacity: csistoragecapacities.storage.k8s.io is forbidden: User "system:kube-scheduler" cannot list resource "csistoragecapacities" in API group "storage.k8s.io" at the cluster scope
2023-04-10T09:37:05.836374875Z E0410 09:37:05.836354       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User "system:kube-scheduler" cannot list resource "pods" in API group "" at the cluster scope
2023-04-10T09:37:05.836423696Z E0410 09:37:05.836396       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Namespace: failed to list *v1.Namespace: namespaces is forbidden: User "system:kube-scheduler" cannot list resource "namespaces" in API group "" at the cluster scope
2023-04-10T09:37:05.836477578Z E0410 09:37:05.836445       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.ReplicaSet: failed to list *v1.ReplicaSet: replicasets.apps is forbidden: User "system:kube-scheduler" cannot list resource "replicasets" in API group "apps" at the cluster scope
2023-04-10T09:37:05.836529426Z E0410 09:37:05.836492       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.PersistentVolume: failed to list *v1.PersistentVolume: persistentvolumes is forbidden: User "system:kube-scheduler" cannot list resource "persistentvolumes" in API group "" at the cluster scope
2023-04-10T09:37:05.836578795Z E0410 09:37:05.836536       1 reflector.go:138] k8s.io/apiserver/pkg/server/dynamiccertificates/configmap_cafile_content.go:205: Failed to watch *v1.ConfigMap: failed to list *v1.ConfigMap: configmaps "extension-apiserver-authentication" is forbidden: User "system:kube-scheduler" cannot list resource "configmaps" in API group "" in the namespace "kube-system"
2023-04-10T09:37:05.837883940Z E0410 09:37:05.837773       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.PodDisruptionBudget: failed to list *v1.PodDisruptionBudget: poddisruptionbudgets.policy is forbidden: User "system:kube-scheduler" cannot list resource "poddisruptionbudgets" in API group "policy" at the cluster scope
2023-04-10T09:37:05.838070972Z E0410 09:37:05.838012       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.StatefulSet: failed to list *v1.StatefulSet: statefulsets.apps is forbidden: User "system:kube-scheduler" cannot list resource "statefulsets" in API group "apps" at the cluster scope
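
The excerpt above is easier to read once it is reduced to the denied verb, resource, API group and scope. A minimal sketch of such a filter, assuming the log lines keep the `cannot VERB resource "R" in API group "G" ...` wording shown above; the function name `summarize_forbidden` is made up for this example:

```shell
#!/bin/sh
# Reduce kube-scheduler "forbidden" log lines (read from stdin) to
# "VERB RESOURCE [API-GROUP] SCOPE", one unique entry per line.
# An empty pair of brackets means the core API group.
summarize_forbidden() {
    sed -n 's/.*cannot \([a-z]*\) resource "\([^"]*\)" in API group "\([^"]*\)" \(.*\)$/\1 \2 [\3] \4/p' |
        sort -u
}

# Example (needs access to the cluster, so it is left commented out):
# sudo -i kubectl -n kube-system logs kube-scheduler-tools-k8s-control-4 | summarize_forbidden
```

Lines that do not match the pattern (informational messages, for instance) are dropped, so the full startup log can be piped through it unchanged.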

Startup logs:

aborrero@tools-k8s-control-4:~$ sudo -i kubectl -n kube-system logs -f kube-scheduler-tools-k8s-control-4 --timestamps=true
2023-04-10T09:37:02.645286071Z I0410 09:37:02.645109       1 serving.go:347] Generated self-signed cert in-memory
2023-04-10T09:37:05.738364890Z W0410 09:37:05.738251       1 requestheader_controller.go:193] Unable to get configmap/extension-apiserver-authentication in kube-system.  Usually fixed by 'kubectl create rolebinding -n kube-system ROLEBINDING_NAME --role=extension-apiserver-authentication-reader --serviceaccount=YOUR_NS:YOUR_SA'
2023-04-10T09:37:05.738519585Z W0410 09:37:05.738431       1 authentication.go:345] Error looking up in-cluster authentication configuration: configmaps "extension-apiserver-authentication" is forbidden: User "system:kube-scheduler" cannot get resource "configmaps" in API group "" in the namespace "kube-system"
2023-04-10T09:37:05.738533704Z W0410 09:37:05.738460       1 authentication.go:346] Continuing without authentication configuration. This may treat all requests as anonymous.
2023-04-10T09:37:05.738667778Z W0410 09:37:05.738575       1 authentication.go:347] To require authentication configuration lookup to succeed, set --authentication-tolerate-lookup-failure=false
2023-04-10T09:37:05.811666157Z I0410 09:37:05.811537       1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
2023-04-10T09:37:05.811695239Z I0410 09:37:05.811582       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
2023-04-10T09:37:05.812141633Z I0410 09:37:05.812071       1 secure_serving.go:200] Serving securely on 127.0.0.1:10259
2023-04-10T09:37:05.816692024Z I0410 09:37:05.816587       1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
2023-04-10T09:37:05.835937370Z E0410 09:37:05.835810       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.ReplicationController: failed to list *v1.ReplicationController: replicationcontrollers is forbidden: User "system:kube-scheduler" cannot list resource "replicationcontrollers" in API group "" at the cluster scope
2023-04-10T09:37:05.835979709Z E0410 09:37:05.835942       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.StorageClass: failed to list *v1.StorageClass: storageclasses.storage.k8s.io is forbidden: User "system:kube-scheduler" cannot list resource "storageclasses" in API group "storage.k8s.io" at the cluster scope
2023-04-10T09:37:05.836052589Z E0410 09:37:05.836010       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Service: failed to list *v1.Service: services is forbidden: User "system:kube-scheduler" cannot list resource "services" in API group "" at the cluster scope
2023-04-10T09:37:05.836129263Z E0410 09:37:05.836089       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.PersistentVolumeClaim: failed to list *v1.PersistentVolumeClaim: persistentvolumeclaims is forbidden: User "system:kube-scheduler" cannot list resource "persistentvolumeclaims" in API group "" at the cluster scope
2023-04-10T09:37:05.836182127Z E0410 09:37:05.836148       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Node: failed to list *v1.Node: nodes is forbidden: User "system:kube-scheduler" cannot list resource "nodes" in API group "" at the cluster scope
2023-04-10T09:37:05.836247538Z E0410 09:37:05.836210       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.CSIDriver: failed to list *v1.CSIDriver: csidrivers.storage.k8s.io is forbidden: User "system:kube-scheduler" cannot list resource "csidrivers" in API group "storage.k8s.io" at the cluster scope
2023-04-10T09:37:05.836286178Z E0410 09:37:05.836259       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.CSINode: failed to list *v1.CSINode: csinodes.storage.k8s.io is forbidden: User "system:kube-scheduler" cannot list resource "csinodes" in API group "storage.k8s.io" at the cluster scope
2023-04-10T09:37:05.836360037Z E0410 09:37:05.836306       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1beta1.CSIStorageCapacity: failed to list *v1beta1.CSIStorageCapacity: csistoragecapacities.storage.k8s.io is forbidden: User "system:kube-scheduler" cannot list resource "csistoragecapacities" in API group "storage.k8s.io" at the cluster scope
2023-04-10T09:37:05.836374875Z E0410 09:37:05.836354       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User "system:kube-scheduler" cannot list resource "pods" in API group "" at the cluster scope
2023-04-10T09:37:05.836423696Z E0410 09:37:05.836396       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Namespace: failed to list *v1.Namespace: namespaces is forbidden: User "system:kube-scheduler" cannot list resource "namespaces" in API group "" at the cluster scope
2023-04-10T09:37:05.836477578Z E0410 09:37:05.836445       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.ReplicaSet: failed to list *v1.ReplicaSet: replicasets.apps is forbidden: User "system:kube-scheduler" cannot list resource "replicasets" in API group "apps" at the cluster scope
2023-04-10T09:37:05.836529426Z E0410 09:37:05.836492       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.PersistentVolume: failed to list *v1.PersistentVolume: persistentvolumes is forbidden: User "system:kube-scheduler" cannot list resource "persistentvolumes" in API group "" at the cluster scope
2023-04-10T09:37:05.836578795Z E0410 09:37:05.836536       1 reflector.go:138] k8s.io/apiserver/pkg/server/dynamiccertificates/configmap_cafile_content.go:205: Failed to watch *v1.ConfigMap: failed to list *v1.ConfigMap: configmaps "extension-apiserver-authentication" is forbidden: User "system:kube-scheduler" cannot list resource "configmaps" in API group "" in the namespace "kube-system"
2023-04-10T09:37:05.837883940Z E0410 09:37:05.837773       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.PodDisruptionBudget: failed to list *v1.PodDisruptionBudget: poddisruptionbudgets.policy is forbidden: User "system:kube-scheduler" cannot list resource "poddisruptionbudgets" in API group "policy" at the cluster scope
2023-04-10T09:37:05.838070972Z E0410 09:37:05.838012       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.StatefulSet: failed to list *v1.StatefulSet: statefulsets.apps is forbidden: User "system:kube-scheduler" cannot list resource "statefulsets" in API group "apps" at the cluster scope
2023-04-10T09:37:07.514814687Z I0410 09:37:07.512505       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file 
2023-04-10T09:37:07.514862207Z I0410 09:37:07.513025       1 leaderelection.go:248] attempting to acquire leader lease kube-system/kube-scheduler...

In particular, this seems interesting:

W0410 09:37:05.738251       1 requestheader_controller.go:193] Unable to get configmap/extension-apiserver-authentication in kube-system.  Usually fixed by 'kubectl create rolebinding -n kube-system ROLEBINDING_NAME --role=extension-apiserver-authentication-reader --serviceaccount=YOUR_NS:YOUR_SA'

Event Timeline

Toolsbeta doesn't show this problem:

aborrero@toolsbeta-test-k8s-control-4:~$ sudo -i kubectl -n kube-system logs kube-scheduler-toolsbeta-test-k8s-control-5 --timestamps=true | grep extension-apiserver-authentication
2023-04-03T11:14:40.221816218Z I0403 11:14:40.192405       1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
2023-04-03T11:14:40.221821451Z I0403 11:14:40.192435       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
2023-04-03T11:14:40.221825485Z I0403 11:14:40.192776       1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
2023-04-03T11:14:40.221829583Z I0403 11:14:40.192789       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
2023-04-03T11:14:40.304545893Z I0403 11:14:40.304483       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file 
2023-04-03T11:14:40.305737849Z I0403 11:14:40.305687       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file 
aborrero@toolsbeta-test-k8s-control-4:~$ sudo -i kubectl -n kube-system logs kube-scheduler-toolsbeta-test-k8s-control-6 --timestamps=true | grep extension-apiserver-authentication
2023-04-03T11:13:37.191627359Z I0403 11:13:37.191401       1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
2023-04-03T11:13:37.191636788Z I0403 11:13:37.191418       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
2023-04-03T11:13:37.191645358Z I0403 11:13:37.191464       1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
2023-04-03T11:13:37.191654473Z I0403 11:13:37.191481       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
2023-04-03T11:13:37.291974679Z I0403 11:13:37.291826       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file 
2023-04-03T11:13:37.292346194Z I0403 11:13:37.292205       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file 
aborrero@toolsbeta-test-k8s-control-4:~$ sudo -i kubectl -n kube-system logs kube-scheduler-toolsbeta-test-k8s-control-4 --timestamps=true | grep extension-apiserver-authentication
2023-04-03T11:14:39.900518285Z I0403 11:14:39.900320       1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
2023-04-03T11:14:39.900568444Z I0403 11:14:39.900385       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
2023-04-03T11:14:39.900924818Z I0403 11:14:39.900725       1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
2023-04-03T11:14:39.900952404Z I0403 11:14:39.900783       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
2023-04-03T11:14:40.000872581Z I0403 11:14:40.000680       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file 
2023-04-03T11:14:40.010344146Z I0403 11:14:40.002628       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file 

The RBAC rules are correct:

root@tools-k8s-control-4:~# KUBECONFIG=/etc/kubernetes/scheduler.conf kubectl auth can-i get configmap/extension-apiserver-authentication -n kube-system
yes
root@tools-k8s-control-4:~# KUBECONFIG=/etc/kubernetes/scheduler.conf kubectl get configmap/extension-apiserver-authentication -n kube-system
NAME                                 DATA   AGE
extension-apiserver-authentication   6      3y155d
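
The check above covers only the configmap. The same question can be asked for every cluster-scope resource named in the errors. This is a sketch, not something that was run in this task: the function name and the `KUBECTL` override are invented for illustration, the resource list is copied from the log excerpt, and `--all-namespaces` is assumed to be the right way to ask `kubectl auth can-i` about a list across the whole cluster.

```shell
#!/bin/sh
# Ask the API server whether the identity in the scheduler kubeconfig may
# list each resource from the "forbidden" errors, across all namespaces.
# $1: kubeconfig path (default /etc/kubernetes/scheduler.conf, as used above).
# KUBECTL can be set to another command name, e.g. for a dry run.
check_scheduler_list_access() {
    kubeconfig=${1:-/etc/kubernetes/scheduler.conf}
    for resource in replicationcontrollers services persistentvolumeclaims \
        nodes pods namespaces persistentvolumes \
        storageclasses.storage.k8s.io csidrivers.storage.k8s.io \
        csinodes.storage.k8s.io csistoragecapacities.storage.k8s.io \
        replicasets.apps statefulsets.apps poddisruptionbudgets.policy; do
        # can-i exits non-zero on "no"; keep going and report the answer.
        answer=$(KUBECONFIG=$kubeconfig ${KUBECTL:-kubectl} auth can-i list "$resource" --all-namespaces 2>/dev/null) || true
        printf '%-45s %s\n' "$resource" "${answer:-error}"
    done
}

# Example (run as root on a control plane node):
# check_scheduler_list_access
```

If every line says `yes` while the running scheduler still logs `forbidden`, the RBAC objects themselves are unlikely to be the problem.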

Also, the system:kube-scheduler ClusterRole is exactly the same on tools and toolsbeta; only metadata differs:

aborrero@toolsbeta-test-k8s-control-4:~$ sudo -i kubectl get clusterrole system:kube-scheduler -o yaml > toolsbeta.yaml
aborrero@toolsbeta-test-k8s-control-4:~$ diff --color tools.yaml toolsbeta.yaml 
6c6
<   creationTimestamp: "2019-11-06T14:14:13Z"
---
>   creationTimestamp: "2019-10-23T09:55:16Z"
10,11c10,11
<   resourceVersion: "686879986"
<   uid: 96927118-ec26-4985-a6c5-931b11a902ea
---
>   resourceVersion: "320437235"
>   uid: 91985422-6f10-462a-a0d6-592660baf8ab
236d235
<

taavi claimed this task.

Fixed by manually restarting the kube-scheduler pod. Since these are static pods managed by the kubelet, restarting one takes a few more tricks than usual:

taavi@tools-k8s-control-4:~$ ps -ax --forest | grep kube-scheduler
17734 pts/1    S+     0:00              \_ grep --color=auto kube-scheduler
 2140 ?        Ssl    0:12  \_ kube-scheduler --authentication-kubeconfig=/etc/kubernetes/scheduler.conf --authorization-kubeconfig=/etc/kubernetes/scheduler.conf --bind-address=127.0.0.1 --kubeconfig=/etc/kubernetes/scheduler.conf --leader-elect=true --port=0
taavi@tools-k8s-control-4:~$ sudo kill 2140
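
Killing the process, as done above, relies on the kubelet restarting the container. A different approach that is sometimes used for static pods is to move the manifest out of the kubelet's static pod directory and back again. The sketch below is not what was run in this task and rests on assumptions: the directory `/etc/kubernetes/manifests` and the file name `kube-scheduler.yaml` are the kubeadm defaults and may not match this cluster, the 30 second pause is arbitrary, and `restart_static_pod` is a made-up helper name.

```shell
#!/bin/sh
# Restart a static pod by moving its manifest out of the kubelet's static
# pod directory, waiting, and moving it back.
# $1: manifest name without ".yaml", e.g. kube-scheduler
# $2: static pod directory (assumed kubeadm default: /etc/kubernetes/manifests)
# $3: seconds to wait before restoring the manifest (arbitrary default: 30)
restart_static_pod() {
    name=$1
    manifest_dir=${2:-/etc/kubernetes/manifests}
    wait_seconds=${3:-30}
    manifest=$manifest_dir/$name.yaml

    [ -f "$manifest" ] || { echo "no manifest at $manifest" >&2; return 1; }

    # Park the manifest outside the watched directory so the kubelet
    # sees it disappear and stops the pod.
    holding=$(mktemp -d) || return 1
    mv "$manifest" "$holding/" || return 1
    sleep "$wait_seconds"

    # Restore it; the kubelet should then start the pod again.
    mv "$holding/$name.yaml" "$manifest" && rmdir "$holding"
}

# Example (run as root on the control plane node):
# restart_static_pod kube-scheduler
```

Either way, checking the fresh pod's logs for `forbidden` lines afterwards confirms whether the restart helped.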