During Docker stress testing we had multiple incidents in which Kubernetes nodes (the Ganeti ones) failed completely because calico-node was evicted from them. T289111 describes the aftermath of exactly that situation (which went undetected mainly because eqiad is depooled).
Enabling the admission controller is the easy part, but we will also want to restrict the Kubernetes default priority classes `system-cluster-critical` and `system-node-critical` so that they can only be used by Pods in namespaces we "trust" (like `kube-system` for the services clusters; ml may need additional namespaces for istio, kf*).
This can be done by providing an AdmissionConfiguration via the kube-apiserver flag `--admission-control-config-file`, like:
```lang=yaml
apiVersion: apiserver.k8s.io/v1alpha1
kind: AdmissionConfiguration
plugins:
- name: "ResourceQuota"
  configuration:
    apiVersion: resourcequota.admission.k8s.io/v1beta1
    kind: Configuration
    limitedResources:
    - resource: pods
      matchScopes:
      - scopeName: PriorityClass
        operator: In
        values:
        - system-cluster-critical
        - system-node-critical
```
Namespaces are then explicitly granted permission to use those classes by adding a ResourceQuota object:
```lang=yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: priorityclass
  namespace: kube-system
spec:
  scopeSelector:
    matchExpressions:
    - operator: In
      scopeName: PriorityClass
      values:
      - system-cluster-critical
      - system-node-critical
```
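To illustrate the intended effect, here is a minimal sketch of a Pod that requests one of the restricted classes. The Pod name and the image are hypothetical placeholders, not something we actually run. With the configuration above in place, a Pod like this should be admitted in `kube-system` (which has a matching ResourceQuota) and rejected in any namespace that has no such quota object.
```lang=yaml
apiVersion: v1
kind: Pod
metadata:
  # Hypothetical name, for illustration only
  name: example-critical-pod
  # Admitted here because kube-system has a ResourceQuota matching the
  # PriorityClass scope; expected to be rejected in namespaces without one
  namespace: kube-system
spec:
  # One of the two classes limited by the AdmissionConfiguration
  priorityClassName: system-node-critical
  containers:
  - name: example
    # Hypothetical image reference
    image: registry.example/example:latest
```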
Relevant reads:
* https://people.wikimedia.org/~jayme/k8s-docs/v1.16/docs/reference/access-authn-authz/admission-controllers/#priority
* https://people.wikimedia.org/~jayme/k8s-docs/v1.16/docs/concepts/configuration/pod-priority-preemption/
* https://people.wikimedia.org/~jayme/k8s-docs/v1.16/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/
* https://people.wikimedia.org/~jayme/k8s-docs/v1.16/docs/concepts/policy/resource-quotas/#limit-priority-class-consumption-by-default