
PodSecurityPolicies will be deprecated with Kubernetes 1.21
Open, High, Public

Description

Pod Security Policies (PSPs) will begin the process of deprecation starting with Kubernetes 1.21, with the intention to fully remove them in a future release. ...
Full blog post draft here.
GitHub PR: https://github.com/kubernetes/kubernetes/pull/97171

While we started implementing PSPs in T228967, they never fully made it to our clusters (as of k8s <1.16).
With the Kubernetes 1.16 upgrades we want to implement the recommended restrictions as far as possible without too much effort (effort we might have to re-spend once the deprecation lands). Although alternative options exist today, we still have some time; it can be assumed that those options will evolve in the near future, and we can migrate off of PSPs at a later point.

Things we currently enforce via PSPs (which a replacement needs to provide as well):

  • Prohibit running privileged containers, hostIPC, hostPID, hostNetwork
  • Ensure containers run as non-root
  • Restrict the use of volumes (only specific volume plugins, only specific host paths)
  • Prohibit containers with fsGroup/supplementalGroup of 0
  • Ensure capabilities are dropped
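For illustration, a single hypothetical PSP expressing all of the above could look roughly like this (policy/v1beta1 field names; names and values are made up, the actual policies live in helmfile.d/admin_ng/helmfile_psp.yaml):

```yaml
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted-example      # hypothetical name, for illustration only
spec:
  privileged: false             # no privileged containers
  hostIPC: false
  hostPID: false
  hostNetwork: false
  runAsUser:
    rule: MustRunAsNonRoot      # containers must run as non-root
  seLinux:
    rule: RunAsAny
  fsGroup:
    rule: MustRunAs
    ranges: [{min: 1, max: 65535}]   # prohibits fsGroup 0
  supplementalGroups:
    rule: MustRunAs
    ranges: [{min: 1, max: 65535}]   # prohibits supplementalGroup 0
  requiredDropCapabilities: [ALL]    # capabilities must be dropped
  volumes: [configMap, secret, emptyDir, projected, downwardAPI, hostPath]
  allowedHostPaths:
  - pathPrefix: /example/path        # example only; restricts host paths
```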

Apart from the "privileged" policy, which effectively allows everything (required for things like calico, for example), we only have two PSPs in wikikube. For details about them, see helmfile.d/admin_ng/helmfile_psp.yaml.

WMCS tasks about it:

Research about the way forward / alternatives

https://wikitech.wikimedia.org/wiki/User:JMeybohm/PSP_Replacement

Preparation for migrating away from PSPs

Run validation against the restricted PSS profile
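One way to run such a validation without enforcing anything (assuming the cluster's apiserver supports Pod Security Admission) is a server-side dry run of the PSA namespace label, which reports which existing pods would violate the profile:

```
# Dry-run: report pods in <namespace> that would violate "restricted"
kubectl label --dry-run=server --overwrite ns <namespace> \
    pod-security.kubernetes.io/enforce=restricted
```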

Remove seccomp.security.alpha.kubernetes.io/* annotations from PSPs

Current state

Currently (with PSPs), incoming Pods are mutated by the apiserver, which adds securityContext.seccompProfile.type: RuntimeDefault to all containers of a pod that have not explicitly set a seccomp profile (controlled by the PSP annotation seccomp.security.alpha.kubernetes.io/defaultProfileName).
This means that effectively all our containers have securityContext.seccompProfile.type set to either RuntimeDefault or whatever they chose (check with P60658).

The PSP does also validate the securityContext.seccompProfile.type field of all containers against a list of allowed profiles (controlled by the PSP annotation seccomp.security.alpha.kubernetes.io/allowedProfileNames). If the profile selected by a container is not in the list, the Pod is rejected.
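For illustration, the two annotations interact like this (a generic sketch with example values, not our actual configuration; the legacy annotation value runtime/default corresponds to seccompProfile.type: RuntimeDefault):

```yaml
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: seccomp-example   # hypothetical
  annotations:
    # Mutation: applied to containers that don't set a profile themselves
    seccomp.security.alpha.kubernetes.io/defaultProfileName: runtime/default
    # Validation: pods selecting a profile outside this list are rejected
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: runtime/default
```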

Post PSP state

PSA/PSS can't do any mutation. Because of that, clients need to provide an already valid PodSpec to the apiserver for validation to pass.

There is a SeccompDefault v1.23 alpha feature | Kubernetes that enforces the runtime/default seccomp profile for all containers at the container-runtime level. This means containers always run with the default seccomp profile, but that is not visible in the API - making containers without an explicit securityContext.seccompProfile.type fail PSS (restricted) validation. The baseline PSS profile does allow securityContext.seccompProfile.type to be unset.

Path forward

During the migration to PSS we need to allow all seccomp profiles, because even if mutation (defaultProfileName) is disabled, the PSP will still validate the seccomp.security.alpha.kubernetes.io/pod annotation (but not securityContext.seccompProfile).

Remove apparmor.security.beta.kubernetes.io/* annotations from PSPs

Current state

This is more or less the same as with seccomp above. The difference is that there is no field in the PodSpec for the apparmor profile. Instead, the apparmor profile is added as a set of annotations (one per container) to the Pod's annotations (container.apparmor.security.beta.kubernetes.io/<container_name>: runtime/default).
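As a generic sketch (container name and image are made up), the per-container annotation looks like this:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: apparmor-example   # hypothetical
  annotations:
    # one annotation per container, keyed by container name
    container.apparmor.security.beta.kubernetes.io/app: runtime/default
spec:
  containers:
  - name: app
    image: example/image:latest
```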

The container runtime enforces the apparmor profile runtime/default if none is chosen explicitly (like the SeccompDefault feature, but out of the box); check with P60658.

Post PSP state

The same restrictions apply in terms of mutation, but PSS(baseline) and PSS(restricted) do allow the apparmor profile to be unset (falling back to the container runtime's default), so we pass validation for both PSS profiles without changes.

Path forward

Unfortunately PSPs don't support "undefined/nil" as an allowed apparmor profile name. So we would either have to add the default profile to all containers (which is rather complicated to do with our templating, as it is a pod-level annotation per container), or we also disable PSP validation of apparmor profiles during the migration.

Add basic securityContext to all containers T362978

To pass PSS(restricted) all containers will need to explicitly define (at least):

securityContext:
  allowPrivilegeEscalation: false
  capabilities:
     drop:
     - ALL
  runAsNonRoot: true
  seccompProfile:
    type: RuntimeDefault
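As an illustrative sanity check (not the real admission code), the restricted requirements above can be sketched as a tiny validator; pod-level fallbacks for runAsNonRoot/seccompProfile are ignored for brevity:

```python
# Sketch of the PSS "restricted" checks for a single container's
# securityContext, mirroring the four required fields listed above.
# Illustrative only; the real validation happens in the apiserver's
# Pod Security Admission.

def restricted_violations(sc: dict) -> list[str]:
    """Return PSS(restricted)-style violations for a container securityContext."""
    violations = []
    if sc.get("allowPrivilegeEscalation") is not False:
        violations.append("allowPrivilegeEscalation must be false")
    if "ALL" not in sc.get("capabilities", {}).get("drop", []):
        violations.append('capabilities.drop must include "ALL"')
    if sc.get("runAsNonRoot") is not True:
        violations.append("runAsNonRoot must be true")
    if sc.get("seccompProfile", {}).get("type") not in ("RuntimeDefault", "Localhost"):
        violations.append('seccompProfile.type must be "RuntimeDefault" or "Localhost"')
    return violations

compliant = {
    "allowPrivilegeEscalation": False,
    "capabilities": {"drop": ["ALL"]},
    "runAsNonRoot": True,
    "seccompProfile": {"type": "RuntimeDefault"},
}
print(restricted_violations(compliant))  # []
print(len(restricted_violations({})))    # 4
```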

Todos

  • Add basic securityContext to all containers: T362978

The following steps need to/can be done on a per cluster basis, controlled by the PodSecurityStandards structure in helmfile.d/admin_ng/values/*/values.yaml:

# Configure the default PodSecurityStandard settings, see: T273507
PodSecurityStandard:
  disablePSPMutations: true  # Disable PSP mutation, allow all seccomp profiles 
  enforce: true              # Enforce the PodSecurityStandard profile "restricted"
  disableRestrictedPSP: true # Disable PSP binding for the restricted PSP

wikikube-staging

  • Disable PSP mutation, allow all seccomp profiles (seccomp.security.alpha.kubernetes.io/allowedProfileNames: '*')
  • Enforce restricted PSS on all namespaces that use restricted PSP
  • Disable the restricted PSP for these namespaces

wikikube (and all others)

  • Disable PSP mutation, allow all seccomp profiles (seccomp.security.alpha.kubernetes.io/allowedProfileNames: '*')
  • Enforce restricted PSS on all namespaces that use restricted PSP
  • Disable the restricted PSP for these namespaces
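Under standard Pod Security Admission, "enforce restricted on a namespace" ultimately corresponds to namespace labels like the following (generic upstream example, not the output of our helmfile templating):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: example-namespace   # hypothetical
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
```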

Migrate all the clusters

Remove related code from admin_ng

Might not be feasible to do right away, as wikikube has to keep the MediaWiki PSP around until after the k8s upgrade:

  • Code that adds PSPs
  • Code that adds PSP rolebindings


Event Timeline


I would personally try to spend some time understanding the ValidatingAdmissionPolicy feature before starting the big work of moving all our clusters to OPA Gatekeeper.

I did that. Unfortunately we can't permit something in a ValidatingAdmissionPolicy that is forbidden by PSS (as that check comes first). So "extending" the PSS for the mediawiki namespaces (by allowing ptrace/hostPath again) is unfortunately not an option.

With that I do think we have the following options to proceed:

  1. Exempt mw namespaces (from PSS/PSA), e.g. running without checks
  2. Allow privileged in mw namespaces
  3. Use some 3rd party controller like opa-gatekeeper for the mw namespaces
  4. Allow privileged in mw namespaces, but create ValidatingAdmissionPolicies basically re-implementing the restricted profile with an exemption for ptrace/hostPath for geoip

The last option will unfortunately leave a gap, as we can't migrate to ValidatingAdmissionPolicies early (as in "before the k8s upgrade") because the feature is only available in k8s >=1.26 (>=1.28 really, if we want to avoid alpha). It will probably also be a bit of work to get it right, and we might have to update the ValidatingAdmissionPolicies with k8s releases (reflecting changes to the PSS). Overall this looks doable and might be better than relying on some 3rd-party thing that might be phased out once ValidatingAdmissionPolicies become the new standard. I'll try to prove the viability of this idea in a demo setup.

I'm not super convinced of 4... writing ValidatingAdmissionPolicies is quite complex and there are so many corner cases. I tried implementing the first restrictions from the baseline profile (I think we need around 10) and it's already huge:

apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicy
metadata:
  name: "rebuild-baseline"
spec:
  matchConstraints:
    resourceRules:
    - apiGroups:   ["apps"]
      apiVersions: ["v1"]
      operations:  ["CREATE", "UPDATE"]
      resources:   ["deployments", "statefulsets", "daemonsets", "jobs"]
  validations:
  - message: "Containers must drop ALL capabilities and might only add back SYS_PTRACE"
    expression: |
      // Containers must drop `ALL` capabilities,
      (
        object.spec.template.spec.containers +
        (has(object.spec.template.spec.initContainers) ? object.spec.template.spec.initContainers : []) +
        (has(object.spec.template.spec.ephemeralContainers) ? object.spec.template.spec.ephemeralContainers : [])
      ).all(container,
        has(container.securityContext) &&
        has(container.securityContext.capabilities) &&
        has(container.securityContext.capabilities.drop) &&
        size(container.securityContext.capabilities.drop) >= 1 &&
        container.securityContext.capabilities.drop.exists(c, c == 'ALL')
      ) &&
      // and are only permitted to add back the `SYS_PTRACE` capability
      (
        object.spec.template.spec.containers +
        (has(object.spec.template.spec.initContainers) ? object.spec.template.spec.initContainers : []) +
        (has(object.spec.template.spec.ephemeralContainers) ? object.spec.template.spec.ephemeralContainers : [])
      ).all(container,
        !has(container.securityContext) ||
        !has(container.securityContext.capabilities) ||
        !has(container.securityContext.capabilities.add) ||
        container.securityContext.capabilities.add.all(cap, cap == 'SYS_PTRACE')
      )
  - message: "securityContext.runAsNonRoot must be set on Pod or Container level and may not be false"
    expression: |
      // Pod or Containers must set `securityContext.runAsNonRoot`
      (
        (
          has(object.spec.template.spec.securityContext) &&
          has(object.spec.template.spec.securityContext.runAsNonRoot)
        ) ||
        object.spec.template.spec.containers.all(container,
          has(container.securityContext) && has(container.securityContext.runAsNonRoot)) 
        // No need to check initContainer and ephemeralContainer here as container is required
      )
      &&
      // Neither Pod nor Containers should set `securityContext.runAsNonRoot` to false
      (
        (
          // Pod should not set runAsNonRoot to false
          !has(object.spec.template.spec.securityContext) ||
          !has(object.spec.template.spec.securityContext.runAsNonRoot) ||
          object.spec.template.spec.securityContext.runAsNonRoot != false
        ) &&
        (
          (
            object.spec.template.spec.containers +
            (has(object.spec.template.spec.initContainers) ? object.spec.template.spec.initContainers : []) +
            (has(object.spec.template.spec.ephemeralContainers) ? object.spec.template.spec.ephemeralContainers : [])
          ).all(container,
            !has(container.securityContext) ||
            !has(container.securityContext.runAsNonRoot) ||
            container.securityContext.runAsNonRoot != false
          )
        )
      )

Plus this is still not super restrictive, as it only covers "deployments", "statefulsets", "daemonsets" and "jobs". We'd need a second set with the same rules but matching pods, so that we don't have to cover every possible way a pod could be spawned (cronjobs, operators creating pods directly, probably other ways I did not think of yet). That would have the downside of late errors, e.g. deployments would go through but pods won't be created later on. But otoh that downside also applies to the current PSPs, so maybe it's fine to only implement rules for Pods and go with that.
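For reference, a second policy matching bare Pods would (under the same assumptions) only change the matchConstraints, and the CEL expressions would drop the object.spec.template prefix:

```yaml
  matchConstraints:
    resourceRules:
    - apiGroups:   [""]          # core API group
      apiVersions: ["v1"]
      operations:  ["CREATE", "UPDATE"]
      resources:   ["pods"]
```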

I've summarized my findings at https://wikitech.wikimedia.org/wiki/User:JMeybohm/PSP_Replacement @akosiaris, @elukey: I'd like you to take a look and ask questions if you find the time.

@JMeybohm thanks a lot for the great wikipage, it explains the problem very well. The only thing that worries me is the maintenance of those extra policies, since multiple things can fail (Kyverno can stop/change their support, etc..) and also it would add a big and complicated step when upgrading to future k8s versions. I don't see any other way forward though, so I am +1 with your proposal.

Side note: we could try to avoid the hostPath bundling GeoIP inside mw Docker images, to reduce the scope of the "exceptions" to SYS_PTRACE.

During the SIG meeting we wondered what feedback a deployer would get from PSS vs VAP+CEL. We knew the latter (namely, the Deployment/Pod/etc. resources are allowed to be created, but the corresponding pods would not be created if a policy is breached) but not the former.

I tried a little test on minikube, creating a test namespace and applying to it the PSS restricted profile. I tried then to create a pod object that violates its restriction, and I got this error straight away:

Error from server (Forbidden): error when creating "test_pod.yaml": pods "test-pd" is forbidden: violates PodSecurity "restricted:latest": allowPrivilegeEscalation != false (container "test-container" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "test-container" must set securityContext.capabilities.drop=["ALL"]), restricted volume types (volume "test-volume" uses restricted volume type "hostPath"), runAsNonRoot != true (pod or container "test-container" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "test-container" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")

Posting the yaml as well:

apiVersion: v1
kind: Pod
metadata:
  name: test-pd
spec:
  containers:
  - image: k8s.gcr.io/test-webserver
    name: test-container
    volumeMounts:
    - mountPath: /test-pd
      name: test-volume
  volumes:
  - name: test-volume
    hostPath:
      # directory location on host
      path: /data
      # this field is optional
      type: DirectoryOrCreate

In my case I wanted to trigger the hostPath restriction but others were breached as well.

The test shows that we'd get two different kinds of feedback for deployers:

  • For namespaces using PSS (probably most of the current ones) we'd get an error while deploying, so the pod resources wouldn't be created.
  • For namespaces using VAP (MediaWiki for the moment) we wouldn't get an error while deploying, but we'd not see resources/pods being created (getting feedback from stuff like kubectl get events IIUC).

Not a big deal, but probably worth highlighting at decision time. I still keep my vote for proceeding with Janis' solution of course!

Do the PSS give the same early feedback even with Deployment objects?

Tested this random example:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: deployment-nas
  labels:
    app: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        command: ["sh", "-c"]
        args: ["sleep 10000"]
        securityContext:
          privileged: true
          capabilities:
            add: ["SYS_ADMIN"]
          allowPrivilegeEscalation: true
        volumeMounts:
          - name: dynamic-volume
            mountPropagation: "Bidirectional"
            mountPath: "/dynamic-volume"
      volumes:
        - name: dynamic-volume
          hostPath:
            path: /mnt/dynamic-volume
            type: DirectoryOrCreate

And the result was:

Warning: would violate PodSecurity "restricted:latest": privileged (container "nginx" must not set securityContext.privileged=true), allowPrivilegeEscalation != false (container "nginx" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "nginx" must set securityContext.capabilities.drop=["ALL"]; container "nginx" must not include "SYS_ADMIN" in securityContext.capabilities.add), restricted volume types (volume "dynamic-volume" uses restricted volume type "hostPath"), runAsNonRoot != true (pod or container "nginx" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "nginx" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")

deployment.apps/test-deployment created

So the Deployment resource gets created, with a warning. Then its status is of course not healthy:

NAME              READY   UP-TO-DATE   AVAILABLE   AGE
test-deployment   0/1     0            0           2m33s

And get events shows:

2m59s       Warning   FailedCreate        replicaset/test-deployment-5b8457ff6   Error creating: pods "test-deployment-5b8457ff6-cxss2" is forbidden: violates PodSecurity "restricted:latest": privileged (container "nginx" must not set securityContext.privileged=true), allowPrivilegeEscalation != false (container "nginx" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "nginx" must set securityContext.capabilities.drop=["ALL"]; container "nginx" must not include "SYS_ADMIN" in securityContext.capabilities.add), restricted volume types (volume "dynamic-volume" uses restricted volume type "hostPath"), runAsNonRoot != true (pod or container "nginx" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "nginx" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")

Change #1015354 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/puppet@production] k8s/apiserver: Add option to configure audit logging

https://gerrit.wikimedia.org/r/1015354

Change #1015354 merged by JMeybohm:

[operations/puppet@production] k8s/apiserver: Add option to configure audit logging

https://gerrit.wikimedia.org/r/1015354

Change #1016721 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/puppet@production] k8s/apiserver: Fix parameter syntax for --audit-log-maxsize

https://gerrit.wikimedia.org/r/1016721

Change #1016721 merged by JMeybohm:

[operations/puppet@production] k8s/apiserver: Fix parameter syntax for --audit-log-maxsize

https://gerrit.wikimedia.org/r/1016721

Change #1016753 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/puppet@production] k8s: Enable audit logging in staging-eqiad

https://gerrit.wikimedia.org/r/1016753

Change #1016753 merged by JMeybohm:

[operations/puppet@production] k8s: Enable audit logging in staging-eqiad

https://gerrit.wikimedia.org/r/1016753

Change #1018950 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] admin_ng: Refactor fetching pspClusterRole for namespaces

https://gerrit.wikimedia.org/r/1018950

Change #1018951 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] admin_ng: Stop adding kubernetes.io/metadata.name namespace label

https://gerrit.wikimedia.org/r/1018951

Change #1018952 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] admin_ng: Enable restriced PSS profile in audit mode in staging

https://gerrit.wikimedia.org/r/1018952

Change #1018950 merged by jenkins-bot:

[operations/deployment-charts@master] admin_ng: Refactor fetching pspClusterRole for namespaces

https://gerrit.wikimedia.org/r/1018950

Change #1018951 merged by jenkins-bot:

[operations/deployment-charts@master] admin_ng: Stop adding kubernetes.io/metadata.name namespace label

https://gerrit.wikimedia.org/r/1018951

Change #1018952 merged by jenkins-bot:

[operations/deployment-charts@master] admin_ng: Enable restriced PSS profile in audit mode in staging

https://gerrit.wikimedia.org/r/1018952

Change #1019282 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/puppet@production] kubernetes::master: Add support for configuring feature gates

https://gerrit.wikimedia.org/r/1019282

I've added a more comprehensive list of @elukey's test at https://wikitech.wikimedia.org/wiki/User:JMeybohm/PSP_Replacement#Violation_error_handling
Bottom line is: with PSPs and VAPs we only get events; with PSS and Kyverno we get additional user warnings (or even full rejections in the case of Kyverno).

JMeybohm updated the task description. (Show Details)

Change #1020186 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/puppet@production] k8s: Enable audit logging for all clusters

https://gerrit.wikimedia.org/r/1020186

Change #1020187 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] admin_ng: Enable restriced PSS profile in audit mode

https://gerrit.wikimedia.org/r/1020187

Change #1020186 merged by JMeybohm:

[operations/puppet@production] k8s: Enable audit logging for all clusters

https://gerrit.wikimedia.org/r/1020186

Change #1020187 merged by jenkins-bot:

[operations/deployment-charts@master] admin_ng: Enable restriced PSS profile in audit mode

https://gerrit.wikimedia.org/r/1020187

JMeybohm updated the task description. (Show Details)

Change #1020313 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] helmfile_psp: Remove seccomp/apparmor mutations from PSP

https://gerrit.wikimedia.org/r/1020313

Change #1019282 abandoned by JMeybohm:

[operations/puppet@production] kubernetes::node: Add support for the SeccompDefault feature gate

Reason:

this is not required, we need to patch containers anyways

https://gerrit.wikimedia.org/r/1019282

JMeybohm updated the task description. (Show Details)

Adding as info since it was requested in T362408#9712356

Just started up a simple k0s cluster with 1 controller and 1 worker

root@worker-0:~# k0s ctr version
Client:
  Version:  1.7.13
  Revision: 7c3aca7a610df76212171d200ca3811ff6096eb8
  Go version: go1.21.6

Server:
  Version:  1.7.13
  Revision: 7c3aca7a610df76212171d200ca3811ff6096eb8
  UUID: 7886cf88-de05-478c-ab46-e1024f33fa7d

Note that the hosts are Debian bookworm, but containerd isn't the distro one; it's the one brought in by k0s, version 1.7.13. I am gonna retry the experiment with 1.6.20 too, but this was the easiest and fastest way I had at my disposal to get a quick confirmation that indeed there's a default apparmor profile applied.

and

root@worker-0:~# apparmor_status 
apparmor module is loaded.
11 profiles are loaded.
11 profiles are in enforce mode.
   /usr/bin/man
   /usr/lib/NetworkManager/nm-dhcp-client.action
   /usr/lib/NetworkManager/nm-dhcp-helper
   /usr/lib/connman/scripts/dhclient-script
   /{,usr/}sbin/dhclient
   cri-containerd.apparmor.d
   lsb_release
   man_filter
   man_groff
   nvidia_modprobe
   nvidia_modprobe//kmod
0 profiles are in complain mode.
0 profiles are in kill mode.
0 profiles are in unconfined mode.
6 processes have profiles defined.
6 processes are in enforce mode.
   /bin/hcloud-cloud-controller-manager (10295) cri-containerd.apparmor.d
   /proxy-agent (10458) cri-containerd.apparmor.d
   /coredns (10697) cri-containerd.apparmor.d
   /csi-node-driver-registrar (10737) cri-containerd.apparmor.d
   /livenessprobe (10844) cri-containerd.apparmor.d
   /bin/node_exporter (12862) cri-containerd.apparmor.d
0 processes are in complain mode.
0 processes are unconfined but have a profile defined.
0 processes are in mixed mode.
0 processes are in kill mode.

I've done zero apparmor-related configuration. Didn't apply any annotations or anything related.

So we got a default apparmor profile, named cri-containerd.apparmor.d. The profile isn't present in the filesystem, but as @JMeybohm pointed out it's hardcoded in https://github.com/containerd/containerd/blob/main/contrib/apparmor/template.go. First introduced 7 years ago in https://github.com/containerd/containerd/commit/2b46989dbeb037b296f70ac25faff39995164814

Adding for bookworm:

root@containerd:~# ctr version
Client:
  Version:  1.6.20~ds1
  Revision: 1.6.20~ds1-1+b1
  Go version: go1.19.8

Server:
  Version:  1.6.20~ds1
  Revision: 1.6.20~ds1-1+b1
  UUID: 86629b04-d48d-4f60-ad13-81c39831d3ff

ctr needs to be told to use apparmor, otherwise it doesn't

root@containerd:~# ctr run --apparmor-default-profile cri-containerd.apparmor.d docker.io/library/nginx:latest nginx1

and

root@containerd:~# apparmor_status 
apparmor module is loaded.
12 profiles are loaded.
12 profiles are in enforce mode.
   /usr/bin/man
   /usr/lib/NetworkManager/nm-dhcp-client.action
   /usr/lib/NetworkManager/nm-dhcp-helper
   /usr/lib/connman/scripts/dhclient-script
   /{,usr/}sbin/dhclient
   cri-containerd.apparmor.d
   lsb_release
   man_filter
   man_groff
   nerdctl-default
   nvidia_modprobe
   nvidia_modprobe//kmod
0 profiles are in complain mode.
0 profiles are in kill mode.
0 profiles are in unconfined mode.
3 processes have profiles defined.
3 processes are in enforce mode.
   /usr/sbin/nginx (2718) cri-containerd.apparmor.d
   /usr/sbin/nginx (2753) cri-containerd.apparmor.d
   /usr/sbin/nginx (2754) cri-containerd.apparmor.d
0 processes are in complain mode.
0 processes are unconfined but have a profile defined.
0 processes are in mixed mode.
0 processes are in kill mode.

Note that I also used nerdctl, and it ships its own apparmor profile. This might turn out to be confusing when debugging, as it's a difference from docker, where client and daemon have a tighter relationship and we don't as easily end up with differing apparmor profiles depending on the choice of client.

I've also just run kubelet 1.23 in standalone mode talking to containerd, and indeed processes in containers run with the cri-containerd.apparmor.d apparmor profile.

Change #1020313 merged by jenkins-bot:

[operations/deployment-charts@master] admin_ng: Add toggles for PSP to PSS migration

https://gerrit.wikimedia.org/r/1020313

Change #1047918 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] admin_ng: Enforce PSS restricted on namespaces that where PSP restriced

https://gerrit.wikimedia.org/r/1047918

Change #1047918 merged by jenkins-bot:

[operations/deployment-charts@master] admin_ng: Enforce PSS restricted on namespaces that where PSP restriced

https://gerrit.wikimedia.org/r/1047918

Change #1047934 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] admin_ng: Re-enable restricted PSP for staging

https://gerrit.wikimedia.org/r/1047934

Change #1047934 merged by jenkins-bot:

[operations/deployment-charts@master] admin_ng: Re-enable restricted PSP for staging

https://gerrit.wikimedia.org/r/1047934

Change #1048453 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] admin_ng: Bind to privileged PSP if restricted PSP is disabled

https://gerrit.wikimedia.org/r/1048453

Change #1048453 merged by jenkins-bot:

[operations/deployment-charts@master] admin_ng: Bind to privileged PSP if restricted PSP is disabled

https://gerrit.wikimedia.org/r/1048453

Change #1049112 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] admin_ng: Bind to privileged PSP if restricted PSP is disabled

https://gerrit.wikimedia.org/r/1049112

Change #1049112 merged by jenkins-bot:

[operations/deployment-charts@master] admin_ng: Bind to privileged PSP if restricted PSP is disabled

https://gerrit.wikimedia.org/r/1049112

Change #1049123 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] admin_ng: disableRestrictedPSP in staging-eqiad

https://gerrit.wikimedia.org/r/1049123

Change #1049123 merged by jenkins-bot:

[operations/deployment-charts@master] admin_ng: disableRestrictedPSP in staging-eqiad

https://gerrit.wikimedia.org/r/1049123

Change #1051133 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] admin_ng: Switch eqiad und codfw wikikube clusters to PSS

https://gerrit.wikimedia.org/r/1051133

Change #1051133 merged by jenkins-bot:

[operations/deployment-charts@master] admin_ng: Switch eqiad und codfw wikikube clusters to PSS

https://gerrit.wikimedia.org/r/1051133

Mentioned in SAL (#wikimedia-operations) [2024-07-02T11:24:42Z] <jayme> switched wikikube production clusters from PSP to PSS for restricted namespaces - T273507

JMeybohm changed the task status from Open to Stalled.Jul 8 2024, 10:19 AM
JMeybohm updated the task description. (Show Details)

Change #1124415 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] validating-admission-policies: Be more explicit in tests

https://gerrit.wikimedia.org/r/1124415

Change #1124416 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] Add pod-security.wmg.org labels to mediawiki namespaces

https://gerrit.wikimedia.org/r/1124416

Change #1124415 merged by jenkins-bot:

[operations/deployment-charts@master] validating-admission-policies: Be more explicit in tests

https://gerrit.wikimedia.org/r/1124415

Change #1124830 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] admin_ng: Disable hostPath and capabilities baseline rules for mediawiki

https://gerrit.wikimedia.org/r/1124830

Change #1124830 merged by jenkins-bot:

[operations/deployment-charts@master] admin_ng: Disable hostPath and capabilities baseline rules for mediawiki

https://gerrit.wikimedia.org/r/1124830

Change #1124416 merged by jenkins-bot:

[operations/deployment-charts@master] Add pod-security.wmf.org labels to wikikube mediawiki namespaces

https://gerrit.wikimedia.org/r/1124416

Nice! Now this is stalled on the k8s upgrade of the wikikube production clusters (T341984: Update Kubernetes clusters to 1.31), after which we can use VAPs for MediaWiki and remove PSP-related stuff from the puppet and deployment-charts repos.

JMeybohm changed the task status from Stalled to Open.Nov 14 2025, 10:49 AM

Now that the wikikube clusters have been migrated, we can start working on removing the PSP-related code.