Establish replacement for PodPresets in Toolforge Kubernetes
Closed, Resolved, Public

Description

PodPresets were unceremoniously removed from Kubernetes in version 1.20 (this issue is a fine gateway to the chatter: https://github.com/kubernetes/website/issues/24038). Since we are approaching 1.20 as quickly as is reasonable, newer versions we deploy should not depend on them.

PodPresets are currently enabled via a feature flag in kubeadm. They use a simple label match to apply the $HOME environment variable and to mount NFS in pods. That's really handy, but we can probably accomplish it in other ways as well. A simple replacement has been created by some Red Hatters: https://github.com/redhat-cop/podpreset-webhook. However, since we don't absolutely need a general solution for setting up arbitrary presets, we could also implement a Mutating Admission Webhook, similar to the validating admission webhooks used for ingress and registry controls, that applies the env and volume mounts to pods carrying certain labels in Toolforge.
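For context, on Kubernetes versions before 1.20, PodPresets generally required two API server settings: serving the settings.k8s.io/v1alpha1 API group and enabling the PodPreset admission plugin. A minimal sketch of how that might look in a kubeadm ClusterConfiguration follows. This is an illustration under assumptions, not the actual Toolforge configuration: the kubeadm API version and the presence of NodeRestriction in the plugin list are guesses.

```yaml
# Sketch only: assumed kubeadm config, not copied from the Toolforge puppet repo.
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
apiServer:
  extraArgs:
    # Serve the alpha API group that PodPreset objects live in.
    runtime-config: "settings.k8s.io/v1alpha1=true"
    # Enable the admission plugin that injects presets into matching pods.
    # NodeRestriction is listed only as an example of another plugin.
    enable-admission-plugins: "NodeRestriction,PodPreset"
```

Removing the dependency means dropping both settings: first the admission plugin, then the API group once no PodPreset objects remain.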

Currently, the label used is toolforge: tool, and the preset object looks like:

apiVersion: settings.k8s.io/v1alpha1
kind: PodPreset
metadata:
  name: mount-toolforge-vols
  namespace: tool-<toolname>
spec:
  env:
  - name: HOME
    value: /data/project/<toolname>
  selector:
    matchLabels:
      toolforge: tool
  volumeMounts:
  - mountPath: /public/dumps
    name: dumps
    readOnly: true
  - mountPath: /mnt/nfs/dumps-labstore1007.wikimedia.org
    name: dumpsrc1
    readOnly: true
  - mountPath: /mnt/nfs/dumps-labstore1006.wikimedia.org
    name: dumpsrc2
    readOnly: true
  - mountPath: /data/project
    name: home
  - mountPath: /etc/wmcs-project
    name: wmcs-project
    readOnly: true
  - mountPath: /data/scratch
    name: scratch
  - mountPath: /etc/ldap.conf
    name: etcldap-conf
    readOnly: true
  - mountPath: /etc/ldap.yaml
    name: etcldap-yaml
    readOnly: true
  - mountPath: /etc/novaobserver.yaml
    name: etcnovaobserver-yaml
    readOnly: true
  - mountPath: /var/lib/sss/pipes
    name: sssd-pipes
  volumes:
  - hostPath:
      path: /public/dumps
      type: Directory
    name: dumps
  - hostPath:
      path: /mnt/nfs/dumps-labstore1007.wikimedia.org
      type: Directory
    name: dumpsrc1
  - hostPath:
      path: /mnt/nfs/dumps-labstore1006.wikimedia.org
      type: Directory
    name: dumpsrc2
  - hostPath:
      path: /data/project
      type: Directory
    name: home
  - hostPath:
      path: /etc/wmcs-project
      type: File
    name: wmcs-project
  - hostPath:
      path: /data/scratch
      type: Directory
    name: scratch
  - hostPath:
      path: /etc/ldap.conf
      type: File
    name: etcldap-conf
  - hostPath:
      path: /etc/ldap.yaml
      type: File
    name: etcldap-yaml
  - hostPath:
      path: /etc/novaobserver.yaml
      type: File
    name: etcnovaobserver-yaml
  - hostPath:
      path: /var/lib/sss/pipes
      type: Directory
    name: sssd-pipes

Since users generally cannot create or alter these presets directly, and it is all keyed off a particular label on the pod (toolforge: tool), this translates easily to a mutating webhook. Alternatively, the Red Hat Community of Practice operator would also do the job.
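To make the mutating webhook option more concrete, here is a rough sketch of the registration object such a webhook might use. Every name in it (the configuration name, webhook name, Service, namespace, and path) is hypothetical, and the failure policy is an assumption. The one piece taken from this task is the toolforge: tool label, used here as an objectSelector so that only labelled pods are sent to the webhook.

```yaml
# Sketch only: all names below are hypothetical placeholders.
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: toolforge-volume-mutator
webhooks:
- name: volume-mutator.toolforge.example
  admissionReviewVersions: ["v1"]
  sideEffects: None
  # Assumption: reject pod creation if the webhook is unreachable,
  # rather than admit pods without their mounts.
  failurePolicy: Fail
  clientConfig:
    service:
      name: volume-mutator
      namespace: volume-mutator
      path: /mutate
    caBundle: <base64-encoded CA certificate>
  rules:
  # Only pod creation is intercepted; mounts are fixed once a pod exists.
  - operations: ["CREATE"]
    apiGroups: [""]
    apiVersions: ["v1"]
    resources: ["pods"]
  # Same label the PodPreset selector matches today.
  objectSelector:
    matchLabels:
      toolforge: tool
```

The service behind this registration would answer each AdmissionReview with a JSON patch that adds the volumes and volumeMounts listed in the preset above. Because the mount list would live in the webhook rather than in one object per namespace, changing it would be a single deployment.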

Event Timeline

Bstorm created this task.

I'm slightly in favour of a mutating webhook here, as that would allow us to modify the mounts without having to modify the preset object in each individual namespace.

Mentioned in SAL (#wikimedia-cloud) [2021-09-13T15:44:14Z] <majavah> deploy volume-admission-controller in background; T279106

Mentioned in SAL (#wikimedia-cloud) [2021-09-14T15:45:09Z] <majavah> disable podpreset admission plugin in toolsbeta T279106

Mentioned in SAL (#wikimedia-cloud) [2021-09-20T12:44:38Z] <majavah> deploying volume-admission to tools, should not affect anything yet T279106

Just to clarify for those of us who create their own Kubernetes objects (👼): the toolforge: tool label will still have the same effect (of mounting the project home and everything else)?

Yes, (at least for now) this is just changing what software adds those volume mounts.
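As an illustration for people who write their own Kubernetes objects, a hand-written pod that opts in to the mounts might look like the sketch below. The pod name, container name, image placeholder, and command are all hypothetical; only the namespace pattern, the label, and the /data/project path come from this task.

```yaml
# Sketch only: pod name, container name, image and command are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: example
  namespace: tool-<toolname>
  labels:
    # This label is what triggers injection of the Toolforge volume mounts.
    toolforge: tool
spec:
  containers:
  - name: main
    image: <image>
    # Lists the tool's home directory if the project mount was applied.
    command: ["ls", "/data/project/<toolname>"]
```

No volumes or volumeMounts are declared in the manifest itself; they are expected to be added at admission time.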

Change 722362 had a related patch set uploaded (by Majavah; author: Majavah):

[labs/tools/maintain-kubeusers@master] Do not create PodPresets

https://gerrit.wikimedia.org/r/722362

@Bstorm and others: do you have opinions on the Toolforge rollout? We can either disable the podpreset controller (fast to deploy and roll back, but all at once) or delete the podpreset objects first (slow rollout and slow rollback, but can be done per individual tool).

@majavah does this cause any change for already-deployed pods? The mounts are defined on the pod object when it spins up, so for users this is going to be transparent either way, right?

If it's transparent for users, I'm ok with disabling the controller. My only question is if we have 2000 pod preset objects hanging around in etcd being useless that we cannot delete later. It's not much storage, but it is storage. As long as you haven't seen some kind of strange behavior for users when the preset goes away, my preferred method would be to delete all the podpresets via script (all at once) and then disable the controller.

[...]

It only affects newly created pods, yes. The setting for enabling the built-in controller is separate from being able to access the objects, so it's possible to delete the presets even after disabling the controller.

As long as k8s will let us delete them, and the maintain-kubeusers version that doesn't try to create them is deployed, I say kill the controller :)

Change 722362 merged by jenkins-bot:

[labs/tools/maintain-kubeusers@master] Do not create PodPresets

https://gerrit.wikimedia.org/r/722362

Mentioned in SAL (#wikimedia-cloud) [2021-09-23T17:14:34Z] <majavah> testing new maintain-kubeusers release T279106

Mentioned in SAL (#wikimedia-cloud) [2021-09-23T17:20:57Z] <majavah> deploying new maintain-kubeusers for lack of podpresets T279106

New tools on Toolforge will no longer have pod presets created for them.

Mentioned in SAL (#wikimedia-cloud) [2021-09-27T11:34:48Z] <majavah> disabling pod preset controller T279106

I'll leave things like they are for a day or so and clean up the old objects afterwards.

Change 724101 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] kubeadm: Disable PodPresets

https://gerrit.wikimedia.org/r/724101

Change 724101 merged by Arturo Borrero Gonzalez:

[operations/puppet@production] kubeadm: Disable PodPresets

https://gerrit.wikimedia.org/r/724101

Done. I also added instructions to the repository's README for local development.

I'm confident enough with the new system so I'm going to run this script later today:

#!/bin/bash
# Run this script with your root/cluster admin account as appropriate.
# This will remove all unused PodPreset objects.

set -Eeuo pipefail

declare -a namespaces
readarray -t namespaces < <(kubectl get podpreset -A | grep mount-toolforge-vols | awk '{print $1}')

for ns in "${namespaces[@]}"
do
	echo "Removing pod preset for ${ns}"
	kubectl -n "$ns" delete podpresets mount-toolforge-vols
done

echo "*********************"
echo "Done!"

Mentioned in SAL (#wikimedia-cloud) [2021-10-07T08:00:09Z] <majavah> removing all pod presets T279106

Mentioned in SAL (#wikimedia-cloud) [2021-10-07T09:13:37Z] <majavah> disabling settings api, now that all pod presets are gone T279106

This is complete now, I think.