Toolforge: redeploy kyverno after the outage
Closed, Resolved, Public

Description

We have learned a few lessons about kyverno and how to set it all up, in particular:

We are merging https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/337 to redeploy it.

In case of rollback:

Given we cannot easily 'undeploy' via the workflows in toolforge-deploy, a quick rollback of this would be:

  • manually delete the api-server webhooks:

kubectl delete validatingwebhookconfiguration kyverno-resource-validating-webhook-cfg

kubectl delete mutatingwebhookconfiguration kyverno-resource-mutating-webhook-cfg

  • manually scale down the kyverno replicas:

kubectl scale deploy kyverno-admission-controller -n kyverno --replicas 0
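The rollback steps above could be wrapped in a small shell function, so they run in the listed order and stop at the first failure. This is only a sketch: the function name `rollback_kyverno` and the `KUBECTL` override variable are hypothetical and not part of toolforge-deploy. The `--ignore-not-found` flag is added here on the assumption that re-running the rollback should not fail if a webhook is already gone.

```shell
#!/bin/sh
# Hypothetical helper, not part of toolforge-deploy.
# KUBECTL can be overridden, e.g. KUBECTL="echo kubectl" to preview the commands.
KUBECTL="${KUBECTL:-kubectl}"

rollback_kyverno() {
    # 1. Delete the api-server webhooks (same resource names as in the steps above).
    #    --ignore-not-found is an addition so the step can be re-run safely.
    $KUBECTL delete validatingwebhookconfiguration \
        kyverno-resource-validating-webhook-cfg --ignore-not-found || return 1
    $KUBECTL delete mutatingwebhookconfiguration \
        kyverno-resource-mutating-webhook-cfg --ignore-not-found || return 1

    # 2. Scale the kyverno admission controller down to zero replicas.
    $KUBECTL scale deploy kyverno-admission-controller -n kyverno --replicas 0
}
```

After sourcing the snippet, calling `rollback_kyverno` runs the three commands. Setting `KUBECTL="echo kubectl"` first prints them instead of running them, which may be useful for a dry check before touching the cluster.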

Event Timeline

The cluster seems happy, including kyverno, which is already auditing resources per the policies.

Example:

aborrero@tools-k8s-control-8:~$ sudo -i kubectl describe policy -n tool-arturo-kyverno-test-tool toolforge-kyverno-pod-policy
[...]
  Conditions:
    Last Transition Time:  2024-06-20T13:05:54Z
    Message:               Ready
    Reason:                Succeeded
    Status:                True
    Type:                  Ready
  Ready:                   true
  Rulecount:
    Generate:      0
    Mutate:        1
    Validate:      1
    Verifyimages:  0
Events:
  Type    Reason         Age                 From               Message
  ----    ------         ----                ----               -------
  Normal  PolicyApplied  108s                kyverno-admission  Job tool-arturo-kyverno-test-tool/once-with-retry: pass
  Normal  PolicyApplied  108s                kyverno-admission  Pod tool-arturo-kyverno-test-tool/once-with-retry-rzfn5: pass
  Normal  PolicyApplied  108s                kyverno-admission  Job tool-arturo-kyverno-test-tool/once-with-retry: pass
  Normal  PolicyApplied  107s                kyverno-admission  Pod tool-arturo-kyverno-test-tool/myjob-77cdf666dc-v8jx8: pass
  Normal  PolicyApplied  107s                kyverno-admission  Job tool-arturo-kyverno-test-tool/once-without-retry: pass
  Normal  PolicyApplied  107s                kyverno-admission  Deployment tool-arturo-kyverno-test-tool/myjob: pass
  Normal  PolicyApplied  107s                kyverno-admission  Deployment tool-arturo-kyverno-test-tool/myjob: pass
  Normal  PolicyApplied  106s                kyverno-admission  Deployment tool-arturo-kyverno-test-tool/myjob3: pass
  Normal  PolicyApplied  106s                kyverno-admission  Pod tool-arturo-kyverno-test-tool/myjob3-7c6f88985-xhzd7: pass
  Normal  PolicyApplied  106s                kyverno-admission  Deployment tool-arturo-kyverno-test-tool/myjob3: pass
  Normal  PolicyApplied  68s (x2 over 68s)   kyverno-admission  Pod tool-arturo-kyverno-test-tool/once-without-retry-hbgw4: pass
  Normal  PolicyApplied  68s (x2 over 68s)   kyverno-admission  Pod tool-arturo-kyverno-test-tool/once-with-retry-rzfn5: pass
  Normal  PolicyApplied  68s (x2 over 68s)   kyverno-admission  Pod tool-arturo-kyverno-test-tool/once-without-retry-hbgw4: pass
  Normal  PolicyApplied  68s                 kyverno-admission  Pod tool-arturo-kyverno-test-tool/once-without-retry-hbgw4: pass
  Normal  PolicyApplied  68s (x2 over 68s)   kyverno-admission  Pod tool-arturo-kyverno-test-tool/once-with-retry-rzfn5: pass
  Normal  PolicyApplied  67s (x2 over 107s)  kyverno-admission  Job tool-arturo-kyverno-test-tool/once-without-retry: pass
  Normal  PolicyApplied  67s (x2 over 68s)   kyverno-admission  Pod tool-arturo-kyverno-test-tool/once-with-retry-rzfn5: pass
  Normal  PolicyApplied  67s (x2 over 107s)  kyverno-admission  Pod tool-arturo-kyverno-test-tool/once-without-retry-hbgw4: pass
  Normal  PolicyApplied  67s                 kyverno-admission  Job tool-arturo-kyverno-test-tool/once-with-retry: pass
  Normal  PolicyApplied  67s                 kyverno-admission  Job tool-arturo-kyverno-test-tool/once-without-retry: pass
  Normal  PolicyApplied  67s                 kyverno-admission  Job tool-arturo-kyverno-test-tool/once-with-retry: pass
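
The `Ready` condition shown in the output above could also be read directly, without scanning the full `describe` output. A minimal sketch, assuming the policy status exposes `conditions` entries with `type` and `status` fields as the output suggests; the function name `policy_ready` and the `KUBECTL` override variable are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints the status of the Ready condition
# (expected to be "True" for a ready policy) of one namespaced kyverno Policy.
KUBECTL="${KUBECTL:-kubectl}"

policy_ready() {
    # $1 = namespace, $2 = policy name
    $KUBECTL get policy -n "$1" "$2" \
        -o 'jsonpath={.status.conditions[?(@.type=="Ready")].status}'
}
```

For the example above this would be called as `policy_ready tool-arturo-kyverno-test-tool toolforge-kyverno-pod-policy`.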

Checked a few things; both kyverno and the cluster seem happy.