Migrate to helm v3
Closed, ResolvedPublic

Description

Meta-task tracking things needed for helm 3 migration

Pre-Migration tasks

  • import and build helm3
  • ensure compatibility of helm plugins
  • add helm3 stuff to puppet
  • figure out a way to gradually migrate helmfile.d/services (this should only be done after k8s has been upgraded to >= 1.16 in all clusters)
  • alternatively find a way to migrate all services at once (on a depooled cluster), verify success and do so on the second one
  • We need an alternative to recreate pods
  • We might need additional annotations to helm tests (see comments)
  • Find replacement and new workflow for missing Tiller serviceaccount (new RBAC rules and modifications in helmfiles)
  • Create and verify plan for migration of codfw and eqiad (see T251305#7492328)

Migration tasks:

  • verify migration path on staging-codfw
  • re-deploy and verify staging
  • re-deploy and verify codfw
  • re-deploy and verify eqiad

Post migration tasks:

  • remove tiller and tiller service accounts (742989)
  • uninstall helm2 from hosts (deploy, releases, contint), make helm3 default (753026)
  • Clean up boilerplate code in helmfiles (737034)
  • remove environment state values helmBinary (751067)
  • bump charts to v2 api version T295750
  • remove helm2 from CI and rake files (746864, 747147, 747487, 747814, 748701)
  • remove deprecated "helm.sh/hook": test-success annotation (757877, also related to T276949)

Details

Repo | Branch | Lines +/-
operations/docker-images/production-images | master | +0 -205
operations/deployment-charts | master | +0 -38
operations/deployment-charts | master | +81 -57
operations/puppet | production | +36 -37
operations/deployment-charts | master | +10 -10
operations/deployment-charts | master | +144 -177
operations/puppet | production | +0 -7
operations/deployment-charts | master | +549 -590
operations/deployment-charts | master | +1 -1
operations/deployment-charts | master | +2 -2
operations/deployment-charts | master | +2 -19
operations/deployment-charts | master | +17 -13
operations/deployment-charts | master | +4 -5
integration/config | master | +1 -1
integration/config | master | +10 -3
operations/deployment-charts | master | +1 -180
operations/puppet | production | +2 -0
operations/puppet | production | +2 -0
operations/puppet | production | +2 -0
operations/deployment-charts | master | +141 -95
operations/deployment-charts | master | +755 -37
operations/deployment-charts | master | +6 -4
operations/deployment-charts | master | +21 -1
operations/puppet | production | +1 -0
operations/puppet | production | +106 -35
operations/puppet | production | +105 -35
labs/private | master | +74 -0
operations/puppet | production | +1 -0
operations/deployment-charts | master | +2 -2
operations/puppet | production | +3 -1
operations/deployment-charts | master | +48 -6
operations/deployment-charts | master | +12 -2
operations/deployment-charts | master | +5 -5
integration/config | master | +1 -1
integration/config | master | +19 -3
operations/debs/helm | master | +10 -1
operations/puppet | production | +5 -5
operations/puppet | production | +46 -2
operations/debs/helm-diff | master | +11 -2
operations/debs/helmfile | master | +168 -7
operations/debs/helm3 | master | +19 -0
integration/config | master | +5 -1
operations/debs/helm3 | master | +239 -0

Event Timeline

Mentioned in SAL (#wikimedia-operations) [2021-11-24T09:14:43Z] <jelto@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on mobileapps.svc.codfw.wmnet with reason: helm3 de-deploy T251305

Mentioned in SAL (#wikimedia-operations) [2021-11-24T09:14:45Z] <jelto@cumin1001> START - Cookbook sre.hosts.downtime for 3:00:00 on proton.svc.codfw.wmnet with reason: helm3 de-deploy T251305

Mentioned in SAL (#wikimedia-operations) [2021-11-24T09:14:48Z] <jelto@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on proton.svc.codfw.wmnet with reason: helm3 de-deploy T251305

Mentioned in SAL (#wikimedia-operations) [2021-11-24T09:14:51Z] <jelto@cumin1001> START - Cookbook sre.hosts.downtime for 3:00:00 on push-notifications.svc.codfw.wmnet with reason: helm3 de-deploy T251305

Mentioned in SAL (#wikimedia-operations) [2021-11-24T09:14:54Z] <jelto@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on push-notifications.svc.codfw.wmnet with reason: helm3 de-deploy T251305

Mentioned in SAL (#wikimedia-operations) [2021-11-24T09:14:57Z] <jelto@cumin1001> START - Cookbook sre.hosts.downtime for 3:00:00 on recommendation-api.svc.codfw.wmnet with reason: helm3 de-deploy T251305

Mentioned in SAL (#wikimedia-operations) [2021-11-24T09:15:00Z] <jelto@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on recommendation-api.svc.codfw.wmnet with reason: helm3 de-deploy T251305

Mentioned in SAL (#wikimedia-operations) [2021-11-24T09:15:03Z] <jelto@cumin1001> START - Cookbook sre.hosts.downtime for 3:00:00 on sessionstore.svc.codfw.wmnet with reason: helm3 de-deploy T251305

Mentioned in SAL (#wikimedia-operations) [2021-11-24T09:15:06Z] <jelto@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on sessionstore.svc.codfw.wmnet with reason: helm3 de-deploy T251305

Mentioned in SAL (#wikimedia-operations) [2021-11-24T09:15:09Z] <jelto@cumin1001> START - Cookbook sre.hosts.downtime for 3:00:00 on shellbox.svc.codfw.wmnet with reason: helm3 de-deploy T251305

Mentioned in SAL (#wikimedia-operations) [2021-11-24T09:15:12Z] <jelto@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on shellbox.svc.codfw.wmnet with reason: helm3 de-deploy T251305

Mentioned in SAL (#wikimedia-operations) [2021-11-24T09:15:15Z] <jelto@cumin1001> START - Cookbook sre.hosts.downtime for 3:00:00 on shellbox-constraints.svc.codfw.wmnet with reason: helm3 de-deploy T251305

Mentioned in SAL (#wikimedia-operations) [2021-11-24T09:15:18Z] <jelto@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on shellbox-constraints.svc.codfw.wmnet with reason: helm3 de-deploy T251305

Mentioned in SAL (#wikimedia-operations) [2021-11-24T09:15:21Z] <jelto@cumin1001> START - Cookbook sre.hosts.downtime for 3:00:00 on shellbox-media.svc.codfw.wmnet with reason: helm3 de-deploy T251305

Mentioned in SAL (#wikimedia-operations) [2021-11-24T09:15:25Z] <jelto@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on shellbox-media.svc.codfw.wmnet with reason: helm3 de-deploy T251305

Mentioned in SAL (#wikimedia-operations) [2021-11-24T09:15:28Z] <jelto@cumin1001> START - Cookbook sre.hosts.downtime for 3:00:00 on shellbox-syntaxhighlight.svc.codfw.wmnet with reason: helm3 de-deploy T251305

Mentioned in SAL (#wikimedia-operations) [2021-11-24T09:15:31Z] <jelto@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on shellbox-syntaxhighlight.svc.codfw.wmnet with reason: helm3 de-deploy T251305

Mentioned in SAL (#wikimedia-operations) [2021-11-24T09:15:34Z] <jelto@cumin1001> START - Cookbook sre.hosts.downtime for 3:00:00 on shellbox-timeline.svc.codfw.wmnet with reason: helm3 de-deploy T251305

Mentioned in SAL (#wikimedia-operations) [2021-11-24T09:15:36Z] <jelto@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on shellbox-timeline.svc.codfw.wmnet with reason: helm3 de-deploy T251305

Mentioned in SAL (#wikimedia-operations) [2021-11-24T09:15:40Z] <jelto@cumin1001> START - Cookbook sre.hosts.downtime for 3:00:00 on similar-users.svc.codfw.wmnet with reason: helm3 de-deploy T251305

Mentioned in SAL (#wikimedia-operations) [2021-11-24T09:15:43Z] <jelto@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on similar-users.svc.codfw.wmnet with reason: helm3 de-deploy T251305

Mentioned in SAL (#wikimedia-operations) [2021-11-24T09:15:46Z] <jelto@cumin1001> START - Cookbook sre.hosts.downtime for 3:00:00 on tegola-vector-tiles.svc.codfw.wmnet with reason: helm3 de-deploy T251305

Mentioned in SAL (#wikimedia-operations) [2021-11-24T09:15:49Z] <jelto@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on tegola-vector-tiles.svc.codfw.wmnet with reason: helm3 de-deploy T251305

Mentioned in SAL (#wikimedia-operations) [2021-11-24T09:15:52Z] <jelto@cumin1001> START - Cookbook sre.hosts.downtime for 3:00:00 on termbox.svc.codfw.wmnet with reason: helm3 de-deploy T251305

Mentioned in SAL (#wikimedia-operations) [2021-11-24T09:15:55Z] <jelto@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on termbox.svc.codfw.wmnet with reason: helm3 de-deploy T251305

Mentioned in SAL (#wikimedia-operations) [2021-11-24T09:15:58Z] <jelto@cumin1001> START - Cookbook sre.hosts.downtime for 3:00:00 on wikifeeds.svc.codfw.wmnet with reason: helm3 de-deploy T251305

Mentioned in SAL (#wikimedia-operations) [2021-11-24T09:16:01Z] <jelto@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on wikifeeds.svc.codfw.wmnet with reason: helm3 de-deploy T251305

Mentioned in SAL (#wikimedia-operations) [2021-11-24T09:16:03Z] <jelto@cumin1001> START - Cookbook sre.hosts.downtime for 3:00:00 on zotero.svc.codfw.wmnet with reason: helm3 de-deploy T251305

Mentioned in SAL (#wikimedia-operations) [2021-11-24T09:16:07Z] <jelto@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on zotero.svc.codfw.wmnet with reason: helm3 de-deploy T251305
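
The START / END (PASS) pairs above can be checked mechanically. The following is a minimal sketch, not part of the actual tooling: the regular expressions are modeled only on the SAL lines quoted in this task, the function name unfinished_downtimes is hypothetical, and the third sample line is deliberately left without an END entry for illustration.

```python
import re

# Patterns modeled on the SAL entries quoted in this task (an assumption;
# other cookbooks or cookbook versions may log differently).
START_RE = re.compile(
    r"START - Cookbook sre\.hosts\.downtime for \S+ on (\S+) with reason"
)
END_RE = re.compile(
    r"END \(PASS\) - Cookbook sre\.hosts\.downtime \(exit_code=0\) "
    r"for \S+ on (\S+) with reason"
)


def unfinished_downtimes(lines):
    """Return hosts that have a START entry but no END (PASS) entry."""
    started, finished = set(), set()
    for line in lines:
        match = START_RE.search(line)
        if match:
            started.add(match.group(1))
            continue
        match = END_RE.search(line)
        if match:
            finished.add(match.group(1))
    return sorted(started - finished)


sample = [
    "[2021-11-24T09:16:03Z] <jelto@cumin1001> START - Cookbook sre.hosts.downtime "
    "for 3:00:00 on zotero.svc.codfw.wmnet with reason: helm3 de-deploy T251305",
    "[2021-11-24T09:16:07Z] <jelto@cumin1001> END (PASS) - Cookbook sre.hosts.downtime "
    "(exit_code=0) for 3:00:00 on zotero.svc.codfw.wmnet with reason: helm3 de-deploy T251305",
    # Illustrative only: a START whose END line has been left out of the sample.
    "[2021-11-24T09:15:58Z] <jelto@cumin1001> START - Cookbook sre.hosts.downtime "
    "for 3:00:00 on wikifeeds.svc.codfw.wmnet with reason: helm3 de-deploy T251305",
]

print(unfinished_downtimes(sample))
```

Summary entries such as "on 32 hosts" are ignored by these patterns, since they do not name a single host.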

Mentioned in SAL (#wikimedia-operations) [2021-11-24T10:06:55Z] <jelto> downtime PyBal backends health check for helm3 de-deploy T251305. I'm keeping an eye on icing and remove downtime as soon as I'm finished

Change 736822 merged by Jelto:

[operations/puppet@production] hiera::role::common::deployment_server update helmBinary codfw

https://gerrit.wikimedia.org/r/736822

The re-deploy of codfw was successful. Based on the take-aways from the codfw migration, this is the plan to migrate eqiad Kubernetes to helm3:

  • Announce maintenance some days ahead on ops list
  • Downtime PyBal backends health check (and keep one eye on icinga, because this downtime is quite generic)
  • Downtime Kubernetes services in eqiad (according to T277740).
cookbook sre.hosts.downtime -r "helm3 de-deploy T251305" -H 3 --force 'apertium.svc.eqiad.wmnet,api-gateway.svc.eqiad.wmnet,apple-search.svc.eqiad.wmnet,blubberoid.svc.eqiad.wmnet,citoid.svc.eqiad.wmnet,cxserver.svc.eqiad.wmnet,echostore.svc.eqiad.wmnet,eventgate-analytics.svc.eqiad.wmnet,eventgate-analytics-external.svc.eqiad.wmnet,eventgate-logging-external.svc.eqiad.wmnet,eventgate-main.svc.eqiad.wmnet,eventstreams.svc.eqiad.wmnet,eventstreams-internal.svc.eqiad.wmnet,linkrecommendation.svc.eqiad.wmnet,mathoid.svc.eqiad.wmnet,mobileapps.svc.eqiad.wmnet,mwdebug.svc.eqiad.wmnet,proton.svc.eqiad.wmnet,push-notifications.svc.eqiad.wmnet,recommendation-api.svc.eqiad.wmnet,sessionstore.svc.eqiad.wmnet,shellbox.svc.eqiad.wmnet,shellbox-constraints.svc.eqiad.wmnet,shellbox-media.svc.eqiad.wmnet,shellbox-syntaxhighlight.svc.eqiad.wmnet,shellbox-timeline.svc.eqiad.wmnet,similar-users.svc.eqiad.wmnet,tegola-vector-tiles.svc.eqiad.wmnet,termbox.svc.eqiad.wmnet,toolhub.svc.eqiad.wmnet,wikifeeds.svc.eqiad.wmnet,zotero.svc.eqiad.wmnet'
  • depool all Kubernetes services in eqiad (except mwdebug and toolhub):
confctl --object-type discovery select 'name=eqiad,dnsdisc=(apertium|api-gateway|apple-search|blubberoid|citoid|cxserver|echostore|eventgate-analytics|eventgate-analytics-external|eventgate-logging-external|eventstreams|eventstreams-internal|linkrecommendation|mathoid|mobileapps|proton|push-notifications|recommendation-api|sessionstore|shellbox|shellbox-constraints|shellbox-media|shellbox-syntaxhighlight|shellbox-timeline|similar-users|tegola-vector-tiles|termbox|wikifeeds|zotero)' set/pooled=false
  • validate services are fine and served by codfw only
  • dump all manifests from flink to save configmaps (not needed)
  • create lock for mwdebug auto deploy service: flock /var/lib/deploy-mwdebug/flock sleep infinity
  • run redeploy: P17693
  • validate service redeploy went fine in eqiad
  • Switch services to both datacenters again:
confctl --object-type discovery select 'name=eqiad,dnsdisc=(apertium|api-gateway|apple-search|blubberoid|citoid|cxserver|echostore|eventgate-analytics|eventgate-analytics-external|eventgate-logging-external|eventstreams|eventstreams-internal|linkrecommendation|mathoid|mobileapps|proton|push-notifications|recommendation-api|sessionstore|shellbox|shellbox-constraints|shellbox-media|shellbox-syntaxhighlight|shellbox-timeline|similar-users|tegola-vector-tiles|termbox|wikifeeds|zotero)' set/pooled=true
  • bump environment state value helmBinary to helm3 (see 741681)
  • remove lock /var/lib/deploy-mwdebug/flock

Feel free to add any thoughts or additional steps in case I missed something.
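
In the plan above, the downtime host list and the depool selector are maintained by hand as two separate strings. For instance, eventgate-main appears in the downtime list but not in the dnsdisc selector. As a hedged sketch (the helper name build_discovery_selector is hypothetical, and it assumes every downtimed service has a discovery object with the same name, which is not confirmed here), both could be derived from a single list:

```python
def build_discovery_selector(downtimed_hosts, datacenter, keep_pooled):
    """Build a confctl discovery selector string from the downtime host list.

    downtimed_hosts: names like "zotero.svc.eqiad.wmnet"
    keep_pooled: services that must not be depooled (e.g. single-DC services)
    """
    suffix = f".svc.{datacenter}.wmnet"
    services = []
    for host in downtimed_hosts:
        if not host.endswith(suffix):
            raise ValueError(f"unexpected host for {datacenter}: {host}")
        name = host[: -len(suffix)]
        if name not in keep_pooled:
            services.append(name)
    return f"name={datacenter},dnsdisc=({'|'.join(services)})"


# Shortened host list for illustration; the real list is in the plan above.
hosts = [
    "eventgate-analytics.svc.eqiad.wmnet",
    "eventgate-main.svc.eqiad.wmnet",
    "mwdebug.svc.eqiad.wmnet",
    "toolhub.svc.eqiad.wmnet",
    "zotero.svc.eqiad.wmnet",
]
selector = build_discovery_selector(hosts, "eqiad", {"mwdebug", "toolhub"})
print(selector)
```

The resulting string would be passed to confctl --object-type discovery select '...' set/pooled=false (and later set/pooled=true), so that the depool and repool steps use the same service set.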

Change 741681 had a related patch set uploaded (by Jelto; author: Jelto):

[operations/puppet@production] hiera::role::common::deployment_server update helmBinary eqiad

https://gerrit.wikimedia.org/r/741681

Mentioned in SAL (#wikimedia-operations) [2021-11-25T07:09:49Z] <jelto> start re-deploy procedure in eqiad Kubernetes T251305

Mentioned in SAL (#wikimedia-operations) [2021-11-25T07:10:00Z] <jelto> downtime PyBal backends health check on lvs1015 and lvs1016 for helm3 de-deploy T251305. I'm keeping an eye on icing and remove downtime as soon as I'm finished

Mentioned in SAL (#wikimedia-operations) [2021-11-25T07:17:22Z] <jelto@cumin1001> START - Cookbook sre.hosts.downtime for 3:00:00 on 32 hosts with reason: helm3 de-deploy T251305

Mentioned in SAL (#wikimedia-operations) [2021-11-25T07:17:46Z] <jelto@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on 32 hosts with reason: helm3 de-deploy T251305

Change 741681 merged by Jelto:

[operations/puppet@production] hiera::role::common::deployment_server update helmBinary eqiad

https://gerrit.wikimedia.org/r/741681

cc from ops list:

The re-deploy for all services in the eqiad Kubernetes cluster was successful. However, this time there was an impact on service availability. Planned reduced availability affected mwdebug and toolhub, which are only available in eqiad; these services were unavailable for around 3 minutes. Unplanned reduced availability affected the eventgate-main service due to a pooling mistake on my side: eventgate-main was not available between 7:32 and 7:35 UTC and generated around 25k exceptions (see Grafana).

All Kubernetes environments are running with helm3 now. For day-to-day deployments nothing should change for you. For low-level troubleshooting, please remember to use the helm3 client instead of helm for the time being (until the cleanup is done).

I'll proceed with cleanup steps (see task description) next week. You may also see some related follow-up tasks.

Change 742989 had a related patch set uploaded (by Jelto; author: Jelto):

[operations/deployment-charts@master] admin_ng: remove tiller

https://gerrit.wikimedia.org/r/742989

Change 742989 merged by jenkins-bot:

[operations/deployment-charts@master] admin_ng: remove tiller

https://gerrit.wikimedia.org/r/742989

First cleanup task is finished:

  • remove tiller and tiller service accounts (742989)

Tiller deployments and RBAC resources are removed from all Kubernetes environments (staging, codfw and eqiad).

cd /srv/deployment-charts/helmfile.d/admin_ng
kube_env admin staging-codfw
helmfile -e staging-codfw diff
helmfile -e staging-codfw -l name=namespaces apply # only namespaces because jayme is working on cert-manager 
helmfile -e staging-codfw -l name=rbac-rules apply # only rbac-rules because jayme is working on cert-manager

kube_env admin staging
helmfile -e staging-eqiad diff
helmfile -e staging-eqiad apply

kube_env admin codfw
helmfile -e codfw diff
helmfile -e codfw apply

kube_env admin eqiad
helmfile -e eqiad diff
helmfile -e eqiad apply
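
The sequence above repeats the same three steps (select the cluster, diff, apply) per environment, with staging-codfw restricted to two label selectors. A small sketch of generating that command list, assuming nothing beyond the commands shown above (the function name rollout_commands and the tuple layout are hypothetical), could look like this:

```python
def rollout_commands(environments):
    """Generate the per-environment command sequence.

    environments: list of (kube_env_name, helmfile_env, selectors) tuples;
    an empty selectors list means a full apply.
    """
    commands = ["cd /srv/deployment-charts/helmfile.d/admin_ng"]
    for kube_env, helmfile_env, selectors in environments:
        commands.append(f"kube_env admin {kube_env}")
        # Always review the diff before applying anything.
        commands.append(f"helmfile -e {helmfile_env} diff")
        if selectors:
            for selector in selectors:
                commands.append(f"helmfile -e {helmfile_env} -l {selector} apply")
        else:
            commands.append(f"helmfile -e {helmfile_env} apply")
    return commands


plan = rollout_commands([
    ("staging-codfw", "staging-codfw", ["name=namespaces", "name=rbac-rules"]),
    ("staging", "staging-eqiad", []),
    ("codfw", "codfw", []),
    ("eqiad", "eqiad", []),
])
print("\n".join(plan))
```

The sketch only prints the commands for review; it does not execute them.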

Change 746864 had a related patch set uploaded (by Jelto; author: Jelto):

[operations/deployment-charts@master] Rakefile: remove helm2 from Rakefile

https://gerrit.wikimedia.org/r/746864

Change 747147 had a related patch set uploaded (by Jelto; author: Jelto):

[integration/config@master] helm-linter: remove helm2 from Docker image

https://gerrit.wikimedia.org/r/747147

Change 747148 had a related patch set uploaded (by Jelto; author: Jelto):

[integration/config@master] jjb: update helm-linter job to releng/helm-linter:0.3.0

https://gerrit.wikimedia.org/r/747148

Change 747147 merged by jenkins-bot:

[integration/config@master] helm-linter: remove helm2 from Docker image

https://gerrit.wikimedia.org/r/747147

Change 747148 merged by jenkins-bot:

[integration/config@master] jjb: update helm-linter job to releng/helm-linter:0.3.0

https://gerrit.wikimedia.org/r/747148

Change 747460 had a related patch set uploaded (by Elukey; author: Elukey):

[operations/deployment-charts@master] Update utils.rb's helm_version function

https://gerrit.wikimedia.org/r/747460

Change 747460 abandoned by Elukey:

[operations/deployment-charts@master] Update utils.rb's helm_version function

Reason:

https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/746864

https://gerrit.wikimedia.org/r/747460

Change 746864 merged by jenkins-bot:

[operations/deployment-charts@master] Rakefile: remove helm2 from Rakefile, bump scaffold to v2 api

https://gerrit.wikimedia.org/r/746864

Change 747487 had a related patch set uploaded (by Jelto; author: Jelto):

[operations/deployment-charts@master] Rakefile/rake_modules: remove unused function helm_version() and cleanup

https://gerrit.wikimedia.org/r/747487

The removal of tiller has broken PipelineLib's deploy functionality. For example, https://integration.wikimedia.org/ci/job/blubber-pipeline-rehearse/84/console

We'll need to refactor PipelineLib to use helm3 ASAP. Filing a task.

Change 747819 had a related patch set uploaded (by Jelto; author: Jelto):

[operations/deployment-charts@master] helmfile.d/admin_ng: fix subjects of rolebinding in namespaces

https://gerrit.wikimedia.org/r/747819

Change 747487 merged by jenkins-bot:

[operations/deployment-charts@master] Rakefile/rake_modules: remove unused function helm_version() and cleanup

https://gerrit.wikimedia.org/r/747487

Change 747819 merged by jenkins-bot:

[operations/deployment-charts@master] helmfile.d/admin_ng: fix subjects of rolebinding in namespaces

https://gerrit.wikimedia.org/r/747819

Change 748701 had a related patch set uploaded (by Jelto; author: Jelto):

[operations/deployment-charts@master] Rakefile: check only client helm version

https://gerrit.wikimedia.org/r/748701

Change 748701 merged by jenkins-bot:

[operations/deployment-charts@master] Rakefile: check only client helm version

https://gerrit.wikimedia.org/r/748701

Change 751067 had a related patch set uploaded (by Jelto; author: Jelto):

[operations/puppet@production] deployment_server: remove obsolete value helmBinary

https://gerrit.wikimedia.org/r/751067

Change 751070 had a related patch set uploaded (by Jelto; author: Jelto):

[operations/deployment-charts@master] charts: update charts to api v2

https://gerrit.wikimedia.org/r/751070

Change 751120 had a related patch set uploaded (by Jelto; author: Jelto):

[operations/deployment-charts@master] changeprop/eventgate: bump kafka-dev dependencie to 1.0.0

https://gerrit.wikimedia.org/r/751120

Change 737034 merged by jenkins-bot:

[operations/deployment-charts@master] services: cleanup helmfiles, update SAL logging

https://gerrit.wikimedia.org/r/737034

Change 751067 merged by Jelto:

[operations/puppet@production] deployment_server: remove obsolete value helmBinary

https://gerrit.wikimedia.org/r/751067

Change 751070 merged by jenkins-bot:

[operations/deployment-charts@master] charts: update charts to api v2

https://gerrit.wikimedia.org/r/751070

Change 751120 merged by jenkins-bot:

[operations/deployment-charts@master] changeprop/eventgate: bump kafka-dev dependencie to 0.1.0

https://gerrit.wikimedia.org/r/751120

Change 753026 had a related patch set uploaded (by Jelto; author: Jelto):

[operations/puppet@production] deployment_server,::helm: remove helm2 support

https://gerrit.wikimedia.org/r/753026

Change 753026 merged by Jelto:

[operations/puppet@production] deployment_server,::helm: remove helm2 support

https://gerrit.wikimedia.org/r/753026

I removed helm2 from deploy1001 and deploy2001 by merging https://gerrit.wikimedia.org/r/753026. I tested the removal beforehand on WMCS and on a temporary pontoon setup (see details here).

The removal of the systemd timer helm-repo-update.timer failed on the machines deploy1001 and contint2001 due to a race condition: the puppet run that removed the timer components happened while helm-repo-update.timer was executing. So the timer failed with:

systemctl status helm-repo-update.timer
● helm-repo-update.timer
   Loaded: not-found (Reason: Unit helm-repo-update.timer not found.)
   Active: failed (Result: resources) since Wed 2022-01-12 14:35:50 UTC; 19min ago
  Trigger: n/a

Jan 12 14:35:50 deploy1002 systemd[1]: helm-repo-update.timer: Failed to queue unit startup job: Unit helm-repo-update.service not found

I executed sudo systemctl reset-failed helm-repo-update.timer manually on deploy1001 and contint2001, because additional puppet runs could not clean up the obsolete/unmanaged timer entry.
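
To spot leftover units in this state on other hosts, the systemctl status output can be checked for the combination "Loaded: not-found" plus "Active: failed". This is a minimal sketch that only parses text in the format quoted above; the function name is hypothetical, the "healthy" sample is illustrative, and the sketch does not call systemctl itself.

```python
def is_orphaned_failed_unit(status_output):
    """True if the unit file is gone (not-found) but the unit is still failed."""
    loaded = ""
    active = ""
    for line in status_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("Loaded:"):
            loaded = stripped
        elif stripped.startswith("Active:"):
            active = stripped
    return loaded.startswith("Loaded: not-found") and active.startswith(
        "Active: failed"
    )


# Status text as quoted in this task.
orphaned = """\
● helm-repo-update.timer
   Loaded: not-found (Reason: Unit helm-repo-update.timer not found.)
   Active: failed (Result: resources) since Wed 2022-01-12 14:35:50 UTC; 19min ago
  Trigger: n/a
"""

# Illustrative example of a unit that is still installed and waiting.
healthy = """\
● example.timer
   Loaded: loaded (/lib/systemd/system/example.timer; enabled)
   Active: active (waiting)
"""

print(is_orphaned_failed_unit(orphaned), is_orphaned_failed_unit(healthy))
```

A unit flagged this way is a candidate for the manual systemctl reset-failed described above.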

helm2 is now removed from hosts deploy1001, deploy2001, contint1001 and contint2001:

$ helm2 version
-bash: helm2: command not found

helm links to helm3 now:

$ helm version
version.BuildInfo{Version:"v3.6.3", GitCommit:"", GitTreeState:"", GoVersion:"go1.15.9"}
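
The same check can be scripted across hosts. The sketch below extracts the major version from output in the version.BuildInfo format shown above; the function name helm_major_version is hypothetical, and any output that does not match this format yields None.

```python
import re


def helm_major_version(version_output):
    """Return the major version from `helm version` BuildInfo output, or None."""
    match = re.search(r'Version:"v(\d+)\.\d+\.\d+', version_output)
    return int(match.group(1)) if match else None


output = (
    'version.BuildInfo{Version:"v3.6.3", GitCommit:"", '
    'GitTreeState:"", GoVersion:"go1.15.9"}'
)
print(helm_major_version(output))
```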

Change 757877 had a related patch set uploaded (by Jelto; author: Jelto):

[operations/deployment-charts@master] charts: remove depricated helm test annotation, fix hook-delete-policy

https://gerrit.wikimedia.org/r/757877

Change 757877 merged by jenkins-bot:

[operations/deployment-charts@master] charts: remove depricated helm test annotation, fix hook-delete-policy

https://gerrit.wikimedia.org/r/757877

With the removal of deprecated helm2 test annotations in https://gerrit.wikimedia.org/r/757877, all (known) cleanup steps are finished. There is some open work regarding automatic testing of services in T276949. It is not really part of the migration, but it depended on helm3.

So I'm going to close this task. If you find some helm2 debris, feel free to re-open the task and link it here.

Thanks all for the support!

Change 784227 had a related patch set uploaded (by Alexandros Kosiaris; author: Alexandros Kosiaris):

[operations/deployment-charts@master] helmfile.d: Remove all reference to tillerNamespace

https://gerrit.wikimedia.org/r/784227

Change 784227 merged by jenkins-bot:

[operations/deployment-charts@master] helmfile.d: Remove all reference to tillerNamespace

https://gerrit.wikimedia.org/r/784227

Change 784791 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] kubernetes::deployment_server: add new service image-suggestion

https://gerrit.wikimedia.org/r/784791

Change #1019809 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/docker-images/production-images@master] Remove the tiller image

https://gerrit.wikimedia.org/r/1019809

Change #1019809 merged by JMeybohm:

[operations/docker-images/production-images@master] Remove the tiller image

https://gerrit.wikimedia.org/r/1019809