
upgrade to ingress-nginx 1.0
Closed, Resolved (Public)

Description

ingress-nginx 1.0 is out: https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v1.0.0

We should upgrade toolforge and paws (both deploy it via helm) to it.

Event Timeline

Warning: ingress-nginx 1.0 will refuse to work with extensions/v1beta1 ingresses regardless of cluster version. @mdipietro and I figured this out while experimenting with T291589: Upgrade paws jupyterhub. That should not be an issue for most tools, as long as the controller still reads existing ingress objects (worth checking whether any extensions/v1beta1 ones still exist; they probably do), but jupyterhub 0.9.0 still uses that ingress version.

Toolsbeta should be fine, tools *might* be ok and PAWS just needs jupyterhub upgraded first.

I think networking.k8s.io/v1beta1 might be ok...just definitely not extensions/v1beta1.

We deploy ingress-nginx via Helm. As of right now it's the only component deployed with Helm, and it has not been upgraded since we moved it to Helm, making this a fully new procedure. Without having looked very far into this, I think we have a few major issues with our Helm tooling, namely no way of diffing the changes and not much control over which version we are upgrading to.

The current documentation just advises running a blind upgrade command. In theory that's fine, but we probably want to modify it to upgrade to a specific version (and add instructions on how to look up the latest version). Using helmfile (which is already packaged on apt.wm.o) is probably overkill right now, but worth considering if we deploy more components via Helm. Diffs are an easier problem: there's a plugin (helm-diff) that wiki production uses too, so it should just be a matter of installing the relevant package.
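For illustration, a pinned upgrade with a diff step could look roughly like the sketch below. The release name, namespace, repo alias and values file are assumptions, not taken from our actual setup; the chart version is passed in as an argument so that nothing is upgraded "blindly".

```shell
#!/bin/sh
# Hypothetical names: adjust to the real release, namespace, repo alias and values file.
RELEASE="ingress-nginx"
NAMESPACE="ingress-nginx"
CHART="ingress-nginx/ingress-nginx"
VALUES="values.yaml"

# Refresh the repo index and list every published chart version.
list_chart_versions() {
    helm repo update
    helm search repo "$CHART" --versions
}

# Show what upgrading to chart version $1 would change (requires the helm-diff plugin).
diff_pinned_upgrade() {
    helm diff upgrade "$RELEASE" "$CHART" --namespace "$NAMESPACE" --version "$1" --values "$VALUES"
}

# Upgrade the release to exactly chart version $1.
apply_pinned_upgrade() {
    helm upgrade "$RELEASE" "$CHART" --namespace "$NAMESPACE" --version "$1" --values "$VALUES"
}
```

Typical use would be `list_chart_versions`, then `diff_pinned_upgrade <version>`, and only after reviewing the diff `apply_pinned_upgrade <version>` with the same version string.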

Change 729577 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] kubeadm: add helm-diff

https://gerrit.wikimedia.org/r/729577

Change 729577 merged by Arturo Borrero Gonzalez:

[operations/puppet@production] kubeadm: add helm-diff

https://gerrit.wikimedia.org/r/729577

Mentioned in SAL (#wikimedia-cloud) [2021-10-25T14:33:16Z] <majavah> copy nginx-ingress controller v1.0.4 to internal registry T292771

Mentioned in SAL (#wikimedia-cloud) [2021-10-25T14:41:08Z] <majavah> deploy ingress-nginx v1.0.4 to toolsbeta via helm, diff only changes the image T292771

Full chart diff is on P17593 ingress-nginx update

Startup logs show this:

I1025 14:50:07.894107       8 store.go:367] "Ignoring ingress because of error while validating ingress class" ingress="tool-fourohfour/fourohfour" error="ingress does not contain a valid IngressClass"
I1025 14:50:07.894844       8 store.go:367] "Ignoring ingress because of error while validating ingress class" ingress="tool-fourohfour/fourohfour-subdomain" error="ingress does not contain a valid IngressClass"
I1025 14:50:07.895218       8 store.go:367] "Ignoring ingress because of error while validating ingress class" ingress="tool-test/test-subdomain" error="ingress does not contain a valid IngressClass"

Ingress classes are a new resource in K8s 1.19+ and in ingress-nginx v1.0+.
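To get a list of affected objects without reading controller logs, something like the following sketch might help. It only looks at spec.ingressClassName and does not examine the legacy kubernetes.io/ingress.class annotation, so treat the output as a starting point rather than a definitive list.

```shell
#!/bin/sh
# Print "namespace/name" for every ingress that has no spec.ingressClassName.
# kubectl prints "<none>" in a custom column when the field is missing.
list_classless_ingresses() {
    kubectl get ingress --all-namespaces --no-headers \
        -o custom-columns='NS:.metadata.namespace,NAME:.metadata.name,CLASS:.spec.ingressClassName' |
        awk '$3 == "<none>" { print $1 "/" $2 }'
}
```

Calling `list_classless_ingresses` should print one namespace/name pair per line, in the same form the controller uses in its log messages.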

There are two ways to fix this:

  • watchIngressWithoutClass: true on the ingress controller
  • set the ingress class to be default

The second option is preferable in my opinion, but it doesn't seem to be backwards compatible. So I'll likely do both at first, then get all existing ingresses updated, and then set watchIngressWithoutClass: false. Note that apparently you can set ingressClassName on an ingress even if the class does not exist, but the current ingress controller will not like it.
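A rough sketch of that two-phase plan, expressed as Helm value overrides plus a patch for existing objects. The chart values controller.watchIngressWithoutClass and controller.ingressClassResource.default are from the upstream ingress-nginx chart; the release name, namespace, repo alias, values file and the class name passed to the patch are assumptions. As far as I understand, a default class is only applied to newly created ingresses, which is why existing ones need the explicit patch.

```shell
#!/bin/sh
# Hypothetical names: adjust to the real release, namespace, repo alias and values file.
RELEASE="ingress-nginx"
NAMESPACE="ingress-nginx"
CHART="ingress-nginx/ingress-nginx"
VALUES="values.yaml"
CHART_VERSION="4.0.6"

# Mark the class as default and set watchIngressWithoutClass to $1.
# Phase 1: call with "true"; phase 2 (after all ingresses have a class): call with "false".
set_watch_without_class() {
    helm upgrade "$RELEASE" "$CHART" --namespace "$NAMESPACE" \
        --version "$CHART_VERSION" --values "$VALUES" \
        --set controller.ingressClassResource.default=true \
        --set "controller.watchIngressWithoutClass=$1"
}

# Give an existing ingress an explicit class. Args: namespace, name, class.
set_ingress_class() {
    kubectl patch ingress "$2" --namespace "$1" --type merge \
        --patch "{\"spec\":{\"ingressClassName\":\"$3\"}}"
}
```

The intended order would be `set_watch_without_class true`, then `set_ingress_class <namespace> <name> <class>` for each existing ingress, and finally `set_watch_without_class false`.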

Change 734294 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] toolforge: Update to ingress-nginx v1.0

https://gerrit.wikimedia.org/r/734294

Change 734294 merged by Arturo Borrero Gonzalez:

[operations/puppet@production] toolforge: Update to ingress-nginx v1.0

https://gerrit.wikimedia.org/r/734294

Mentioned in SAL (#wikimedia-cloud) [2021-10-26T12:11:36Z] <majavah> deploy ingress-nginx v1.0.4 / chart v4.0.6 on toolforge T292771

Done on all three clusters.

Majavah added a parent task: Restricted Task. (Nov 1 2021, 1:26 PM)