Revisit Toolforge automated package updates and version pinnings
Open, Low, Public

Description

We have two Toolforge-specific differences to how everything else in Cloud VPS does package updates:

  • No unattended updates for distro-wikimedia or distro-tools repos
  • Pinned versions for Kernel and sssd (profile::toolforge::apt_pinning)

Both of them are causing our systems to run on outdated software versions.

The lack of unattended updates for our own repos alone has left a large number of outdated packages (the list below shows outdated versions present on at least one VM). T181647 says the clinic duty person should be manually updating things, but I don't think anyone is.

taavi@tools-clushmaster-02:~ $ clush -w @all -N 'sudo apt-upgrade -un report' | sort -k3 | uniq
buster-wikimedia/main: prometheus-rsyslog-exporter 0.0.0+git20201008-1 --> 0.0.0+git20201008-3
stretch-wikimedia/main: prometheus-rsyslog-exporter 0.0.0+git20201008-1 --> 0.0.0+git20201008-3
stretch-wikimedia/main: python3-prometheus-client 0.0.18-1 --> 0.6.0-1~wmf9u1
stretch-wikimedia/main: python-prometheus-client 0.0.18-1 --> 0.6.0-1~wmf9u1
buster-wikimedia/main: python3-wmflib 0.0.6-1+deb10u1 --> 0.0.9-1+deb10u1
buster-wikimedia/main: python3-wmflib 0.0.7-1+deb10u1 --> 0.0.9-1+deb10u1
buster-wikimedia/main: python3-wmflib 0.0.8-1+deb10u1 --> 0.0.9-1+deb10u1
stretch-wikimedia/main: cloud-init 0.7.9-2+deb9u1 --> 20.2-2~bpo10+1
buster-wikimedia/thirdparty/kubeadm-k8s-1-19: kubectl 1.17.13-00 --> 1.19.13-00
buster-wikimedia/thirdparty/kubeadm-k8s-1-19: kubectl 1.17.17-00 --> 1.19.13-00
buster-wikimedia/thirdparty/kubeadm-k8s-1-19: containerd.io 1.2.13-2 --> 1.4.8-1
buster-tools/main: jobutils 1.41 --> 1.42
buster-tools/main: misctools 1.41 --> 1.42
stretch-tools/main: jobutils 1.41 --> 1.42
buster-wikimedia/thirdparty/kubeadm-k8s-1-19: docker-ce-cli 5:18.09.9~3-0~debian-stretch --> 5:20.10.7~3-0~debian-buster
buster-wikimedia/thirdparty/kubeadm-k8s-1-19: docker-ce-cli 5:19.03.14~3-0~debian-stretch --> 5:20.10.7~3-0~debian-buster
buster-wikimedia/thirdparty/kubeadm-k8s-1-19: docker-ce 5:19.03.5~3-0~debian-stretch --> 5:20.10.7~3-0~debian-buster
buster-wikimedia/thirdparty/kubeadm-k8s-1-19: docker-ce-cli 5:19.03.5~3-0~debian-stretch --> 5:20.10.7~3-0~debian-buster
buster-wikimedia/thirdparty/elastic74: elasticsearch-curator 5.2.0-1 --> 5.8.1
oldstable/main: python-elasticsearch-curator 5.2.0-1 --> [remove]
stretch-wikimedia/main: puppet 5.5.10-2~deb9u2 --> 5.5.22-1+deb9u1
buster-wikimedia/main: puppet 5.5.10-4 --> 5.5.22-1
buster-wikimedia/component/jdk8: openjdk-8-jdk 8u242-b08-1~deb10u1 --> 8u302-b08-1~deb10u1
buster-wikimedia/component/jdk8: openjdk-8-jdk-headless 8u242-b08-1~deb10u1 --> 8u302-b08-1~deb10u1
buster-wikimedia/component/jdk8: openjdk-8-jre 8u242-b08-1~deb10u1 --> 8u302-b08-1~deb10u1
buster-wikimedia/component/jdk8: openjdk-8-jre-headless 8u242-b08-1~deb10u1 --> 8u302-b08-1~deb10u1
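
The report above repeats a package once per installed version, so it can be easier to read when grouped by target version. A minimal sketch of such a grouping, assuming every report line has the shape `repo: package old --> new` as in the output above. The names `LINE_RE` and `summarize` are hypothetical and not part of any existing tool:

```python
import re
from collections import defaultdict

# Matches report lines such as:
#   buster-tools/main: jobutils 1.41 --> 1.42
# The line shape is assumed from the output above.
LINE_RE = re.compile(
    r"^(?P<repo>\S+): (?P<package>\S+) (?P<old>\S+) --> (?P<new>\S+)$"
)


def summarize(lines):
    """Group report lines by (repo, package, target version).

    Returns a dict mapping each key to the sorted list of installed
    versions that would be replaced. Sorting is plain string sorting,
    not Debian version ordering.
    """
    pending = defaultdict(set)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip shell prompts, blank lines, anything unexpected
        key = (match["repo"], match["package"], match["new"])
        pending[key].add(match["old"])
    return {key: sorted(olds) for key, olds in pending.items()}


# A few lines copied from the report above.
sample = [
    "buster-wikimedia/main: python3-wmflib 0.0.6-1+deb10u1 --> 0.0.9-1+deb10u1",
    "buster-wikimedia/main: python3-wmflib 0.0.7-1+deb10u1 --> 0.0.9-1+deb10u1",
    "buster-tools/main: jobutils 1.41 --> 1.42",
    "oldstable/main: python-elasticsearch-curator 5.2.0-1 --> [remove]",
]

for (repo, package, new), olds in summarize(sample).items():
    print(f"{repo}: {package} {', '.join(olds)} --> {new}")
```

Removals (`[remove]`) are kept as an ordinary target so they stay visible in the summary.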

I'm not sure about the origins of the kernel and sssd pinning, but I can't think of any reason why we would want them. I'd much rather take the security and bugfix updates that are released to (old)stable than the additional stability that comes from not updating.

Event Timeline

In our team meeting:

  • we agreed that we aren't seeing much value today in excluding the wmf repo, so we can un-exclude it!
  • we are probably no longer interested in package pinning in general. We will keep the pinning for the kernel and sssd just in case, but all other pinning can be dropped unless there is a good reason against it

Mentioned in SAL (#wikimedia-cloud) [2021-09-09T16:50:05Z] <majavah> enable unattended updates on toolsbeta T290494

Mentioned in SAL (#wikimedia-cloud) [2022-05-10T13:54:36Z] <taavi> enable distro-wikimedia unattended upgrades T290494

Just enabled unattended-upgrades. That still leaves the apt pinnings. Also note that the kernel pinning does not work for new hosts: those (all bullseye hosts and some buster ones) use the cloud kernel variants, which our pinnings don't seem to apply to.

Oh, the kernel pinnings don't work at all. This is an older host that does not use the cloud image:

taavi@tools-k8s-worker-42:~ $ uname -a
Linux tools-k8s-worker-42 4.19.0-14-amd64 #1 SMP Debian 4.19.171-2 (2021-01-30) x86_64 GNU/Linux
taavi@tools-k8s-control-2:~ $ kubectl sudo get node tools-k8s-worker-42 -o wide
NAME                  STATUS                     ROLES    AGE     VERSION    INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                       KERNEL-VERSION    CONTAINER-RUNTIME
tools-k8s-worker-42   Ready,SchedulingDisabled   <none>   2y72d   v1.20.11   172.16.1.74   <none>        Debian GNU/Linux 10 (buster)   4.19.0-14-amd64   docker://20.10.8

taavi@tools-k8s-worker-42:~ $ uname -a
Linux tools-k8s-worker-42 4.19.0-20-amd64 #1 SMP Debian 4.19.235-1 (2022-03-17) x86_64 GNU/Linux
taavi@tools-k8s-control-2:~ $ kubectl sudo get node tools-k8s-worker-42 -o wide
NAME                  STATUS                     ROLES    AGE     VERSION    INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                       KERNEL-VERSION    CONTAINER-RUNTIME
tools-k8s-worker-42   Ready,SchedulingDisabled   <none>   2y72d   v1.20.11   172.16.1.74   <none>        Debian GNU/Linux 10 (buster)   4.19.0-20-amd64   docker://20.10.8
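
The two `uname -a` outputs above report different kernel releases for the same host. A minimal sketch of how such a change could be checked programmatically, assuming the release field always looks like `4.19.0-14-amd64`. The names `RELEASE_RE` and `kernel_abi` are hypothetical helpers for illustration:

```python
import re

# Matches the kernel release field of `uname -a`, e.g. "4.19.0-14-amd64".
# The trailing "-<flavour>" part is required so that the package version
# later in the line (e.g. "4.19.171-2") is not picked up instead.
RELEASE_RE = re.compile(r"\b(\d+)\.(\d+)\.(\d+)-(\d+)-\S+")


def kernel_abi(uname_line):
    """Return the kernel release as a tuple of ints, e.g. (4, 19, 0, 14)."""
    match = RELEASE_RE.search(uname_line)
    if match is None:
        raise ValueError(f"no kernel release found in: {uname_line!r}")
    return tuple(int(part) for part in match.groups())


# The two lines are copied from the transcripts above.
first = kernel_abi(
    "Linux tools-k8s-worker-42 4.19.0-14-amd64 #1 SMP Debian 4.19.171-2 "
    "(2021-01-30) x86_64 GNU/Linux"
)
second = kernel_abi(
    "Linux tools-k8s-worker-42 4.19.0-20-amd64 #1 SMP Debian 4.19.235-1 "
    "(2022-03-17) x86_64 GNU/Linux"
)

if second > first:
    print(f"kernel changed: {first} -> {second}")
```

Tuples of integers compare element by element, so `-20` correctly sorts after `-14`, which a plain string comparison would not guarantee in general.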

Since no one ever noticed that, I think I'm just going to send a patch to remove the kernel pinnings.

Change 790710 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] P:toolforge: remove linux kernel pinnings

https://gerrit.wikimedia.org/r/790710

In T290494#7917947, @Majavah wrote:

Oh, the kernel pinnings don't work at all. This is an older host that does not use the cloud image

No, I was just looking at a host that does not use profile::toolforge::apt_pinning. So pretty much only the stretch grid has those pinnings applied, and we can remove them once the stretch grid is gone.

Change 790710 merged by David Caro:

[operations/puppet@production] P:toolforge: remove linux kernel pinnings

https://gerrit.wikimedia.org/r/790710

Change 929016 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] P:toolforge::apt_pinning: remove unused pinnings

https://gerrit.wikimedia.org/r/929016

Change 929159 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] P:toolforge::apt_pinning: remove sssd pinning

https://gerrit.wikimedia.org/r/929159

Change 929016 merged by Arturo Borrero Gonzalez:

[operations/puppet@production] P:toolforge::apt_pinning: remove unused pinnings

https://gerrit.wikimedia.org/r/929016

Change 929159 merged by Arturo Borrero Gonzalez:

[operations/puppet@production] P:toolforge::apt_pinning: remove sssd pinning

https://gerrit.wikimedia.org/r/929159