
toolforge: cleanup kernel packages
Closed, Resolved (Public)

Description

All nodes in the toolforge cluster have plenty of unused kernel packages, which complicates package upgrades and may occupy a considerable amount of space in the filesystem.

Some examples:

aborrero@tools-prometheus-01:~$ dpkg -l | grep linux-image
ii  linux-image-3.16.0-4-amd64      3.16.43-2+deb8u5                 amd64        Linux 3.16 for 64-bit PCs
ii  linux-image-3.16.0-5-amd64      3.16.51-3+deb8u1                 amd64        Linux 3.16 for 64-bit PCs
ii  linux-image-4.4.0-1-amd64       4.4.2-3+wmf3                     amd64        Linux 4.4 for 64-bit PCs
ii  linux-image-4.4.0-2-amd64       4.4.2-3+wmf6                     amd64        Linux 4.4 for 64-bit PCs
ii  linux-image-4.4.0-3-amd64       4.4.2-3+wmf8                     amd64        Linux 4.4 for 64-bit PCs
ii  linux-image-4.9.0-0.bpo.4-amd64 4.9.51-1~bpo8+1                  amd64        Linux 4.9 for 64-bit PCs
ii  linux-image-4.9.0-0.bpo.5-amd64 4.9.65-3+deb9u1~bpo8+2           amd64        Linux 4.9 for 64-bit PCs
ii  linux-image-amd64               3.16+63+deb8u1                   amd64        Linux for 64-bit PCs (meta-package)
aborrero@tools-paws-master-01:~$ dpkg -l | grep linux-image
rc  linux-image-4.9.0-3-amd64       4.9.30-2+deb9u5                amd64        Linux 4.9 for 64-bit PCs
ii  linux-image-4.9.0-5-amd64       4.9.65-3+deb9u2                amd64        Linux 4.9 for 64-bit PCs
ii  linux-image-4.9.0-6-amd64       4.9.82-1+deb9u3                amd64        Linux 4.9 for 64-bit PCs
ii  linux-image-amd64               4.9+80+deb9u4                  amd64        Linux for 64-bit PCs (meta-package)
aborrero@tools-exec-1401:~$ dpkg -l | grep ^ii | grep linux-image
ii  linux-image-3.13.0-137-generic                              3.13.0-137.186                             amd64        Linux kernel image for version 3.13.0 on 64 bit x86 SMP
ii  linux-image-3.13.0-139-generic                              3.13.0-139.188                             amd64        Linux kernel image for version 3.13.0 on 64 bit x86 SMP
ii  linux-image-3.13.0-141-generic                              3.13.0-141.190                             amd64        Linux kernel image for version 3.13.0 on 64 bit x86 SMP
ii  linux-image-3.13.0-142-generic                              3.13.0-142.191                             amd64        Linux kernel image for version 3.13.0 on 64 bit x86 SMP
ii  linux-image-extra-3.13.0-137-generic                        3.13.0-137.186                             amd64        Linux kernel extra modules for version 3.13.0 on 64 bit x86 SMP
ii  linux-image-extra-3.13.0-139-generic                        3.13.0-139.188                             amd64        Linux kernel extra modules for version 3.13.0 on 64 bit x86 SMP
ii  linux-image-extra-3.13.0-141-generic                        3.13.0-141.190                             amd64        Linux kernel extra modules for version 3.13.0 on 64 bit x86 SMP
ii  linux-image-generic                                         3.13.0.141.151                             amd64        Generic Linux kernel image
ii  linux-image-virtual                                         3.13.0.142.152                             amd64        This package will always depend on the latest minimal generic kernel image.

Depending on the OS (trusty, jessie, stretch), different kernel versions are installed (and running), so a common pattern should be found for each OS in order to do the cleanup.
This is related to the apt pinning task: T187193

Event Timeline

aborrero triaged this task as High priority. Mar 5 2018, 1:15 PM
aborrero created this task.
aborrero updated the task description.

On jessie machines:

Installed kernel packages:

aborrero@tools-clushmaster-01:~$ clush -w @all "grep -q jessie /etc/os-release && dpkg -l | grep ^ii | grep linux-image || true" | awk -F' ' '{print $3}' | sort | uniq
linux-image-3.16.0-4-amd64
linux-image-3.16.0-5-amd64
linux-image-3.19.0-1-amd64
linux-image-3.19.0-2-amd64
linux-image-4.2.0-0.bpo.1-amd64
linux-image-4.4.0-1-amd64
linux-image-4.4.0-2-amd64
linux-image-4.4.0-3-amd64
linux-image-4.9.0-0.bpo.3-amd64
linux-image-4.9.0-0.bpo.4-amd64
linux-image-4.9.0-0.bpo.5-amd64
linux-image-4.9.0-0.bpo.6-amd64
linux-image-amd64

Running kernel is the same in all nodes:

aborrero@tools-clushmaster-01:~$ clush -w @all "grep -q jessie /etc/os-release && uname -vr || true" | sort -n  | awk -F' ' '{print $2" "$6}' | uniq
4.9.0-0.bpo.5-amd64 4.9.65-3+deb9u1~bpo8+2

So we can safely delete, at least:

linux-image-3.16.0-4-amd64
linux-image-3.16.0-5-amd64
linux-image-3.19.0-1-amd64
linux-image-3.19.0-2-amd64
linux-image-4.2.0-0.bpo.1-amd64
linux-image-4.4.0-1-amd64
linux-image-4.4.0-2-amd64
linux-image-4.4.0-3-amd64
linux-image-4.9.0-0.bpo.3-amd64
linux-image-4.9.0-0.bpo.4-amd64
linux-image-4.9.0-0.bpo.6-amd64
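
The reasoning above (every installed versioned kernel image other than the running one is a removal candidate) can be sketched as a small filter. This is only an illustration: `removable_kernels` is a hypothetical helper name, not something that exists on the hosts, and it assumes the package for the running kernel is named `linux-image-` followed by the `uname -r` string, as in the listings in this task.

```shell
#!/bin/sh
# Hypothetical helper (not part of any package): list removal candidates.
#   $1    kernel release string, as printed by `uname -r`
#   stdin `dpkg -l` output
# Prints installed ("ii") versioned linux-image packages other than the
# one matching the given release. Meta-packages such as linux-image-amd64
# and linux-image-extra-* packages do not match the pattern and are skipped.
removable_kernels() {
    awk -v keep="linux-image-$1" '
        $1 == "ii" && $2 ~ /^linux-image-[0-9]/ && $2 != keep { print $2 }
    '
}

# Usage on a live host (read-only, nothing is removed):
#   dpkg -l | removable_kernels "$(uname -r)"
```

The filter only prints names, so the output can be reviewed before anything is passed to apt.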

hooray for consistency :) +1 to cleanup for old vulnerable kernels, let's try to keep at least N-1 around as we move forward "in case". Thanks man.
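
One possible reading of "keep at least N-1 around" is: keep the running kernel plus the installed version just before it. A rough sketch of that selection follows. `keep_running_and_previous` is a made-up name, and the ordering relies on GNU `sort -V`, which is assumed to be good enough for these package names but is not a full dpkg version comparison.

```shell
#!/bin/sh
# Hypothetical helper: choose which kernel packages to keep.
#   $1    package name of the running kernel
#   stdin installed kernel package names, one per line
# Prints the package that sorts immediately before the running one
# (if there is one), followed by the running one.
keep_running_and_previous() {
    sort -V | awk -v run="$1" '
        found     { next }                      # ignore anything newer
        $0 == run { if (prev != "") print prev  # the N-1 package
                    print; found = 1; next }
                  { prev = $0 }
    '
}

# Usage:
#   dpkg -l | awk '$1 == "ii" && $2 ~ /^linux-image-[0-9]/ { print $2 }' \
#     | keep_running_and_previous "linux-image-$(uname -r)"
```

With the jessie package list from this task and bpo.5 as the running kernel, this selects bpo.4 and bpo.5.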

Mentioned in SAL (#wikimedia-cloud) [2018-03-05T14:01:26Z] <arturo> deleting old kernel packages in jessie instances for T188911

I just ran this command, which deletes all kernels except linux-image-4.9.0-0.bpo.5-amd64 and linux-image-4.9.0-0.bpo.4-amd64:

clush -w @all 'grep -q jessie /etc/os-release && for i in $(dpkg -l | grep ^ii | grep linux-image-[0-9] | cut -d" " -f3) ; do grep -q bpo.[4-5] <<< "$i" || sudo apt-get remove $i -y ; done || true'
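
For future runs, a more defensive variant of the removal step might look like the sketch below. It is not the command that was run for this task. `remove_pkgs` and the `DRY_RUN` variable are names invented here, and the function prints what it would do unless `DRY_RUN=0` is set explicitly. It also avoids the unquoted `linux-image-[0-9]` and `bpo.[4-5]` patterns, which the remote shell could expand as globs if a matching file happened to exist in the working directory.

```shell
#!/bin/sh
# Hypothetical helper: remove the packages named on stdin, one per line.
# DRY_RUN defaults to 1, in which case the commands are only printed.
remove_pkgs() {
    while IFS= read -r pkg; do
        [ -n "$pkg" ] || continue            # skip empty lines
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "would run: sudo apt-get remove -y $pkg"
        else
            # </dev/null keeps apt-get from consuming the package list
            sudo apt-get remove -y "$pkg" </dev/null
        fi
    done
}

# Usage (prints only; set DRY_RUN=0 to actually remove):
#   printf '%s\n' linux-image-3.16.0-4-amd64 | remove_pkgs
```

Running it once in the default dry-run mode through clush gives a per-host preview before anything is deleted.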

So, in jessie machines these are the kernel packages right now:

aborrero@tools-clushmaster-01:~$ clush -w @all "grep -q jessie /etc/os-release && dpkg -l | grep ^ii | grep linux-image || true" | awk -F' ' '{print $3}' | sort | uniq
linux-image-4.9.0-0.bpo.4-amd64
linux-image-4.9.0-0.bpo.5-amd64

After the cleanup, a report with apt-upgrade shows that indeed linux-image upgrades are legit:

aborrero@tools-clushmaster-01:~$ clush -w @all "grep -q jessie /etc/os-release && sudo apt-upgrade -n report | grep linux-image || true"
tools-docker-builder-05.tools.eqiad.wmflabs: jessie-backports: linux-image-4.9.0-0.bpo.4-amd64 4.9.51-1~bpo8+1 --> 4.9.65-3+deb9u1~bpo8+1 
tools-docker-registry-02.tools.eqiad.wmflabs: jessie-backports: linux-image-4.9.0-0.bpo.4-amd64 4.9.51-1~bpo8+1 --> 4.9.65-3+deb9u1~bpo8+1 
tools-worker-1012.tools.eqiad.wmflabs: jessie-backports: linux-image-4.9.0-0.bpo.4-amd64 4.9.51-1~bpo8+1 --> 4.9.65-3+deb9u1~bpo8+1 
tools-worker-1014.tools.eqiad.wmflabs: jessie-backports: linux-image-4.9.0-0.bpo.4-amd64 4.9.51-1~bpo8+1 --> 4.9.65-3+deb9u1~bpo8+1 
tools-worker-1017.tools.eqiad.wmflabs: jessie-backports: linux-image-4.9.0-0.bpo.4-amd64 4.9.51-1~bpo8+1 --> 4.9.65-3+deb9u1~bpo8+1 
tools-worker-1015.tools.eqiad.wmflabs: jessie-backports: linux-image-4.9.0-0.bpo.4-amd64 4.9.51-1~bpo8+1 --> 4.9.65-3+deb9u1~bpo8+1 
tools-worker-1013.tools.eqiad.wmflabs: jessie-backports: linux-image-4.9.0-0.bpo.4-amd64 4.9.51-1~bpo8+1 --> 4.9.65-3+deb9u1~bpo8+1 
tools-worker-1011.tools.eqiad.wmflabs: jessie-backports: linux-image-4.9.0-0.bpo.4-amd64 4.9.51-1~bpo8+1 --> 4.9.65-3+deb9u1~bpo8+1 
tools-worker-1016.tools.eqiad.wmflabs: jessie-backports: linux-image-4.9.0-0.bpo.4-amd64 4.9.51-1~bpo8+1 --> 4.9.65-3+deb9u1~bpo8+1 
tools-worker-1009.tools.eqiad.wmflabs: jessie-backports: linux-image-4.9.0-0.bpo.4-amd64 4.9.51-1~bpo8+1 --> 4.9.65-3+deb9u1~bpo8+1 
tools-worker-1018.tools.eqiad.wmflabs: jessie-backports: linux-image-4.9.0-0.bpo.4-amd64 4.9.51-1~bpo8+1 --> 4.9.65-3+deb9u1~bpo8+1 
tools-worker-1002.tools.eqiad.wmflabs: jessie-backports: linux-image-4.9.0-0.bpo.4-amd64 4.9.51-1~bpo8+1 --> 4.9.65-3+deb9u1~bpo8+1 
tools-worker-1019.tools.eqiad.wmflabs: jessie-backports: linux-image-4.9.0-0.bpo.4-amd64 4.9.51-1~bpo8+1 --> 4.9.65-3+deb9u1~bpo8+1 
tools-worker-1020.tools.eqiad.wmflabs: jessie-backports: linux-image-4.9.0-0.bpo.4-amd64 4.9.51-1~bpo8+1 --> 4.9.65-3+deb9u1~bpo8+1 
tools-worker-1027.tools.eqiad.wmflabs: jessie-backports: linux-image-4.9.0-0.bpo.4-amd64 4.9.51-1~bpo8+1 --> 4.9.65-3+deb9u1~bpo8+1 
tools-docker-registry-01.tools.eqiad.wmflabs: jessie-backports: linux-image-4.9.0-0.bpo.4-amd64 4.9.51-1~bpo8+1 --> 4.9.65-3+deb9u1~bpo8+1 
tools-worker-1025.tools.eqiad.wmflabs: jessie-backports: linux-image-4.9.0-0.bpo.4-amd64 4.9.51-1~bpo8+1 --> 4.9.65-3+deb9u1~bpo8+1 
tools-worker-1023.tools.eqiad.wmflabs: jessie-backports: linux-image-4.9.0-0.bpo.4-amd64 4.9.51-1~bpo8+1 --> 4.9.65-3+deb9u1~bpo8+1 
tools-worker-1021.tools.eqiad.wmflabs: jessie-backports: linux-image-4.9.0-0.bpo.4-amd64 4.9.51-1~bpo8+1 --> 4.9.65-3+deb9u1~bpo8+1 
tools-worker-1003.tools.eqiad.wmflabs: jessie-backports: linux-image-4.9.0-0.bpo.4-amd64 4.9.51-1~bpo8+1 --> 4.9.65-3+deb9u1~bpo8+1 
tools-logs-02.tools.eqiad.wmflabs: jessie-backports: linux-image-4.9.0-0.bpo.4-amd64 4.9.51-1~bpo8+1 --> 4.9.65-3+deb9u1~bpo8+1 
tools-worker-1001.tools.eqiad.wmflabs: jessie-backports: linux-image-4.9.0-0.bpo.4-amd64 4.9.51-1~bpo8+1 --> 4.9.65-3+deb9u1~bpo8+1 
tools-worker-1022.tools.eqiad.wmflabs: jessie-backports: linux-image-4.9.0-0.bpo.4-amd64 4.9.51-1~bpo8+1 --> 4.9.65-3+deb9u1~bpo8+1 
tools-worker-1006.tools.eqiad.wmflabs: jessie-backports: linux-image-4.9.0-0.bpo.4-amd64 4.9.51-1~bpo8+1 --> 4.9.65-3+deb9u1~bpo8+1 
tools-worker-1005.tools.eqiad.wmflabs: jessie-backports: linux-image-4.9.0-0.bpo.4-amd64 4.9.51-1~bpo8+1 --> 4.9.65-3+deb9u1~bpo8+1 
tools-package-builder-01.tools.eqiad.wmflabs: jessie-backports: linux-image-4.9.0-0.bpo.4-amd64 4.9.51-1~bpo8+1 --> 4.9.65-3+deb9u1~bpo8+1 
tools-prometheus-01.tools.eqiad.wmflabs: jessie-backports: linux-image-4.9.0-0.bpo.4-amd64 4.9.51-1~bpo8+1 --> 4.9.65-3+deb9u1~bpo8+1 
tools-worker-1007.tools.eqiad.wmflabs: jessie-backports: linux-image-4.9.0-0.bpo.4-amd64 4.9.51-1~bpo8+1 --> 4.9.65-3+deb9u1~bpo8+1 
tools-proxy-01.tools.eqiad.wmflabs: jessie-backports: linux-image-4.9.0-0.bpo.4-amd64 4.9.51-1~bpo8+1 --> 4.9.65-3+deb9u1~bpo8+1 
tools-worker-1008.tools.eqiad.wmflabs: jessie-backports: linux-image-4.9.0-0.bpo.4-amd64 4.9.51-1~bpo8+1 --> 4.9.65-3+deb9u1~bpo8+1 
tools-worker-1004.tools.eqiad.wmflabs: jessie-backports: linux-image-4.9.0-0.bpo.4-amd64 4.9.51-1~bpo8+1 --> 4.9.65-3+deb9u1~bpo8+1 
tools-worker-1026.tools.eqiad.wmflabs: jessie-backports: linux-image-4.9.0-0.bpo.4-amd64 4.9.51-1~bpo8+1 --> 4.9.65-3+deb9u1~bpo8+1 
tools-prometheus-02.tools.eqiad.wmflabs: jessie-backports: linux-image-4.9.0-0.bpo.4-amd64 4.9.51-1~bpo8+1 --> 4.9.65-3+deb9u1~bpo8+1

For stretch machines, kernel packages installed:

aborrero@tools-clushmaster-01:~$ clush -w @all "grep -q stretch /etc/os-release || exit 0 ;  dpkg -l | grep ^ii | grep linux-image" | awk -F' ' '{print $3}' | sort | uniq
linux-image-4.9.0-3-amd64
linux-image-4.9.0-5-amd64
linux-image-4.9.0-6-amd64
linux-image-amd64

Running linux kernel is the same in all the machines:

aborrero@tools-clushmaster-01:~$ clush -w @all "grep -q stretch /etc/os-release || exit 0 ;  uname -vr" | awk -F' ' '{print $2" "$6}' | sort | uniq
4.9.0-5-amd64 4.9.65-3+deb9u2

We will simply remove linux-image-4.9.0-6-amd64.

Mentioned in SAL (#wikimedia-cloud) [2018-03-05T14:33:58Z] <arturo> delete linux-image-4.9.0-6-amd64 package from stretch instances for T188911

After the operations in the stretch machines, there are no pending kernel upgrades:

aborrero@tools-clushmaster-01:~$ clush -w @all "grep -q stretch /etc/os-release || exit 0 ; sudo apt-upgrade -n report | grep linux-image || true"
<nothing>

Now with Ubuntu:

There are no pending kernel upgrades:

aborrero@tools-clushmaster-01:~$ clush -w @all "grep -q ubuntu /etc/os-release || exit 0 ;  sudo apt-upgrade -n report | grep linux-image || true" | sort

But there are some packages which don't serve any purpose:

aborrero@tools-clushmaster-01:~$ clush -w @all "grep -q ubuntu /etc/os-release || exit 0 ;  dpkg -l | grep ^ii | grep linux-image-[0-9]" | awk -F' ' '{print $3}' | sort | uniq
linux-image-3.13.0-137-generic
linux-image-3.13.0-139-generic
linux-image-3.13.0-141-generic
linux-image-3.13.0-142-generic

Since there are machines running -141 (our pinning) and others -139, let's just keep these two.

Mentioned in SAL (#wikimedia-cloud) [2018-03-06T11:36:02Z] <arturo> (ubuntu) removed linux-image-3.13.0-142-generic and linux-image-3.13.0-137-generic (T188911)

We are good to go:

aborrero@tools-clushmaster-01:~$ clush -w @all "grep -q ubuntu /etc/os-release || exit 0 ;  dpkg -l | grep ^ii | grep linux-image-[0-9]" | awk -F' ' '{print $3}' | sort | uniq
linux-image-3.13.0-139-generic
linux-image-3.13.0-141-generic