
Upgrade Traffic hosts to bullseye
Open, Medium, Public

Description

This task tracks the upgrade of the Traffic hosts to bullseye, affecting the services below (identified by their cumin aliases). There is no particular order, but we will do the cp hosts first to resolve T319067.

  • A:cp

    Debian packages upgraded:

      • varnish_6.0.10-1wm2
      • trafficserver_9.1.3-1wm3
      • fifo-log-demux_0.6.3
      • file-read-backwards_2.0.0-3
      • prometheus-rdkafka-exporter_0.3
      • python-logstash_0.4.6-3
      • prometheus-varnishkafka-exporter_0.1-2
      • varnishkafka_1.1.0-2
      • libvmod-netmapper_1.9-2
      • libvmod-querysort_0.3
      • purged_0.19
      • libvmod-re2_1.5.3-3
      • varnish-modules_0.15.0-2

    cp hosts running bullseye:

      • cp2041.codfw.wmnet
      • cp2042.codfw.wmnet

  • A:dns-auth
  • A:dns-rec
  • A:acmechief

    Debian packages upgraded:

      • acme-chief_0.36-1

  • A:ncredir
  • A:lvs

Wikidough and durum are covered by their own separate task: T305589.

This is meant to be an umbrella task for all changes that will be part of this upgrade, such as the Debian packaging, Puppet changes, and the related testing, including reimaging.

Details

Project                                           Branch      Lines +/-
operations/software/acme-chief                    debian      +9 -16
operations/software/acme-chief                    debian      +2 -2
operations/software/acme-chief                    debian      +24 -79
operations/software/acme-chief                    master      +2 -2
operations/software/acme-chief                    master      +24 -79
integration/config                                master      +1 -1
integration/config                                master      +8 -2
operations/software/acme-chief                    debian      +12 -17
integration/config                                master      +1 -0
operations/puppet                                 production  +5 -5
operations/puppet                                 production  +2 -0
operations/debs/varnish-modules                   master      +466 -0
operations/puppet                                 production  +1 -1
operations/puppet                                 production  +2 -1
operations/puppet                                 production  +4 -24
operations/puppet                                 production  +6 -6
operations/software/varnish/libvmod-re2           debian-6.0  +93 -3
operations/software/purged                        master      +468 -30
operations/software/varnish/libvmod-re2           debian-6.0  +15 -5
operations/software/varnish/libvmod-querysort     main        +10 -1
operations/software/varnish/libvmod-netmapper     debian      +18 -5
operations/software/varnish/varnishkafka          debian      +13 -4
operations/puppet                                 production  +0 -25
operations/debs/varnish4                          debian-wmf  +12 -3
operations/puppet                                 production  +13 -0
operations/debs/prometheus-varnishkafka-exporter  master      +13 -3
operations/debs/file-read-backwards               debian      +13 -9
operations/software/prometheus-rdkafka-exporter   master      +11 -3
operations/debs/python-logstash                   master      +13 -16
operations/software/fifo-log-demux                master      +13 -3
operations/puppet                                 production  +1 -0
operations/debs/trafficserver                     master      +13 -2
operations/puppet                                 production  +1 -1
operations/puppet                                 production  +11 -0

Event Timeline


Change 852234 merged by Ssingh:

[operations/debs/file-read-backwards@debian] Release 2.0.0-3

https://gerrit.wikimedia.org/r/852234

Change 852897 merged by Ssingh:

[operations/puppet@production] package_builder: add hook for varnish6 (bullseye)

https://gerrit.wikimedia.org/r/852897

Change 852886 merged by Ssingh:

[operations/debs/prometheus-varnishkafka-exporter@master] Release 0.1-2

https://gerrit.wikimedia.org/r/852886

Change 849644 merged by Ssingh:

[operations/debs/varnish4@debian-wmf] Release 6.0.10-1wm2

https://gerrit.wikimedia.org/r/849644

Mentioned in SAL (#wikimedia-operations) [2022-11-03T15:54:40Z] <sukhe> sudo -i reprepro -C main include bullseye-wikimedia varnish_6.0.10-1wm2_amd64.changes: T321309

Mentioned in SAL (#wikimedia-operations) [2022-11-03T16:18:03Z] <sukhe> reprepro -C main include bullseye-wikimedia trafficserver_9.1.3-1wm3_amd64.changes: T321309

Mentioned in SAL (#wikimedia-operations) [2022-11-04T13:09:56Z] <sukhe> reprepro -C main include bullseye-wikimedia fifo-log-demux_0.6.3_amd64.changes: T321309

Mentioned in SAL (#wikimedia-operations) [2022-11-04T13:10:39Z] <sukhe> reprepro -C main include bullseye-wikimedia file-read-backwards_2.0.0-3_amd64.changes: T321309

Mentioned in SAL (#wikimedia-operations) [2022-11-04T13:11:28Z] <sukhe> reprepro -C main include bullseye-wikimedia prometheus-rdkafka-exporter_0.3_amd64.changes: T321309

Mentioned in SAL (#wikimedia-operations) [2022-11-04T13:17:04Z] <sukhe> reprepro -C main include bullseye-wikimedia python-logstash_0.4.6-3_amd64.changes: T321309

Change 853954 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] package_builder: remove deprecated Varnish6 hooks

https://gerrit.wikimedia.org/r/853954

Mentioned in SAL (#wikimedia-operations) [2022-11-07T12:13:53Z] <sukhe> reprepro -C main include bullseye-wikimedia prometheus-varnishkafka-exporter_0.1-2_amd64.changes: T321309

Change 853962 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/software/varnish/libvmod-netmapper@debian] Release 1.9-2

https://gerrit.wikimedia.org/r/853962

Change 853967 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/software/purged@master] Release 0.19

https://gerrit.wikimedia.org/r/853967

Change 853974 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/software/varnish/libvmod-re2@debian-6.0] Release 1.5.3-2

https://gerrit.wikimedia.org/r/853974

Change 853987 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/software/varnish/varnishkafka@debian] Release 1.1.0-2

https://gerrit.wikimedia.org/r/853987

Change 853954 merged by Ssingh:

[operations/puppet@production] package_builder: remove deprecated Varnish6 hooks

https://gerrit.wikimedia.org/r/853954

Change 853987 merged by Ssingh:

[operations/software/varnish/varnishkafka@debian] Release 1.1.0-2

https://gerrit.wikimedia.org/r/853987

Mentioned in SAL (#wikimedia-operations) [2022-11-07T15:09:36Z] <sukhe> reprepro -C main include bullseye-wikimedia varnishkafka_1.1.0-2_amd64.changes: T321309

Change 853962 merged by Ssingh:

[operations/software/varnish/libvmod-netmapper@debian] Release 1.9-2

https://gerrit.wikimedia.org/r/853962

Mentioned in SAL (#wikimedia-operations) [2022-11-07T15:18:54Z] <sukhe> reprepro -C main include bullseye-wikimedia libvmod-netmapper_1.9-2_amd64.changes: T321309

Change 854028 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/software/varnish/libvmod-querysort@main] Release 0.3

https://gerrit.wikimedia.org/r/854028

Change 854028 merged by Ssingh:

[operations/software/varnish/libvmod-querysort@main] Release 0.3

https://gerrit.wikimedia.org/r/854028

Mentioned in SAL (#wikimedia-operations) [2022-11-07T15:44:16Z] <sukhe> reprepro -C main include bullseye-wikimedia libvmod-querysort_0.3_amd64.changes: T321309

Change 853974 merged by Ssingh:

[operations/software/varnish/libvmod-re2@debian-6.0] Release 1.5.3-2

https://gerrit.wikimedia.org/r/853974

Change 853967 merged by Vgutierrez:

[operations/software/purged@master] Release 0.19

https://gerrit.wikimedia.org/r/853967

Mentioned in SAL (#wikimedia-operations) [2022-11-07T17:22:45Z] <sukhe> reprepro -C main include bullseye-wikimedia purged_0.19_amd64.changes: T321309

Change 854063 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/software/varnish/libvmod-re2@debian-6.0] Release 1.5.3-3

https://gerrit.wikimedia.org/r/854063

Change 854063 merged by Ssingh:

[operations/software/varnish/libvmod-re2@debian-6.0] Release 1.5.3-3

https://gerrit.wikimedia.org/r/854063

Mentioned in SAL (#wikimedia-operations) [2022-11-08T12:27:19Z] <sukhe> reprepro -C main include bullseye-wikimedia libvmod-re2_1.5.3-3_amd64.changes: T321309

Change 854607 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] varnish::common: set Python version for bullseye

https://gerrit.wikimedia.org/r/854607

Change 854608 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] sslcert: refactor update-ocsp.py to Python 3

https://gerrit.wikimedia.org/r/854608

The install of the updated trafficserver package for bullseye fails on a bullseye host with the message:

The following packages have unmet dependencies:
 trafficserver : Depends: libhwloc15 (>= 2.8.0) but 2.4.1+dfsg-1 is to be installed
E: Unable to correct problems, you have held broken packages.

This happens because, per the packaging instructions, we build ATS with BACKPORTS=yes. As a result, version 2.8.0-1~bpo11+1 from bullseye-backports takes precedence over 2.4.1+dfsg-1 from bullseye at build time (see the D02backports pbuilder hook), so the resulting package depends on libhwloc15 (>= 2.8.0), which a plain bullseye host cannot satisfy.

I am not aware of the reasons why we build with BACKPORTS=yes, but just to confirm that there are no other differences:

$ debdiff trafficserver-backport.deb trafficserver-nobackports.deb  
File lists identical (after any substitutions)

Control files: lines which differ (wdiff format)
------------------------------------------------
Depends: libbrotli1 (>= 0.6.0), libc6 (>= 2.29), libcap2 (>= 1:2.10), libcurl4 (>= 7.16.2), libgcc-s1 (>= 3.4), libhwloc15 (>= [-2.8.0),-] {+2.4.1+dfsg),+} libluajit-5.1-2 (>= 2.0.4+dfsg), liblzma5 (>= 5.1.1alpha+20110809), libmaxminddb0 (>= 1.0.2), libncursesw6 (>= 6), libpcre3, libssl1.1 (>= 1.1.1), libstdc++6 (>= 9), libtinfo6 (>= 6), libunwind8, zlib1g (>= 1:1.2.0), lsb-base, adduser, perl:any

This means it should be safe to build without backports, which should resolve the failing trafficserver install.
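To make the single differing constraint easier to spot, the Depends comparison can be approximated with a short Python sketch. This is only an illustration, not how debdiff works internally: the helper names `parse_depends` and `diff_depends` are hypothetical, the two input strings are shortened from the control fields shown above, and the parsing is deliberately simplified (it does not handle `|` alternatives).

```python
import re


def parse_depends(field: str) -> dict:
    """Map each package name in a Depends field to its version constraint ('' if none)."""
    deps = {}
    for item in field.split(","):
        item = item.strip()
        if not item:
            continue
        # "name (constraint)" or just "name"; simplified, no "|" alternatives.
        match = re.fullmatch(r"(\S+)(?:\s*\((.*)\))?", item)
        if match:
            deps[match.group(1)] = match.group(2) or ""
    return deps


def diff_depends(old: str, new: str) -> dict:
    """Return {package: (old_constraint, new_constraint)} for entries that differ."""
    old_deps, new_deps = parse_depends(old), parse_depends(new)
    return {
        name: (old_deps.get(name), new_deps.get(name))
        for name in sorted(set(old_deps) | set(new_deps))
        if old_deps.get(name) != new_deps.get(name)
    }


# Shortened excerpts of the two Depends fields compared above.
backports = "libc6 (>= 2.29), libhwloc15 (>= 2.8.0), libpcre3, perl:any"
plain = "libc6 (>= 2.29), libhwloc15 (>= 2.4.1+dfsg), libpcre3, perl:any"

print(diff_depends(backports, plain))
# {'libhwloc15': ('>= 2.8.0', '>= 2.4.1+dfsg')}
```

Only libhwloc15 is reported, which matches the debdiff output: the backports build differs from the plain bullseye build in that one version constraint.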

I am not aware of the reasons why we build with BACKPORTS=yes, but just to confirm that there are no other differences:

Probably because some needed dependency wasn't in plain buster. But under the backports policy, every package landing in buster-backports must already be in testing, so there is no version in buster-backports that isn't also in bullseye. As such, you can safely build without BACKPORTS=yes and remove it from the docs for bullseye.

Thanks for confirming, Moritz!

Mentioned in SAL (#wikimedia-operations) [2022-11-09T14:43:25Z] <sukhe> reprepro remove bullseye-wikimedia trafficserver: T321309

Change 854608 merged by Ssingh:

[operations/puppet@production] sslcert: refactor update-ocsp.py to Python 3

https://gerrit.wikimedia.org/r/854608

Change 854607 merged by Ssingh:

[operations/puppet@production] varnish::common: set Python version for bullseye

https://gerrit.wikimedia.org/r/854607

Change 855991 had a related patch set uploaded (by Ssingh; author: Ssingh):

[integration/config@master] zuul: configure CI for operations/debs/varnish-modules

https://gerrit.wikimedia.org/r/855991

Change 855996 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/debs/varnish-modules@master] Release 0.15.0-2

https://gerrit.wikimedia.org/r/855996

@ssingh Hi! I am seeing the following error for update-ocsp-all on various cp nodes:

Nov 13 07:23:00 cp1077 update-ocsp-all[42808]: /usr/local/sbin/update-ocsp:52: DeprecationWarning: The SafeConfigParser class has been renamed to ConfigParser in Python 3.2. This alias will be removed in future versions. Use ConfigParser directly instead.
Nov 13 07:23:00 cp1077 update-ocsp-all[42808]:   config = configparser.SafeConfigParser()
Nov 13 07:23:00 cp1077 update-ocsp-all[42808]: Traceback (most recent call last):
Nov 13 07:23:00 cp1077 update-ocsp-all[42808]:   File "/usr/local/sbin/update-ocsp", line 290, in <module>
Nov 13 07:23:00 cp1077 update-ocsp-all[42808]:     main()
Nov 13 07:23:00 cp1077 update-ocsp-all[42808]:   File "/usr/local/sbin/update-ocsp", line 283, in main
Nov 13 07:23:00 cp1077 update-ocsp-all[42808]:     certs_fetch_ocsp(out_tempfile, args)
Nov 13 07:23:00 cp1077 update-ocsp-all[42808]:   File "/usr/local/sbin/update-ocsp", line 170, in certs_fetch_ocsp
Nov 13 07:23:00 cp1077 update-ocsp-all[42808]:     issuer_path = cert_get_issuer_filename(certs[0], cadir)
Nov 13 07:23:00 cp1077 update-ocsp-all[42808]:   File "/usr/local/sbin/update-ocsp", line 136, in cert_get_issuer_filename
Nov 13 07:23:00 cp1077 update-ocsp-all[42808]:     issuer_subject = cert_x509_option_kv(cert, "issuer")
Nov 13 07:23:00 cp1077 update-ocsp-all[42808]:   File "/usr/local/sbin/update-ocsp", line 124, in cert_x509_option_kv
Nov 13 07:23:00 cp1077 update-ocsp-all[42808]:     k, v = cert_x509_option(filename, attrib).split("=", 1)
Nov 13 07:23:00 cp1077 update-ocsp-all[42808]: TypeError: a bytes-like object is required, not 'str'
Nov 13 07:23:00 cp1077 update-ocsp-all[42808]: OCSP update failed for /etc/update-ocsp.d/digicert-2022-rsa-unified.conf

Seems to be one of the usual annoyances from Py2 -> Py3 porting, maybe related to https://gerrit.wikimedia.org/r/c/operations/puppet/+/854608 ?
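The failure mode in the traceback can be reproduced in isolation. The following is a minimal sketch, not the actual update-ocsp code: the child process is a stand-in (an assumption) for whatever command the script runs to read the certificate issuer, and `run_bytes` / `run_text` are hypothetical helper names. It shows that in Python 3 `Popen.communicate()` returns bytes by default, so splitting on a `str` separator raises the same TypeError, while passing `text=True` (Python 3.7+) returns `str`.

```python
import subprocess
import sys

# Stand-in for the real external command (assumption): a child process
# that prints a single "key=value" line.
CMD = [sys.executable, "-c", "print('issuer=CN = Example CA')"]


def run_bytes(cmd):
    """Python 3 default: communicate() returns bytes."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    out, _ = proc.communicate()
    return out


def run_text(cmd):
    """With text=True, communicate() returns decoded str."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
    out, _ = proc.communicate()
    return out


try:
    # Splitting bytes on a str separator fails, as in the traceback above.
    run_bytes(CMD).split("=", 1)
except TypeError as exc:
    print("bytes output:", exc)

# With text output the same split works.
key, value = run_text(CMD).strip().split("=", 1)
print(key, "->", value)
```

Calling `.decode()` on the bytes output would work as well; `text=True` simply keeps the decoding at the point where the process is created.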

Change 856126 had a related patch set uploaded (by Elukey; author: Elukey):

[operations/puppet@production] sslcert: add decode() after popen's communicate to update-ocsp.py

https://gerrit.wikimedia.org/r/856126

Change 856126 merged by Vgutierrez:

[operations/puppet@production] sslcert: add text=True to Popen for update-ocsp.py

https://gerrit.wikimedia.org/r/856126

Change 856483 had a related patch set uploaded (by Vgutierrez; author: Vgutierrez):

[operations/puppet@production] sslcert::update-ocsp: Stop using SafeConfigParser

https://gerrit.wikimedia.org/r/856483

Change 856483 merged by Vgutierrez:

[operations/puppet@production] sslcert::update-ocsp: Stop using SafeConfigParser

https://gerrit.wikimedia.org/r/856483
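For reference, the DeprecationWarning quoted earlier goes away when the parser is created with `configparser.ConfigParser` instead of the `SafeConfigParser` alias (the alias was removed in Python 3.12). A minimal sketch follows; the section name, key, and path are hypothetical and do not reflect the real /etc/update-ocsp.d/ file format.

```python
import configparser

# Hypothetical configuration content, for illustration only.
SAMPLE = """
[example]
certificate = /path/to/cert.pem
"""

# ConfigParser is the direct replacement for the deprecated SafeConfigParser alias.
config = configparser.ConfigParser()
config.read_string(SAMPLE)

print(config.get("example", "certificate"))
```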

Change 855996 merged by Ssingh:

[operations/debs/varnish-modules@master] Release 0.15.0-2

https://gerrit.wikimedia.org/r/855996

Mentioned in SAL (#wikimedia-operations) [2022-11-14T16:03:16Z] <sukhe> reprepro -C main include bullseye-wikimedia varnish-modules_0.15.0-2_amd64.changes: T321309

Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp2042.codfw.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp2042.codfw.wmnet with OS bullseye executed with errors:

  • cp2042 (FAIL)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • The reimage failed, see the cookbook logs for the details

Change 857562 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Pull in the fdisk-udeb in d-i

https://gerrit.wikimedia.org/r/857562

Change 857562 merged by Muehlenhoff:

[operations/puppet@production] Pull in the fdisk-udeb in d-i

https://gerrit.wikimedia.org/r/857562

Change 857623 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] Update check_fresh_files_in_dir for python3

https://gerrit.wikimedia.org/r/857623

Change 857623 merged by Vgutierrez:

[operations/puppet@production] monitoring: Update check_fresh_files_in_dir for python3

https://gerrit.wikimedia.org/r/857623

Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin1001 for host cp2041.codfw.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by brett@cumin1001 for host cp2041.codfw.wmnet with OS bullseye executed with errors:

  • cp2041 (FAIL)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • The reimage failed, see the cookbook logs for the details

Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp2041.codfw.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp2041.codfw.wmnet with OS bullseye executed with errors:

  • cp2041 (FAIL)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • The reimage failed, see the cookbook logs for the details

Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin1001 for host cp2041.codfw.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by brett@cumin1001 for host cp2041.codfw.wmnet with OS bullseye executed with errors:

  • cp2041 (FAIL)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • The reimage failed, see the cookbook logs for the details

Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp2041.codfw.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp2041.codfw.wmnet with OS bullseye executed with errors:

  • cp2041 (FAIL)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • The reimage failed, see the cookbook logs for the details

Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp2041.codfw.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp2041.codfw.wmnet with OS bullseye executed with errors:

  • cp2041 (FAIL)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • The reimage failed, see the cookbook logs for the details

Change 855991 merged by jenkins-bot:

[integration/config@master] zuul: configure CI for operations/debs/varnish-modules

https://gerrit.wikimedia.org/r/855991

Mentioned in SAL (#wikimedia-releng) [2022-11-23T15:07:38Z] <James_F> Zuul: configure CI for operations/debs/varnish-modules for T321309

Change 860612 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/software/acme-chief@debian] Release 0.35-2

https://gerrit.wikimedia.org/r/860612

Change 860612 abandoned by Ssingh:

[operations/software/acme-chief@debian] Release 0.35-2

Reason:

following up with a commit for the setup.py dependencies first

https://gerrit.wikimedia.org/r/860612

Change 860637 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/software/acme-chief@master] setup.py: update dependencies for bullseye

https://gerrit.wikimedia.org/r/860637

Change 860839 had a related patch set uploaded (by Vgutierrez; author: Vgutierrez):

[integration/config@master] dockerfiles: Use tox-buster on tox-acme-chief

https://gerrit.wikimedia.org/r/860839

Change 860847 had a related patch set uploaded (by Vgutierrez; author: Vgutierrez):

[integration/config@master] jjb: Use tox-acme-chief:0.6.1 on acme-chief job

https://gerrit.wikimedia.org/r/860847

Change 860839 merged by jenkins-bot:

[integration/config@master] dockerfiles: Use tox-buster on tox-acme-chief

https://gerrit.wikimedia.org/r/860839

Change 860847 merged by jenkins-bot:

[integration/config@master] jjb: Use tox-acme-chief:0.7.0 on acme-chief job

https://gerrit.wikimedia.org/r/860847

Change 860637 merged by Vgutierrez:

[operations/software/acme-chief@master] setup.py: update dependencies for bullseye

https://gerrit.wikimedia.org/r/860637

Change 863028 had a related patch set uploaded (by Vgutierrez; author: Vgutierrez):

[operations/software/acme-chief@master] Release 0.36

https://gerrit.wikimedia.org/r/863028

Change 863028 merged by Vgutierrez:

[operations/software/acme-chief@master] Release 0.36

https://gerrit.wikimedia.org/r/863028

Change 863231 had a related patch set uploaded (by Vgutierrez; author: Ssingh):

[operations/software/acme-chief@debian] setup.py: update dependencies for bullseye

https://gerrit.wikimedia.org/r/863231

Change 863232 had a related patch set uploaded (by Vgutierrez; author: Vgutierrez):

[operations/software/acme-chief@debian] Release 0.36

https://gerrit.wikimedia.org/r/863232

Change 863233 had a related patch set uploaded (by Vgutierrez; author: Vgutierrez):

[operations/software/acme-chief@debian] debian: Add release 0.36 to changelog

https://gerrit.wikimedia.org/r/863233

Change 863231 merged by Vgutierrez:

[operations/software/acme-chief@debian] setup.py: update dependencies for bullseye

https://gerrit.wikimedia.org/r/863231

Change 863232 merged by Vgutierrez:

[operations/software/acme-chief@debian] Release 0.36

https://gerrit.wikimedia.org/r/863232

Change 863233 merged by Vgutierrez:

[operations/software/acme-chief@debian] debian: Add release 0.36 to changelog

https://gerrit.wikimedia.org/r/863233

Mentioned in SAL (#wikimedia-operations) [2022-12-02T10:01:46Z] <vgutierrez> upload acme-chief 0.36 to apt.wm.o (bullseye) - T321309

One thing to keep in mind for the LVSes is that Bullseye includes Python 2 only as a build dependency, not at run time: at the time of the release, some crucial packages (most notably Chromium and the QtWebKit engine derived from it) still needed Python 2 for their build systems. Currently we ensure Python 2 is absent in https://github.com/wikimedia/puppet/blob/production/modules/base/manifests/standard_packages.pp#L60

So moving the LVSes with the current stack will require either T200319 or building/importing, in a separate component, some of the Python 2 reverse dependencies that are no longer found in Bullseye (such as twisted and pyopenssl).