
Replace deployment-imagescaler03 (stretch) with deployment-imagescaler04 (buster)
Open, Needs Triage, Public

Description

In addition to the general desire to get off of stretch, deployment-imagescaler03 is an outlier in deployment-prep in that I can't upgrade scap on it.

Event Timeline

Change 733033 had a related patch set uploaded (by Ahmon Dancy; author: Ahmon Dancy):

[operations/puppet@production] thumbor: Remove conditionalization for stretch

https://gerrit.wikimedia.org/r/733033

Change 733033 merged by Alexandros Kosiaris:

[operations/puppet@production] thumbor: Remove conditionalization for stretch

https://gerrit.wikimedia.org/r/733033

Outstanding issues when running puppet agent -t:

#1

Error: /Stage[main]/Thumbor/Package[python-logstash]/ensure: change from 'purged' to 'present' failed: Execution of '/usr/bin/apt-get -q -y -o DPkg::Options::=--force-confold install python-logstash' returned 100: Reading package lists...
Building dependency tree...
Reading state information...
E: Unable to locate package python-logstash

Referenced by operations/puppet/modules/thumbor/manifests/init.pp:47

Looks like there is a Stretch package in http://apt.wikimedia.org/wikimedia but not one for Buster.

#2

Error: /Stage[main]/Threedtopng::Deploy/Package[nodejs-legacy]/ensure: change from 'purged' to 'present' failed: Execution of '/usr/bin/apt-get -q -y -o DPkg::Options::=--force-confold install nodejs-legacy' returned 100: Reading package lists...
Building dependency tree...
Reading state information...
Package nodejs-legacy is not available, but is referred to by another package.
This may mean that the package is missing, has been obsoleted, or
is only available from another source
However the following packages replace it:
  nodejs libnode64

E: Package 'nodejs-legacy' has no installation candidate

Referenced by class threedtopng::deploy in operations/puppet/modules/threedtopng/manifests/deploy.pp, referenced by operations/puppet/modules/role/manifests/thumbor/mediawiki.pp

Change 734694 had a related patch set uploaded (by Ahmon Dancy; author: Ahmon Dancy):

[operations/puppet@production] threedtopng::deploy: Only install nodejs-legacy on Stretch

https://gerrit.wikimedia.org/r/734694

Change 734694 merged by RLazarus:

[operations/puppet@production] threedtopng::deploy: Only install nodejs-legacy on Stretch

https://gerrit.wikimedia.org/r/734694
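
For readers unfamiliar with this kind of fix: the change above restricts nodejs-legacy to Stretch. The real logic lives in Puppet; the Python below is only a rough sketch of the same decision. The function name and the assumption that plain nodejs is wanted on every release are illustrative, not taken from the patch.

```python
def threedtopng_packages(codename):
    """Return the packages to install for a Debian release codename.

    Hypothetical helper: it mirrors the idea of the Puppet change,
    not its actual code.
    """
    # Assumption: plain nodejs is wanted on every release.
    packages = ["nodejs"]
    # nodejs-legacy has no installation candidate on Buster (see the
    # apt output above), so only request it on Stretch.
    if codename == "stretch":
        packages.append("nodejs-legacy")
    return packages


if __name__ == "__main__":
    for release in ("stretch", "buster"):
        print(release, threedtopng_packages(release))
```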

Change 734712 had a related patch set uploaded (by Ahmon Dancy; author: Ahmon Dancy):

[operations/puppet@production] Thumbor: Choose python-logstash package based on distro

https://gerrit.wikimedia.org/r/734712

Change 734712 merged by Dzahn:

[operations/puppet@production] Thumbor: Choose python-logstash package based on distro

https://gerrit.wikimedia.org/r/734712

puppet agent -t runs without errors on deployment-imagescaler04 now.

Next up: Figure out how to test it.

Using the python3-logstash package didn't work because thumbor itself is using python 2.

Image scalers are still running stretch in production: T216815: Upgrade Thumbor to Buster. If the goal is to do this in prep for upgrading in production, full support from me. But otherwise IMO beta should try to match production right? AIUI thumbor is abandoned in production (and maybe upstream too), and the lack of an owner is blocking the upgrade. I'm mostly unsure of how many other issues you're going to hit along the way too.

deployment-imagescaler03 is an outlier in deployment-prep in that I can't upgrade scap on it.

Is this not an issue in production too then?

Image scalers are still running stretch in production: T216815: Upgrade Thumbor to Buster. If the goal is to do this in prep for upgrading in production, full support from me.

That was not my original motivation but it seemed like a nice side effect.

But otherwise IMO beta should try to match production right?

Functionally yes. I don't think that means that the Debian release needs to match. That's just my opinion, of course. I don't know if that collides with existing policies.

AIUI thumbor is abandoned in production (and maybe upstream too), and the lack of an owner is blocking the upgrade.

Can you clarify what you mean by abandoned in production? I'm hoping that this means thumbor is unused in production, but I don't think that's what you mean.

I'm mostly unsure of how many other issues you're going to hit along the way too.

I assume it will be a process. :-)

deployment-imagescaler03 is an outlier in deployment-prep in that I can't upgrade scap on it.

Is this not an issue in production too then?

That's a good question. Can you log into a production image scaler host and check the version of scap?

Btw, here's what happens when I try apt upgrade scap on deployment-imagescaler03.deployment-prep.eqiad.wmflabs:

# apt upgrade scap
Reading package lists... Done
Building dependency tree
Reading state information... Done
Calculating upgrade... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 scap : Depends: python3-sphinx but it is not going to be installed
        Depends: python3-sphinxcontrib.actdiag but it is not going to be installed
        Depends: python3-sphinxcontrib.blockdiag but it is not going to be installed
        Depends: python3-sphinxcontrib.programoutput but it is not going to be installed
E: Broken packages

I wasn't able to come up with any combination of commands to work around this. I assume the issue was with it running Stretch but perhaps that's totally wrong.
Advice is welcome!
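
When comparing output like this across hosts, it can help to pull out just the package names. A minimal sketch, assuming the apt output keeps the "pkg : Depends: name ..." layout shown above; the function name is made up for illustration.

```python
import re
from typing import Dict, List


def unmet_dependencies(apt_output: str) -> Dict[str, List[str]]:
    """Map each package to the dependencies apt listed under it."""
    result: Dict[str, List[str]] = {}
    current = None
    for line in apt_output.splitlines():
        # The first line of a group looks like " scap : Depends: foo ...";
        # continuation lines start directly with "Depends:".
        match = re.match(r"^\s*(?:(\S+)\s*:\s*)?Depends:\s*(\S+)", line)
        if not match:
            continue
        if match.group(1):
            current = match.group(1)
        if current is not None:
            result.setdefault(current, []).append(match.group(2))
    return result


SAMPLE = """\
The following packages have unmet dependencies:
 scap : Depends: python3-sphinx but it is not going to be installed
        Depends: python3-sphinxcontrib.actdiag but it is not going to be installed
        Depends: python3-sphinxcontrib.blockdiag but it is not going to be installed
        Depends: python3-sphinxcontrib.programoutput but it is not going to be installed
E: Broken packages
"""

if __name__ == "__main__":
    for package, deps in unmet_dependencies(SAMPLE).items():
        print(package, "->", ", ".join(deps))
```

This only extracts names. It does not explain why apt refuses to install them.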

Functionally yes. I don't think that means that the Debian release needs to match. That's just my opinion, of course. I don't know if that collides with existing policies.

Mostly the versions of imagemagick, rsvg, etc. will be different. But they'll be newer, so that's probably a good thing. And given that no one is really touching thumbor anyways, it's unlikely that we'd miss some regression in prod that could've been caught in beta. So +1 from me now that I've thought about it a bit more :)

AIUI thumbor is abandoned in production (and maybe upstream too), and the lack of an owner is blocking the upgrade.

Can you clarify what you mean by abandoned in production? I'm hoping that this means thumbor is unused in production, but I don't think that's what you mean.

There is no WMF team (or volunteers with appropriate access) responsible for managing thumbor or fixing non-emergency bugs (which SRE will end up doing). It's very much core functionality, still serving all thumbnails. Just a failure of WMF prioritization/resourcing.

deployment-imagescaler03 is an outlier in deployment-prep in that I can't upgrade scap on it.

Is this not an issue in production too then?

That's a good question. Can you log into a production image scaler host and check the version of scap?

All thumbor servers are running scap 4.0.0-1. All MW servers appear to be running 4.0.2-1. AFAIS the newer scap release hasn't been built/uploaded to stretch-wikimedia.
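
To make the version gap explicit, here is a simplified comparison of the two version strings. It is a sketch only: it handles plain numeric "upstream-revision" versions like the ones above and is not the full Debian ordering (no epochs, tildes, or letters). `dpkg --compare-versions` is the authoritative tool. The helper name is illustrative.

```python
import re


def version_key(version: str):
    """Split 'upstream-revision' into tuples of ints for a rough comparison."""
    upstream, _, revision = version.rpartition("-")
    if not upstream:
        # No '-' present: treat the whole string as the upstream part.
        upstream, revision = revision, "0"

    def to_ints(text):
        return tuple(int(number) for number in re.findall(r"\d+", text))

    return to_ints(upstream), to_ints(revision)


if __name__ == "__main__":
    thumbor_scap = "4.0.0-1"    # reported on the thumbor servers
    mediawiki_scap = "4.0.2-1"  # reported on the MW servers
    print(version_key(thumbor_scap) < version_key(mediawiki_scap))
```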

Btw, here's what happens when I try apt upgrade scap on deployment-imagescaler03.deployment-prep.eqiad.wmflabs:

The following packages have unmet dependencies:
 scap : Depends: python3-sphinx but it is not going to be installed
        Depends: python3-sphinxcontrib.actdiag but it is not going to be installed
        Depends: python3-sphinxcontrib.blockdiag but it is not going to be installed
        Depends: python3-sphinxcontrib.programoutput but it is not going to be installed
E: Broken packages

Well all of those seem available in stretch, so something else is wrong. But why does scap depend upon sphinx in the first place? It should only be a build-time dependency. I'll look into the packaging a bit closer.

Well all of those seem available in stretch, so something else is wrong. But why does scap depend upon sphinx in the first place? It should only be a build-time dependency. I'll look into the packaging a bit closer.

Thanks. In this case it's trying to install a scap that was built by https://integration.wikimedia.org/ci/view/Beta/job/beta-build-scap-deb/
The code to build that is defined in integration/config/jjb/beta.yaml:42

The Depends section of mediawiki/tools/scap/debian/control does have these entries:

python3-sphinx,
python3-sphinxcontrib.actdiag,
python3-sphinxcontrib.blockdiag,
python3-sphinxcontrib.programoutput,

It would be great if they turned out not to be necessary.
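
A quick way to spot documentation tooling in a runtime Depends field is sketched below. The sample stanza is hypothetical apart from the four entries quoted above, and the "python3-sphinx" prefix check is just a heuristic chosen for this example.

```python
def field_entries(stanza: str, field: str):
    """Return the comma-separated entries of one field in a control stanza."""
    collected = []
    in_field = False
    for line in stanza.splitlines():
        # Continuation lines of a multi-line field start with whitespace.
        if line[:1] in (" ", "\t"):
            if in_field:
                collected.append(line.strip())
            continue
        name, _, value = line.partition(":")
        in_field = name.strip().lower() == field.lower()
        if in_field:
            collected.append(value.strip())
    joined = " ".join(collected)
    return [entry.strip() for entry in joined.split(",") if entry.strip()]


def doc_build_tools(entries):
    """Heuristic: Sphinx packages are usually only needed at build time."""
    return [entry for entry in entries if entry.startswith("python3-sphinx")]


# Hypothetical stanza; only the four Depends entries come from the thread.
SAMPLE_CONTROL = """\
Package: scap
Architecture: all
Depends:
 python3-sphinx,
 python3-sphinxcontrib.actdiag,
 python3-sphinxcontrib.blockdiag,
 python3-sphinxcontrib.programoutput,
Description: hypothetical stanza for illustration
"""

if __name__ == "__main__":
    runtime = field_entries(SAMPLE_CONTROL, "Depends")
    print(doc_build_tools(runtime))
```

Entries flagged this way are candidates for moving to Build-Depends, not proof that they are unused at runtime.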

Change 734774 had a related patch set uploaded (by Legoktm; author: Legoktm):

[mediawiki/tools/scap@master] debian: Remove build dependencies from scap binary package

https://gerrit.wikimedia.org/r/734774

Change 734774 merged by jenkins-bot:

[mediawiki/tools/scap@master] debian: Remove build dependencies from scap binary package

https://gerrit.wikimedia.org/r/734774

I was able to install the latest scap deb (built by beta-build-scap-deb) on deployment-imagescaler03.deployment-prep.eqiad.wmflabs after @Legoktm's mods to Scap.

The original motivation for this ticket has been resolved in a simpler way, so marking this ticket as declined. Thanks @Legoktm !

Andrew added a subscriber: Andrew.

deployment-imagescaler03.deployment-prep.eqiad.wmflabs is running Debian Stretch and so needs to be replaced with a Buster or Bullseye VM. See T306068 for context -- this is the only remaining Stretch VM in deployment-prep.

This comment was removed by Andrew.

sorry, wrong window! This still needs to be replaced. I may shut down the stretch VM soon in order to get attention to this task.

I may shut down the stretch VM soon in order to get attention to this task.

I think this is a reasonable forward step.

dancy removed dancy as the assignee of this task. Tue, Jan 17, 11:17 PM

Unassigning myself since I don't have any actions on this.

This VM is shut off. We'll see what comes.

Mentioned in SAL (#wikimedia-cloud) [2023-01-18T15:31:02Z] <andrewbogott> shutting down deployment-imagescaler03 as it is long-overdue for replacement. See T294148 for details.