Update jdk17 toolforge image to 17.0.4
Closed, Resolved (Public)

Description

I'm using the jdk17 image to run my tool on Toolforge as follows:

webservice --cpu 1 --mem 6Gi jdk17 start /data/project/spacemedia/run.sh

The image contains OpenJDK 17.0.3 and I'm facing this bug:
https://bugs.openjdk.org/browse/JDK-8274735
It has been solved in 17.0.4:
https://bugs.openjdk.org/browse/JDK-8285948

Can you please update the jdk17 image to 17.0.4?
Thanks!

Event Timeline

The relevant Debian package has 17.0.4 available, but only in stable-security and stable-proposed-updates. Since the base images don't have either of those repos enabled, I don't think there's an easy way to do this before the next Debian point release, which is likely happening in a couple of weeks (the #debian-devel topic says 2022-09-10).

Could the repos be enabled? It would make sense to enable stable-*, no? (Especially stable-security.)

Debian 11.5 has been released. Is the package available now?

I did a local test build of the jdk17-sssd-web image and it does get openjdk 17.0.4+8-Debian-1deb11u1 now if there are no cached layers.

I think it has been a while since we did a full rebuild of all of our images to pick up bug fix package changes like this. It is probably worth doing soon.

Mentioned in SAL (#wikimedia-cloud) [2022-10-06T00:36:05Z] <bd808> Rebuilding all Toolforge docker images to pick up bug and security fix packages. (T316554)

Mentioned in SAL (#wikimedia-cloud) [2022-10-06T00:39:37Z] <bd808> Image rebuild failing with debian apt repo signature issue. Will investigate tomorrow. (T316554)

Mentioned in SAL (#wikimedia-cloud) [2022-10-12T20:31:31Z] <bd808> Rebuilding all Toolforge docker images to pick up bug and security fix packages after fixing bug in building the bullseye base image. (T316554)

This failed again unexpectedly. I'm trying to understand why.

Mentioned in SAL (#wikimedia-cloud) [2022-10-12T20:43:48Z] <bd808> Rebuilding all Toolforge docker images to pick up bug and security fix packages. Third try seems to be working. (T316554)

Is there a cache invalidation problem in the clunky rebuild_all.sh automation that I wrote long ago? The script builds and pushes the base image for each OS version as a single image build before attempting to build the whole series. That initial build is intended to prime the local cache and to ensure that problems like the apt config bug from T320100: Building of 'bullseye-sssd' image failing on tools-docker-imagebuilder-01 are seen quickly by the operator. I'm not exactly sure why, but this priming does not seem to work as hoped on the first invocation. The build I started in T316554#8312840 succeeded in building and pushing the buster base image, but then failed when building the base image again at the start of the full buster lineage build.

After piddling around with clearing docker's builder and image caches, I started the rebuild_all.sh script a second time. This time it worked, but honestly the cache invalidation theory is a complete guess at why it suddenly started working.
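
For readers unfamiliar with the flow, the "prime the base image, then build the whole lineage" pattern described above might look roughly like the sketch below. This is not the actual rebuild_all.sh. The function names `build_image` and `rebuild_release` and the image names are hypothetical, and `build_image` only prints what it would build instead of invoking a real build.

```shell
#!/bin/sh
# Hypothetical sketch of a "prime, then build everything" loop.
# This is NOT the real rebuild_all.sh; all names here are assumptions.
set -eu

build_image() {
    # Placeholder: only prints the image name.
    # A real script would run its build-and-push command here.
    echo "build $1"
}

rebuild_release() {
    release="$1"
    shift
    # Phase 1: build only the base image so that configuration errors
    # (for example a broken apt setup) surface before anything else runs.
    build_image "${release}-base" || {
        echo "base image build failed for ${release}" >&2
        return 1
    }
    # Phase 2: build the full lineage, which starts with the base image again.
    for image in "${release}-base" "$@"; do
        build_image "$image" || return 1
    done
}

rebuild_release bullseye bullseye-jdk17
```

Note that the base image is built twice in this pattern, which matches the failure described above: the first base build succeeded and the second one, at the start of the lineage build, failed.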

$ become bd808-test
$ webservice jdk17 shell
tools.bd808-test@shell-1665620487:~$ java --version
openjdk 17.0.4 2022-07-19
OpenJDK Runtime Environment (build 17.0.4+8-Debian-1deb11u1)
OpenJDK 64-Bit Server VM (build 17.0.4+8-Debian-1deb11u1, mixed mode, sharing)
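
A tool that depends on the 17.0.4 fix could run a check like the one above automatically at startup. The following is a minimal sketch under assumptions: the helper names `version_at_least` and `parse_java_version` are made up for illustration, the comparison relies on the `-V` (version sort) option of GNU `sort`, and the parser assumes the first line of `java --version` has the form shown in the transcript.

```shell
#!/bin/sh
# Succeeds when version $1 is greater than or equal to version $2.
# Assumes GNU sort, whose -V option sorts version strings naturally.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Reads `java --version` output on stdin and prints the version number
# from the first line, e.g. "openjdk 17.0.4 2022-07-19" gives "17.0.4".
parse_java_version() {
    awk 'NR == 1 { print $2 }'
}

required="17.0.4"
# Only run the check when a java binary is actually on the PATH.
if command -v java >/dev/null 2>&1; then
    installed="$(java --version | parse_java_version)"
    if version_at_least "$installed" "$required"; then
        echo "OK: OpenJDK $installed"
    else
        echo "Too old: OpenJDK $installed, need $required or later"
    fi
fi
```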