
Converge / update image version across shellbox service instances (cleanup)
Closed, Resolved (Public)

Description

Currently, all shellbox service instances are running 2025-01-07-141744 with the exception of shellbox-media, which is running the newer 2025-03-04-121606 image version. That was done in order to pick up a new PHP 8.1 base image that appropriately sets display_startup_errors (see T377038#10598675).

Between these versions, there are a couple of low-risk dependency updates and one notable code change: https://gerrit.wikimedia.org/r/1117968, which switches the call action (i.e., the one relevant to shellbox-constraints) from call_user_func_array to a dynamic function call.

In addition, the newer PHP 8.1 base image also picks up the upgrade from PCRE2 10.26 to 10.42 (T386006), which is again primarily relevant to shellbox-constraints. While the shellbox server itself does use PCRE functions, these are fairly straightforward use cases that are unlikely to change behavior across PCRE2 versions, in contrast to the arbitrary user-supplied functions in the constraints case.

In any case, we should converge shellbox service instances toward the newer image version, potentially with some additional diligence for shellbox-constraints.

@Lucas_Werkmeister_WMDE - Would you happen to have any concerns / thoughts on the shellbox-constraints aspect based on the above?

Event Timeline

I think it’s okay to go ahead with this 👍

Change #1127188 had a related patch set uploaded (by Scott French; author: Scott French):

[operations/deployment-charts@master] shellbox: align image version to 2025-03-04-121606

https://gerrit.wikimedia.org/r/1127188

[17:58] <bd808> I have a new shellbox version to push out for T364249 (unblocked by dropping PHP 7.4 tests earlier today). It has been like a year and a half since I did a shellbox deployment. Is it still considered best practice to update all of the deployments after bumping the image version in helmfile.d/services/shellbox/global.yaml?

I was going to update everything per https://wikitech.wikimedia.org/wiki/Shellbox#Deploying_a_new_version to the new 2025-06-05-215815 containers to deploy T364249: New upstream release for Pygments (2.18.0), but then I ran across this task via a conflict that Gerrit listed for my version bump at https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1154132. That Pygments upgrade has been delayed for over a year, so a few more days won't hurt anything.

@bd808 - Thanks for flagging! Indeed, this fell by the wayside while dealing with other aspects of the PHP migration, and for lack of any urgent changes that needed to be deployed.

This should not block your work, so please feel free to go ahead. If you'd prefer to only update syntaxhighlight so you can focus specifically on validation of that service alone, that's entirely acceptable (probably good, even!) and I'll follow up to catch up the other service instances to the same release.

That would involve adding a version: "2025-06-05-215815" to the shellbox map in syntaxhighlight's values file.
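
As a rough sketch of that override: only the `version` key under the `shellbox` map is described here, so the exact file name and any neighbouring keys are assumptions rather than the chart's actual layout.

```yaml
# Hypothetical excerpt from syntaxhighlight's values file.
# Only `shellbox.version` comes from this thread; the rest of the file is not shown.
shellbox:
  # Pins this one service instance to the new image; presumably this takes
  # precedence over the shared default in helmfile.d/services/shellbox/global.yaml.
  version: "2025-06-05-215815"
```

Pinning the version per service like this would let one instance be validated on the new image while the others stay on the shared default, and the override can be dropped again once the default catches up.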

Scanning through the changes merged since the summary in the task description was collected, the only notable one I see was the bump to wikimedia/wikipeg 5.0.0. Production appears to be running 4.0.0 as of the 2025-01-07-141744 image. As long as that doesn't carry any notable risk, beyond what could sneak through the PEG parser tests for ShellParser, that seems fine?

Change #1154840 had a related patch set uploaded (by Scott French; author: Scott French):

[operations/deployment-charts@master] shellbox-video: upgrade image to 2025-06-05-215815

https://gerrit.wikimedia.org/r/1154840

Change #1154840 merged by jenkins-bot:

[operations/deployment-charts@master] shellbox-video: upgrade image to 2025-06-05-215815

https://gerrit.wikimedia.org/r/1154840

The 2025-06-05-215815 image is live in shellbox-video as of ~ 17:10 UTC. No issues observed so far. I'll wait for https://gerrit.wikimedia.org/r/1154132 to be deployed (and soak for a bit) before moving ahead with the remaining shellbox instances.

Change #1127188 merged by jenkins-bot:

[operations/deployment-charts@master] shellbox: align image version to 2025-06-05-215815

https://gerrit.wikimedia.org/r/1127188

The remaining shellbox instances have been updated everywhere as of 18:52 UTC today. Looking at general service health and some of the use-case-specific logstash queries from T377038, all looks well.