
mwext-node20-docs-publish failing post-merge for CodeMirror
Open, Needs Triage | Public | BUG REPORT

Description

This is a repeat of T373937. The core issue is believed to be T295351: the npm cache saved by castor gets corrupted for an unknown reason.

Example failure: https://integration.wikimedia.org/ci/job/mwext-node20-docs-publish/4818/console

06:56:21 npm error enoent Invalid response body while trying to fetch https://registry.npmjs.org/@types%2fresolve: ENOENT: no such file or directory, stat '/cache/_cacache/content-v2/sha512/0d/a3/18620d85d43f6ab93047416a081605d8a28d8cac126129373a76e676d1d9b9340d7078cd9448ea029c311926f5d217e7da20fe0276fab3a817b971de2983'
06:56:21 npm error enoent This is related to npm not being able to find a file.
06:56:21 npm error enoent
06:56:21 npm error A complete log of this run can be found in: /cache/_logs/2025-09-21T10_56_00_783Z-debug-0.log
06:56:22 Build step 'Execute shell' marked build as failure
06:56:22 [PostBuildScript] - [INFO] Executing post build scripts.
06:56:22 [PostBuildScript] - [INFO] Build does not have any of the results [SUCCESS]. Did not execute build step #0.
06:56:22 [PostBuildScript] - [INFO] Executing post build scripts.
06:56:22 [mwext-node20-docs-publish] $ /bin/bash -xe /tmp/jenkins781667365482261487.sh
06:56:22 + echo 'Clearing /srv/jenkins/workspace/mwext-node20-docs-publish/cache'
06:56:22 Clearing /srv/jenkins/workspace/mwext-node20-docs-publish/cache
06:56:22 [mwext-node20-docs-publish] $ /bin/bash /tmp/jenkins11663209324604808410.sh
06:56:22 + set +x
06:56:22 + exec docker run --volume /srv/jenkins/workspace/mwext-node20-docs-publish/cache:/cache --security-opt seccomp=unconfined --init --rm --label jenkins.job=mwext-node20-docs-publish --label jenkins.build=4818 --env-file /dev/fd/63 docker-registry.wikimedia.org/releng/castor:0.4.0 clear
06:56:22 ++ set +x
06:56:26 [PostBuildScript] - [INFO] Executing post build scripts.
06:56:26 [mwext-node20-docs-publish] $ /bin/bash -xe /tmp/jenkins6181861359729857836.sh
06:56:26 + set -euxo pipefail
06:56:26 + docker ps -q --filter label=jenkins.job=mwext-node20-docs-publish --filter label=jenkins.build=4818
06:56:26 + xargs --no-run-if-empty docker stop
06:56:26 [PostBuildScript] - [INFO] Executing post build scripts.
06:56:26 [mwext-node20-docs-publish] $ /bin/bash /tmp/jenkins8055869012480843700.sh
06:56:26 + set +x
06:56:26 + exec docker run --entrypoint=/usr/bin/find --user=root --volume /srv/jenkins/workspace/mwext-node20-docs-publish:/workspace --security-opt seccomp=unconfined --init --rm --label jenkins.job=mwext-node20-docs-publish --label jenkins.build=4818 --env-file /dev/fd/63 docker-registry.wikimedia.org/bookworm:latest /workspace -mindepth 1 -delete
06:56:26 ++ set +x
06:56:27 [mwext-node20-docs-publish] $ /bin/bash -xe /tmp/jenkins4038843988122693634.sh
06:56:27 + echo 'Listing potentially remaining files in workspace for T282893'
06:56:27 Listing potentially remaining files in workspace for T282893
06:56:27 + ls -laF --color=always
06:56:27 total 8
06:56:27 drwxr-xr-x  2 jenkins-deploy wikidev 4096 Sep 21 10:56 ./
06:56:27 drwxrwxr-x 20 jenkins-deploy wikidev 4096 Sep 21 10:53 ../
06:56:27 [mwext-node20-docs-publish] $ /bin/bash -xe /tmp/jenkins14690048591247215757.sh
06:56:27 + set -u
06:56:27 + rmdir /srv/jenkins/workspace/mwext-node20-docs-publish
06:56:27 Finished: FAILURE

As a result, the CodeMirror docs are not being updated. I tried to repeat what was done before, but I am not a member of the integration VPS project.

I'd say for CodeMirror, we need to bust the cache maybe 1-2 times a month. I'm wondering if I could possibly be added to the integration project so that I can clear the cache myself?
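For anyone with access to the cache directory, the ENOENT path in the log follows npm's content-addressed layout (`content-v2/<algorithm>/<first two hex chars>/<next two>/<rest>`), so stored files can be checked against their own names. Below is a minimal Python sketch of that check. The function names `content_path` and `find_corrupt_entries` are hypothetical, and the cache location is an assumption based on the `/cache/_cacache` path in the log. It only catches files whose bytes no longer match their digest; it does not read the cache index, so it cannot list content that is referenced but missing, which is the exact failure shown above. `npm cache verify` is the built-in command for a full check.

```python
import hashlib
from pathlib import Path


def content_path(cacache_dir, hex_digest, algorithm="sha512"):
    """Return the path where content with this hex digest would be stored.

    Mirrors the layout visible in the log:
    content-v2/sha512/0d/a3/18620d85...
    """
    return (
        Path(cacache_dir)
        / "content-v2"
        / algorithm
        / hex_digest[:2]
        / hex_digest[2:4]
        / hex_digest[4:]
    )


def find_corrupt_entries(cacache_dir, algorithm="sha512"):
    """Return content files whose bytes no longer hash to the digest in their path."""
    root = Path(cacache_dir) / "content-v2" / algorithm
    corrupt = []
    for path in sorted(root.glob("*/*/*")):
        if not path.is_file():
            continue
        # The three path components below the algorithm directory
        # concatenate to the full hex digest.
        expected = "".join(path.relative_to(root).parts)
        actual = hashlib.new(algorithm, path.read_bytes()).hexdigest()
        if actual != expected:
            corrupt.append(path)
    return corrupt


if __name__ == "__main__":
    # Assumed location, taken from the log; adjust for the host being checked.
    for bad in find_corrupt_entries("/cache/_cacache"):
        print(f"digest mismatch: {bad}")
```

To test whether one specific blob from an error message is present, `content_path("/cache/_cacache", digest).exists()` with the hex digest taken from the log should be enough.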

Event Timeline

Mentioned in SAL (#wikimedia-releng) [2025-09-22T10:10:09Z] <James_F> sudo -u jenkins-deploy rm -rf /srv/castor/castor-mw-ext-and-skins/master/mwext-node20-docs-publish/ # T405166

Should now be fixed.

I'd say for CodeMirror, we need to bust the cache maybe 1-2 times a month

That seems extremely high; this is possibly made worse by the recent work to remove CI recursal, perhaps?

I'm not sure, but CodeMirror makes heavy use of NPM modules. Occasionally we need to bump packages, and I suspect that contributes to why this issue happens to CodeMirror so often.