
Cancelled Jenkins builds do not run postbuild scripts
Open, Low, Public

Description

Our jobs have a post-build step that cleans up the workspace. When a build is cancelled by Zuul, the post-build scripts do not run, which leaves the workspace on the agent as is, needlessly consuming disk space.

Example builds:

<s> [webpack.Progress] 68% building 900/931 modules 31 active /src/view/lib/wikibase-tainted-ref/node_modules/@storybook/components/node_modules/lodash/_baseIteratee.js
Build was aborted
Aborted by anonymous
Archiving artifacts
[PostBuildScript] - [INFO] Executing post build scripts.
[PostBuildScript] - [INFO] Build does not have any of the results [SUCCESS]. Did not execute build step #0.
[PostBuildScript] - [INFO] Executing post build scripts.
[mwext-node12-rundoc-docker@2] $ /bin/bash -xe /tmp/jenkins18287764908602004871.sh
+ echo 'Clearing /srv/jenkins/workspace/workspace/mwext-node12-rundoc-docker@2/cache'
Clearing /srv/jenkins/workspace/workspace/mwext-node12-rundoc-docker@2/cache
[mwext-node12-rundoc-docker@2] $ /bin/bash /tmp/jenkins7304893211906477564.sh
+ set -o pipefail
++ pwd
+ exec docker run --volume /srv/jenkins/workspace/workspace/mwext-node12-rundoc-docker@2/cache:/cache --security-opt seccomp=unconfined --init --rm --label jenkins.job=mwext-node12-rundoc-docker --label jenkins.build=10357 --env-file /dev/fd/63 docker-registry.wikimedia.org/releng/castor:0.2.4 clear
++ /usr/bin/env
++ egrep -v '^(HOME|SHELL|PATH|LOGNAME|MAIL)='
[PostBuildScript] - [ERROR] An error occured during post-build processing.
org.jenkinsci.plugins.postbuildscript.PostBuildScriptException: java.lang.InterruptedException
	at org.jenkinsci.plugins.postbuildscript.processor.Processor.processBuildSteps(Processor.java:190)
	at org.jenkinsci.plugins.postbuildscript.processor.Processor.processScripts(Processor.java:91)
	at org.jenkinsci.plugins.postbuildscript.processor.Processor.process(Processor.java:79)
	at org.jenkinsci.plugins.postbuildscript.processor.Processor.process(Processor.java:73)
	at org.jenkinsci.plugins.postbuildscript.PostBuildScript.perform(PostBuildScript.java:116)
	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:806)
	at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:755)
	at hudson.model.Build$BuildExecution.post2(Build.java:178)
	at hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:699)
	at hudson.model.Run.execute(Run.java:1913)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
	at hudson.model.ResourceController.execute(ResourceController.java:99)
	at hudson.model.Executor.run(Executor.java:432)
Caused by: java.lang.InterruptedException
	at java.base/java.lang.Object.wait(Native Method)
	at hudson.remoting.Request.call(Request.java:177)
	at hudson.remoting.Channel.call(Channel.java:1000)
	at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:286)
	at com.sun.proxy.$Proxy91.join(Unknown Source)
	at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1199)
	at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:194)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:144)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:91)
	at org.jenkinsci.plugins.postbuildscript.processor.Processor.processBuildSteps(Processor.java:180)
	... 13 more
Build step 'Execute scripts' marked build as failure
[PostBuildScript] - [INFO] Executing post build scripts.
[mwext-node12-rundoc-docker@2] $ /bin/bash -xe /tmp/jenkins7460517696265297304.sh
+ set -euxo pipefail
+ docker ps -q --filter label=jenkins.job=mwext-node12-rundoc-docker --filter label=jenkins.build=10357
+ xargs --no-run-if-empty docker stop
FATAL: Unable to delete script file /tmp/jenkins7460517696265297304.sh
java.lang.InterruptedException
	at java.base/java.lang.Object.wait(Native Method)
	at hudson.remoting.Request.call(Request.java:177)
	at hudson.remoting.Channel.call(Channel.java:1000)
	at hudson.FilePath.act(FilePath.java:1165)
	at hudson.FilePath.act(FilePath.java:1154)
	at hudson.FilePath.delete(FilePath.java:1681)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:162)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:91)
	at org.jenkinsci.plugins.postbuildscript.processor.Processor.processBuildSteps(Processor.java:180)
	at org.jenkinsci.plugins.postbuildscript.processor.Processor.processScripts(Processor.java:91)
	at org.jenkinsci.plugins.postbuildscript.processor.Processor.process(Processor.java:79)
	at org.jenkinsci.plugins.postbuildscript.processor.Processor.process(Processor.java:73)
	at org.jenkinsci.plugins.postbuildscript.PostBuildScript.perform(PostBuildScript.java:116)
	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:806)
	at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:755)
	at hudson.model.Build$BuildExecution.post2(Build.java:178)
	at hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:699)
	at hudson.model.Run.execute(Run.java:1913)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
	at hudson.model.ResourceController.execute(ResourceController.java:99)
	at hudson.model.Executor.run(Executor.java:432)
[PostBuildScript] - [ERROR] An error occured during post-build processing.
org.jenkinsci.plugins.postbuildscript.PostBuildScriptException: java.lang.InterruptedException
	at org.jenkinsci.plugins.postbuildscript.processor.Processor.processBuildSteps(Processor.java:190)
	at org.jenkinsci.plugins.postbuildscript.processor.Processor.processScripts(Processor.java:91)
	at org.jenkinsci.plugins.postbuildscript.processor.Processor.process(Processor.java:79)
	at org.jenkinsci.plugins.postbuildscript.processor.Processor.process(Processor.java:73)
	at org.jenkinsci.plugins.postbuildscript.PostBuildScript.perform(PostBuildScript.java:116)
	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:806)
	at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:755)
	at hudson.model.Build$BuildExecution.post2(Build.java:178)
	at hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:699)
	at hudson.model.Run.execute(Run.java:1913)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
	at hudson.model.ResourceController.execute(ResourceController.java:99)
	at hudson.model.Executor.run(Executor.java:432)
Caused by: java.lang.InterruptedException
	at java.base/java.lang.Object.wait(Native Method)
	at hudson.remoting.Request.call(Request.java:177)
	at hudson.remoting.Channel.call(Channel.java:1000)
	at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:286)
	at com.sun.proxy.$Proxy91.join(Unknown Source)
	at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1199)
	at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:194)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:144)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:91)
	at org.jenkinsci.plugins.postbuildscript.processor.Processor.processBuildSteps(Processor.java:180)
	... 13 more
Build step 'Execute scripts' marked build as failure
[PostBuildScript] - [INFO] Executing post build scripts.
xargs: docker: terminated by signal 15
[PostBuildScript] - [ERROR] An error occured during post-build processing.
org.jenkinsci.plugins.postbuildscript.PostBuildScriptException: java.lang.InterruptedException
	at org.jenkinsci.plugins.postbuildscript.processor.Processor.processBuildSteps(Processor.java:190)
	at org.jenkinsci.plugins.postbuildscript.processor.Processor.processScripts(Processor.java:91)
	at org.jenkinsci.plugins.postbuildscript.processor.Processor.process(Processor.java:79)
	at org.jenkinsci.plugins.postbuildscript.processor.Processor.process(Processor.java:73)
	at org.jenkinsci.plugins.postbuildscript.PostBuildScript.perform(PostBuildScript.java:116)
	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:806)
	at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:755)
	at hudson.model.Build$BuildExecution.post2(Build.java:178)
	at hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:699)
	at hudson.model.Run.execute(Run.java:1913)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
	at hudson.model.ResourceController.execute(ResourceController.java:99)
	at hudson.model.Executor.run(Executor.java:432)
Caused by: java.lang.InterruptedException
	at java.base/java.lang.Object.wait(Native Method)
	at hudson.remoting.Request.call(Request.java:177)
	at hudson.remoting.Channel.call(Channel.java:1000)
	at hudson.FilePath.act(FilePath.java:1165)
	at hudson.FilePath.act(FilePath.java:1154)
	at hudson.FilePath.createTextTempFile(FilePath.java:1580)
	at hudson.tasks.CommandInterpreter.createScriptFile(CommandInterpreter.java:201)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:119)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:91)
	at org.jenkinsci.plugins.postbuildscript.processor.Processor.processBuildSteps(Processor.java:180)
	... 13 more
Build step 'Execute scripts' marked build as failure
<s> [webpack.Progress] 67% building 900/932 modules 32 active /src/view/lib/wikibase-tainted-ref/node_modules/@storybook/components/node_modules/lodash/_assocIndexOf.js
<s> [webpack.Progress] 68% building 901/932 modules 31 active /src/view/lib/wikibase-tainted-ref/node_modules/@storybook/components/node_modules/lodash/_assocIndexOf.js
<s> [webpack.Progress] 68% building 902/932 modules 30 active /src/view/lib/wikibase-tainted-ref/node_modules/@storybook/components/node_modules/lodash/_assocIndexOf.js
Finished: ABORTED

Event Timeline

My fringe theory of how it happens is:

Zuul cancels a build

Jenkins sends SIGTERM to the process group, which aborts the running command (more or less, because Docker does not necessarily propagate the signal, iirc, and the process inside the container might well swallow the signal; bash typically does, iirc).

Jenkins runs the postbuild script such as:

docker ps -q --filter label=jenkins.job=mwext-node12-rundoc-docker --filter label=jenkins.build=10357 | xargs --no-run-if-empty docker stop

Docker got a SIGTERM as indicated by:

xargs: docker: terminated by signal 15

So I think the issue is how Jenkins aborts the process by killing the whole process group. It does not take into account that the Post Build Script plugin is spawning processes; alternatively, the plugin should start only after Jenkins has finished killing all processes.

TLDR: that is an issue to be reported upstream

I have created a dummy job https://integration.wikimedia.org/ci/job/T2999995%20-%20cancel%20and%20postbuild/ which runs:

set +x

trap "echo '[BUILD] got SIGTERM'" SIGTERM
trap "echo '[BUILD] reached EXIT'" EXIT

echo "Starting fake build. Please manually cancel the build."
for i in $(seq 1 100); do
    printf "[BUILD] step #%s\n" "$i"
    sleep 1
done;

And as a postbuild script:

set +x

trap "echo '>post< got SIGTERM'" SIGTERM
trap "echo '>post< script reached EXIT'" EXIT

for i in $(seq 1 100); do
    printf ">post< step #%s\n" "$i"
    sleep 0.2
done;

And that works as expected when canceled via the web ui:

00:00:00.000 Started by user Hashar
00:00:00.000 Running as SYSTEM
00:00:00.002 Building remotely on integration-agent-docker-1012 (pipelinelib Docker blubber) in workspace /srv/jenkins/workspace/workspace/T2999995 - cancel and postbuild
00:00:00.039 [T2999995 - cancel and postbuild] $ /bin/bash -xe /tmp/jenkins9087666617933396520.sh
00:00:00.113 + set +x
00:00:00.113 Starting fake build. Please manually cancel the build.
00:00:00.141 [BUILD] step #1
00:00:01.119 [BUILD] step #2
00:00:02.121 [BUILD] step #3
00:00:03.124 [BUILD] step #4
00:00:04.127 [BUILD] step #5
00:00:05.129 [BUILD] step #6
00:00:05.209 Terminated
00:00:05.209 [BUILD] reached EXIT
00:00:05.232 Build was aborted
00:00:05.232 Aborted by Hashar
00:00:05.232 [PostBuildScript] - [INFO] Executing post build scripts.
00:00:05.303 [T2999995 - cancel and postbuild] $ /bin/bash -xe /tmp/jenkins3179273151456297835.sh
00:00:05.308 + set +x
00:00:05.309 >post< step #1
00:00:05.510 >post< step #2
00:00:05.713 >post< step #3
00:00:05.916 >post< step #4
00:00:06.118 >post< step #5
00:00:06.321 >post< step #6
00:00:06.523 >post< step #7

The issue is more complicated in https://integration.wikimedia.org/ci/job/mwext-node12-rundoc-docker/10357/

00:03:47.040 <s> [webpack.Progress] 68% building 900/931 modules 31 active /src/view/lib/wikibase-tainted-ref/node_modules/@storybook/components/node_modules/lodash/_baseIteratee.js
00:03:47.041 Build was aborted
00:03:50.771 Aborted by anonymous
00:03:50.771 Archiving artifacts

There is a 3.730 second delay between "Build was aborted" and "Aborted by anonymous".
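To spot such gaps without doing the subtraction by hand, here is a minimal sketch that prints the delay between consecutive timestamped console lines. The log_gaps helper is hypothetical (it is not part of Jenkins or of our tooling) and assumes each line starts with an HH:MM:SS or HH:MM:SS.mmm timestamp; other lines are skipped.

```shell
#!/bin/bash
# Hypothetical helper: print the delay (in seconds) between consecutive
# timestamped console lines read from stdin.
log_gaps() {
    awk '
    $1 ~ /^[0-9]+:[0-9]+:[0-9.]+$/ {
        split($1, t, ":")                      # t[1]=HH, t[2]=MM, t[3]=SS(.mmm)
        now = t[1] * 3600 + t[2] * 60 + t[3]
        if (seen) printf "%.3f\n", now - prev  # gap to the previous timestamp
        prev = now
        seen = 1
    }'
}

# The two lines from build 10357 quoted above.
gap=$(printf '%s\n' \
    '00:03:47.041 Build was aborted' \
    '00:03:50.771 Aborted by anonymous' | log_gaps)
echo "delay: ${gap}s"
```

Piping a saved console log through log_gaps and sorting the output numerically would show where a build spent its time while being aborted.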

And at the very end of the build we can see the build step still emitted output:

00:03:52.132 <s> [webpack.Progress] 67% building 900/932 modules 32 active /src/view/lib/wikibase-tainted-ref/node_modules/@storybook/components/node_modules/lodash/_assocIndexOf.js
00:03:52.293 <s> [webpack.Progress] 68% building 901/932 modules 31 active /src/view/lib/wikibase-tainted-ref/node_modules/@storybook/components/node_modules/lodash/_assocIndexOf.js

The build step is an exec docker run:

#!/bin/bash
set -eux
set -o pipefail
exec docker run --shm-size 1g --volume "$(pwd)"/src:/src --volume "$(pwd)"/cache:/cache --volume "$(pwd)"/log:/log \
  --security-opt seccomp=unconfined \
  --init \
  --rm \
  --label "jenkins.job=$JOB_NAME" \
  --label "jenkins.build=$BUILD_NUMBER" \
  --env-file <(/usr/bin/env|egrep -v '^(HOME|SHELL|PATH|LOGNAME|MAIL)=') \
  'docker-registry.wikimedia.org/releng/node12-test-browser:0.0.3-s2' doc
# nothing else can be executed due to exec

The image has:

ENTRYPOINT ["/run.sh"]

Which thus executes the shell script that ultimately has:

npm run-script "${@:-test}"

What I suspect is that Jenkins sends a SIGTERM to docker run, which hopefully relays the signal to the Docker daemon, which then sends the SIGTERM to bash. Bash would happily ignore SIGTERM. Thus Jenkins still sees the docker run process running and decides to kill the whole process group, but by the time it does so, the post-build script has already kicked in and gets killed as well.
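If bash in the container is indeed what holds up the SIGTERM, one common pattern is to run the real command in the background and wait on it, since bash only runs a trap once the current foreground command has returned. The sketch below is an assumption about how an entrypoint could do that; it is not what the image's /run.sh does, and the names run_with_term_relay, relay_term, CHILD_PID and GOT_TERM are made up for illustration.

```shell
#!/bin/bash
# Hypothetical entrypoint helper: run a command as a background child,
# relay SIGTERM to it, and return the child's exit status.
CHILD_PID=""
GOT_TERM=0

relay_term() {
    GOT_TERM=1
    echo "[entrypoint] got SIGTERM, relaying to pid $CHILD_PID"
    kill -TERM "$CHILD_PID" 2>/dev/null
}

run_with_term_relay() {
    local status
    trap relay_term TERM
    "$@" &                      # background, so the shell stays free to run traps
    CHILD_PID=$!
    wait "$CHILD_PID"
    status=$?
    if [ "$GOT_TERM" -eq 1 ]; then
        # A trapped signal makes `wait` return early; wait again to collect
        # the child's real exit status.
        wait "$CHILD_PID"
        status=$?
    fi
    trap - TERM
    return "$status"
}
```

In an entrypoint this would be called as run_with_term_relay npm run-script "${@:-test}" in place of the bare npm invocation. Whether that changes anything for the Jenkins abort would still need testing.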

On the Jenkins side, the kill logic lives in core/src/main/java/hudson/util/ProcessTree.java, which has some nice bits such as:

private final long softKillWaitSeconds = Integer.getInteger("SoftKillWaitSeconds", 5);

        /**
         * Tries to kill this process.
         */
        @Override
        public void kill() throws InterruptedException {
            // after sending SIGTERM, wait for the process to cease to exist
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(softKillWaitSeconds);
            kill(deadline);
        }

        private void kill(long deadline) throws InterruptedException {
            if (getVeto() != null) 
                return;
            try {
                int pid = getPid();
                LOGGER.fine("Killing pid="+pid);
                UnixReflection.destroy(pid);
                // after sending SIGTERM, wait for the process to cease to exist
                int sleepTime = 10; // initially we sleep briefly, then sleep up to 1sec
                File status = getFile("status");
                do {
                    if (!status.exists()) {
                        break; // status is gone, process therefore as well
                    }

                    Thread.sleep(sleepTime);
                    sleepTime = Math.min(sleepTime * 2, 1000);
                } while (System.nanoTime() < deadline);
            } catch (IllegalAccessException e) {
                // this is impossible
                IllegalAccessError x = new IllegalAccessError();
                x.initCause(e);
                throw x;
            } catch (InvocationTargetException e) {
                // tunnel serious errors
                if(e.getTargetException() instanceof Error)
                    throw (Error)e.getTargetException();
                // otherwise log and let go. I need to see when this happens
                LOGGER.log(Level.INFO, "Failed to terminate pid="+getPid(),e);
            }
            killByKiller();
        }

There are a few more methods which apparently all involve that softKillWaitSeconds of 5 seconds.
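For readers who prefer shell, the same escalation (SIGTERM, wait up to a grace period, then SIGKILL) can be sketched roughly as below. This is an illustration, not a port: soft_kill and SOFT_KILL_RESULT are hypothetical names, liveness is probed with kill -0 rather than by checking the /proc status file, and the poll interval is fixed instead of backing off from 10 ms to 1 s.

```shell
#!/bin/bash
# Hypothetical helper mimicking a "soft kill": SIGTERM first, SIGKILL only if
# the process is still there once the grace period has elapsed.
SOFT_KILL_RESULT=""

soft_kill() {
    local pid=$1 grace=${2:-5}             # 5 matches the default quoted above
    local deadline=$(( SECONDS + grace ))

    if ! kill -TERM "$pid" 2>/dev/null; then
        SOFT_KILL_RESULT="gone"            # nothing to signal
        return 0
    fi
    while kill -0 "$pid" 2>/dev/null; do   # kill -0 only probes for existence
        if (( SECONDS >= deadline )); then
            kill -KILL "$pid" 2>/dev/null
            wait "$pid" 2>/dev/null        # reap it if it is our own child
            SOFT_KILL_RESULT="killed"
            return 0
        fi
        sleep 0.1
    done
    SOFT_KILL_RESULT="terminated"
}

# Demo: a process that ignores SIGTERM has to be killed the hard way.
bash -c 'trap "" TERM; exec sleep 30' &
stubborn=$!
sleep 0.2                                  # let it install the trap first
soft_kill "$stubborn" 1
echo "stubborn process: $SOFT_KILL_RESULT"
```

The demo mirrors the suspected situation: a process that does not die on SIGTERM keeps the killer busy for the whole grace period before the hard kill lands.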

A related investigation from 2018 is at T176747. It referred to https://github.com/jenkinsci/jenkins/commit/d8eac92ee9a1c19bf145763589f1c152607bf3ed which introduced the softKillWaitSeconds described above.

And on that task, in T176747#3749436, I said that docker run --tty did not relay signals, which eventually got fixed in Docker 19.03.5 with https://github.com/docker/cli/pull/2177 . Then again, we do not use --tty. There was also a side investigation into bash swallowing SIGTERM and docker run --init tentatively fixing that (by actually signaling the bash entry point and its subprocesses).

For a little repro:

run.sh
#!/bin/bash
echo "Running entrypoint..."

trap "echo '[BUILD] got SIGTERM'" SIGTERM
trap "echo '[BUILD] reached EXIT'" EXIT

echo "Starting fake build. Please manually cancel the build."
for i in $(seq 1 100); do
    printf "[BUILD] step #%s\n" "$i"
    sleep 1
done;
Dockerfile
FROM docker-registry.wikimedia.org/buster:latest
COPY run.sh /run.sh
ENTRYPOINT ["/run.sh"]

I run it with docker run --init --rm t299995, then docker stop:

Running entrypoint...
Starting fake build. Please manually cancel the build.
[BUILD] step #1
[BUILD] step #2
[BUILD] step #3
[BUILD] step #4
[BUILD] step #5
[BUILD] step #6
[BUILD] step #7
[BUILD] step #8
[BUILD] step #9
[BUILD] step #10
[BUILD] got SIGTERM
[BUILD] step #11
[BUILD] step #12
[BUILD] step #13
[BUILD] step #14
[BUILD] step #15
[BUILD] step #16
[BUILD] step #17
[BUILD] step #18
[BUILD] step #19
[BUILD] step #20

The SIGTERM is relayed and the container eventually gets a SIGKILL from docker stop after 10 seconds.

But without the trap "echo '[BUILD] got SIGTERM'" SIGTERM that gives:

Running entrypoint...
Starting fake build. Please manually cancel the build.
[BUILD] step #1
[BUILD] step #2
[BUILD] step #3
[BUILD] step #4
[BUILD] step #5
[BUILD] step #6
[BUILD] step #7
[BUILD] reached EXIT

That is with Docker 20.10.11, so it looks like it is working as expected. Maybe the behavior is different when using a subprocess rather than sleep :-\

Will revisit later. It is a bit of a long rabbit hole to find out what is wrong between Jenkins, the docker client and the images we run.

https://issues.jenkins.io/browse/JENKINS-26994 is from 2015 and looks very similar. A comment by one of the Jenkins maintainers stated at the time that the plugin is a bit legacy and recommended using the Flexible publisher plugin instead. JJB has support for it: https://jenkins-job-builder.readthedocs.io/en/latest/publishers.html?highlight=flexible#publishers.conditional-publisher

There are a few other builds, such as https://integration.wikimedia.org/ci/job/mwgate-node12-docker/73918/console which was a build for mediawiki/extensions/Wikibase:

01:02:37 > echo 'disabled (T297381)' # ZUUL_BRANCH=${ZUUL_BRANCH:-master} lib-version-check
01:02:37 
01:02:37 disabled (T297381)
01:02:37 Build was aborted
01:03:30 Aborted by anonymous

There is a 53 second delay, I believe due to the large Docker container being deleted. After that there is supposedly no process left and the postbuild scripts kick in. They get an InterruptedException after 10 seconds and 15 seconds. I suppose that is the soft kill retry, which is set to five seconds.

01:03:30 Archiving artifacts
01:03:30 [PostBuildScript] - [INFO] Executing post build scripts.
01:03:30 [PostBuildScript] - [INFO] Build does not have any of the results [SUCCESS]. Did not execute build step #0.
01:03:30 [PostBuildScript] - [INFO] Executing post build scripts.
01:03:30 [mwgate-node12-docker] $ /bin/bash -xe /tmp/jenkins10100668009224099225.sh
01:03:30 + echo 'Clearing /srv/jenkins/workspace/workspace/mwgate-node12-docker/cache'
01:03:30 Clearing /srv/jenkins/workspace/workspace/mwgate-node12-docker/cache
01:03:30 [mwgate-node12-docker] $ /bin/bash /tmp/jenkins5978266604011610036.sh
01:03:31 + set -o pipefail
01:03:31 ++ pwd
01:03:31 + exec docker run --volume /srv/jenkins/workspace/workspace/mwgate-node12-docker/cache:/cache --security-opt seccomp=unconfined --init --rm --label jenkins.job=mwgate-node12-docker --label jenkins.build=73918 --env-file /dev/fd/63 docker-registry.wikimedia.org/releng/castor:0.2.4 clear
01:03:31 ++ /usr/bin/env
01:03:31 ++ egrep -v '^(HOME|SHELL|PATH|LOGNAME|MAIL)='
01:03:31 [PostBuildScript] - [ERROR] An error occured during post-build processing.
01:03:40 org.jenkinsci.plugins.postbuildscript.PostBuildScriptException: java.lang.InterruptedException
<STACK TRACE>

01:03:40 Build step 'Execute scripts' marked build as failure
01:03:40 [PostBuildScript] - [INFO] Executing post build scripts.
01:03:40 [mwgate-node12-docker] $ /bin/bash -xe /tmp/jenkins18356270474085839319.sh
01:03:40 + set -euxo pipefail
01:03:40 + docker ps -q --filter label=jenkins.job=mwgate-node12-docker --filter label=jenkins.build=73918
01:03:40 + xargs --no-run-if-empty docker stop
01:03:40 [PostBuildScript] - [INFO] Executing post build scripts.
01:03:40 [mwgate-node12-docker] $ /bin/bash /tmp/jenkins5243175199732981066.sh
01:03:40 + set -o pipefail
01:03:40 + exec docker run --entrypoint=/usr/bin/find --user=root --volume /srv/jenkins/workspace/workspace/mwgate-node12-docker:/workspace --security-opt seccomp=unconfined --init --rm --label jenkins.job=mwgate-node12-docker --label jenkins.build=73918 --env-file /dev/fd/63 docker-registry.wikimedia.org/buster:latest /workspace -mindepth 1 -delete
01:03:40 ++ /usr/bin/env
01:03:40 ++ egrep -v '^(HOME|SHELL|PATH|LOGNAME|MAIL)='
01:03:40 [PostBuildScript] - [ERROR] An error occured during post-build processing.
01:03:45 org.jenkinsci.plugins.postbuildscript.PostBuildScriptException: java.lang.InterruptedException
<STACK TRACE>

01:03:45 Build step 'Execute scripts' marked build as failure
01:03:45 Finished: ABORTED

I give up. I can't really reproduce it by manually cancelling a build. It might be an issue related to the Gearman plugin interrupting, or it is some edge case not taken into account by the PostBuildScript plugin. Maybe we can try migrating to the Flexible plugin and see whether that improves things.

After exploring T282893, I suspect the issue is the Gearman plugin interrupting the Executor thread (rather than cancelling the job itself). The resulting InterruptedException bubbles up and is not handled by the postbuildscript plugin?