CI for Wikidata Query Service UI broken (line 10: syntax error near unexpected token `(')
Closed, Resolved (Public)

Description

CI for wikidata/query/gui seems to be broken:

https://integration.wikimedia.org/ci/job/generic-node12-browser-webdriver-docker/4/console

[generic-node12-browser-webdriver-docker] $ /bin/bash /tmp/jenkins16351039860447317896.sh
+ set -o pipefail
/tmp/jenkins16351039860447317896.sh: line 10: syntax error near unexpected token `('
Build step 'Execute shell' marked build as failure

I think the last successful build on the repository was https://integration.wikimedia.org/ci/job/generic-node12-browser-webdriver-docker/1/console, on June 15th, for this change.

Event Timeline

Just above the failed step in the console log, the command chmod 2777 src is executed. There’s only one codesearch match for that, in jjb/macro-docker.yaml. The failing script sets the pipefail option (the last command logged before the error), and that file also has a match for pipefail:

# Run a Docker container using `docker run`. This builder should be used
# wherever possible to ensure the proper common arguments that allow for
# correct signal passing and cleanup are passed.
- builder:
    name: docker-run
    builders:
     - shell: |
        #!/bin/bash
        set -eux
        set -o pipefail
        exec docker run {obj:options|} \
          --security-opt seccomp=unconfined \
          --init \
          --rm \
          --label "jenkins.job=$JOB_NAME" \
          --label "jenkins.build=$BUILD_NUMBER" \
          --env-file <(/usr/bin/env|egrep -v '^(HOME|SHELL|PATH|LOGNAME|MAIL)=') \
          '{image}' {obj:args|}
        # nothing else can be executed due to exec

The --env-file <(...) line is line 10 of that script. The script starts with #!/bin/bash, but maybe it’s being run by a non-Bash shell (e.g. Dash)? That could explain the error: process substitution, <(...), is not part of POSIX sh, and Dash does not support it.
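One way to test the "wrong shell" hypothesis is to ask each candidate shell directly whether it accepts process substitution. The following is only a minimal sketch: the helper name supports_procsub is made up for illustration, and it assumes the shells are looked up on PATH of whatever machine actually runs the Jenkins script.

```shell
#!/bin/bash
# Report whether the given shell can parse and run a process substitution.
# "supports_procsub" is a hypothetical helper, not part of integration/config.
supports_procsub() {
  "$1" -c 'cat <(echo ok)' >/dev/null 2>&1
}

for candidate in bash dash; do
  if ! command -v "$candidate" >/dev/null 2>&1; then
    echo "$candidate: not installed"
  elif supports_procsub "$candidate"; then
    echo "$candidate: process substitution works"
  else
    echo "$candidate: process substitution fails"
  fi
done
```

Note that the console log above already shows the script being started as /bin/bash /tmp/jenkins….sh, so a check like this mainly helps to rule the hypothesis out.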

Failed builds we’ve seen so far:

  • #2, on integration-agent-docker-1012
  • #3, on integration-agent-docker-1016
  • #4, on integration-agent-docker-1005

So whatever the problem is, it’s not limited to one agent machine.

If I read the integration config correctly, the generic-node12-browser-webdriver-docker job runs that script using the docker-registry.wikimedia.org/releng/node12-test-browser:0.0.2 image, and the Bash in there supports process substitution just fine as far as I can tell:

$ docker run --entrypoint=/bin/bash docker-registry.wikimedia.org/releng/node12-test-browser:0.0.2 -c "cat <(/usr/bin/env|egrep -v '^(HOME|SHELL|PATH|LOGNAME|MAIL)=')"
NPM_CONFIG_update_notifier=false
HOSTNAME=1e1a70e80bcf
LANGUAGE=en_US:en
CHROMIUM_FLAGS=--no-sandbox
BABEL_CACHE_PATH=/cache/babel-cache.json
XDG_CONFIG_HOME=/tmp/xdg-config-home
PWD=/src
SPAWN_WRAP_SHIM_ROOT=/tmp/xdg-config-home
LANG=en_US.UTF-8
XDG_CACHE_HOME=/cache
SHLVL=0
NPM_CONFIG_cache=/cache
CHROME_BIN=/usr/bin/chromium
FIREFOX_BIN=/usr/local/bin/firefox
LC_ALL=en_US.UTF-8
_=/usr/bin/env

No, wait, docker-registry.wikimedia.org/releng/node12-test-browser:0.0.2 is the image that will be executed by that docker run command inside the script. I don’t know where the script runs. Directly on the Jenkins host?

From the logs, I would interpret the beginning of the prompt [generic-node12-browser-webdriver-docker] $ [...] as meaning that we are inside of some node12 container?

I think this just means that Jenkins is currently running the generic-node12-browser-webdriver-docker job (configured as {name}-node12-browser-webdriver-docker in jjb/job-templates.yaml), which uses the docker-run-with-log-cache-src builder, and that ultimately comes down to the docker-run builder we’ve seen in T286058#7194166. This builder is supposed to run a node12 container, but I don’t think the builder itself runs in that container.

Thanks, that is a plausible explanation.

Misleading prompt is misleading.

Change 703212 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[integration/config@master] Fix stray quote in generic-node12-browser-webdriver-docker

https://gerrit.wikimedia.org/r/703212

I got very lucky and noticed a leftover ' in the options while looking through the git log. I think that would make the command expand to:

exec docker run --entrypoint=/run-with-xvfb.sh --shm-size 1g' --env LOG_DIR=/log \
  --security-opt seccomp=unconfined \
  --init \
  --rm \
  --label "jenkins.job=$JOB_NAME" \
  --label "jenkins.build=$BUILD_NUMBER" \
  --env-file <(/usr/bin/env|egrep -v '^(HOME|SHELL|PATH|LOGNAME|MAIL)=') \
  'docker-registry.wikimedia.org/releng/node12-test-browser:0.0.2'

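which, if you look at the syntax highlighting, means that the unexpected ( is actually that of ^(HOME|…, not that of <(…. The stray ' after 1g opens a quoted string that only ends at the ' before ^(HOME, so that ( ends up unquoted: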

$ echo 'foo'^(HOME
-bash: syntax error near unexpected token `('
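The same effect can be reproduced on a whole script without running docker, since bash -n only parses a file. This is a sketch with a shortened, hypothetical version of the expanded command (the image name example-image and the temporary file names are placeholders, not the real job configuration):

```shell
#!/bin/bash
# Parse-only reproduction of the stray-quote error; nothing is executed.
workdir=$(mktemp -d)

# Quoted heredoc delimiter: the script text below is written out literally.
cat > "$workdir/broken.sh" <<'EOF'
#!/bin/bash
exec docker run --shm-size 1g' --env LOG_DIR=/log \
  --env-file <(/usr/bin/env|egrep -v '^(HOME|SHELL|PATH|LOGNAME|MAIL)=') \
  'example-image'
EOF

# Same script with the stray quote after "1g" removed.
sed "s/1g'/1g/" "$workdir/broken.sh" > "$workdir/fixed.sh"

# LC_ALL=C keeps the error message in English regardless of locale.
broken_status=0
broken_error=$(LC_ALL=C bash -n "$workdir/broken.sh" 2>&1) || broken_status=$?
fixed_status=0
LC_ALL=C bash -n "$workdir/fixed.sh" || fixed_status=$?

echo "broken: exit $broken_status: $broken_error"
echo "fixed:  exit $fixed_status"
rm -r "$workdir"
```

The broken variant should fail to parse with the same "unexpected token" complaint, while the fixed variant parses cleanly, process substitution included.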

Change 703212 merged by jenkins-bot:

[integration/config@master] Fix stray quote in generic-node12-browser-webdriver-docker

https://gerrit.wikimedia.org/r/703212
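A stray quote like the one fixed in this change could in principle be caught before it reaches a generated script. The sketch below is a rough heuristic only, not something integration/config is known to do: it flags an options string containing an odd number of single quotes. The helper name has_balanced_single_quotes is hypothetical, and the sample strings are taken from the expanded command quoted above.

```shell
#!/bin/bash
# Heuristic: an odd number of single quotes suggests an unterminated quote.
# "has_balanced_single_quotes" is a hypothetical helper for illustration.
has_balanced_single_quotes() {
  local count
  # Keep only the single-quote characters, then count them.
  count=$(printf '%s' "$1" | tr -cd "'" | wc -c)
  (( count % 2 == 0 ))
}

broken_options="--entrypoint=/run-with-xvfb.sh --shm-size 1g' --env LOG_DIR=/log"
fixed_options="--entrypoint=/run-with-xvfb.sh --shm-size 1g --env LOG_DIR=/log"

for options in "$broken_options" "$fixed_options"; do
  if has_balanced_single_quotes "$options"; then
    echo "looks balanced: $options"
  else
    echo "odd number of single quotes: $options"
  fi
done
```

An apostrophe inside a double-quoted argument would trigger a false positive, so a parse check of the fully expanded script is the more reliable test.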

Change 703217 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[wikidata/query/gui@master] DNM: empty change to test CI

https://gerrit.wikimedia.org/r/703217

Lucas_Werkmeister_WMDE claimed this task.
Lucas_Werkmeister_WMDE moved this task from Backlog to Done on the Wikidata Query UI board.

CI passed on that change, so I think this is resolved.

Change 703217 abandoned by Lucas Werkmeister (WMDE):

[wikidata/query/gui@master] DNM: empty change to test CI

Reason:

https://gerrit.wikimedia.org/r/703217