
publish-to-prometheus (...) completed. Result was FAILURE
Closed, ResolvedPublic

Description

Daily jobs are failing. Example error message (full terminal output: P78728):

...
Waiting for the completion of publish-to-prometheus
publish-to-prometheus #65 started.
publish-to-prometheus #65 completed. Result was FAILURE
Build step 'Execute scripts' changed build result to FAILURE
Build step 'Execute scripts' marked build as failure
...

Failing jobs. All jobs have been failing since 2025-06-26.

It looks like the publish-to-prometheus job failure is causing the daily jobs to fail.

Event Timeline

zeljkofilipin updated the task description.
zeljkofilipin updated the task description.

Adding @hashar. The logs look like this:

https://integration.wikimedia.org/ci/job/publish-to-prometheus/72/console

01:16:21 + rsync --archive --stats --compress '--rsh=/usr/bin/ssh -a -T -o ConnectTimeout=6 -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no' 'jenkins-deploy@172.16.0.112:/srv/jenkins/workspace/selenium-daily-beta-VisualEditor/log//*.prom' .
01:16:21 Warning: Permanently added '172.16.0.112' (ECDSA) to the list of known hosts.
01:16:22 rsync: [sender] link_stat "/srv/jenkins/workspace/selenium-daily-beta-VisualEditor/log/*.prom" failed: No such file or directory (2)

I think it could be because VisualEditor has not been updated to a new wdio-mediawiki yet, so the .prom file does not exist.

hashar removed a subscriber: Antoine.

Hmm, in jjb/publish-to-prometheus.sh I am supposedly taking into account the case where there is no .prom file:

# Make pattern matching no file to return null instead of itself
shopt -s nullglob

PROM_FILES=(*.prom)
if [ ${#PROM_FILES[@]} -eq 0 ]; then
    echo "Could not find any *.prom file."
fi

But the job breaks earlier, when trying to fetch the files from the instance. rsync has an --ignore-missing-args option, which prevents it from erroring out when there are no files.
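As a rough illustration of that option (not the actual change in integration/config), the sketch below wraps rsync in a hypothetical helper named fetch_optional. The helper name and its arguments are assumptions made for this example; only --archive and --ignore-missing-args are real rsync options, and the latter needs rsync 3.1.0 or newer.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not the real jjb script.
# Copies SOURCE to DEST and treats a missing SOURCE as "nothing to do"
# instead of an error.
fetch_optional() {
    local source="$1"
    local dest="$2"
    # --ignore-missing-args: skip source arguments that do not exist
    # rather than failing with "link_stat ... No such file or directory".
    rsync --archive --ignore-missing-args "$source" "$dest"
}
```

With a remote source such as 'user@host:/some/dir/*.prom' (placeholder values), the wildcard is expanded on the remote side, and an expansion that matches nothing is then skipped instead of failing the transfer.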

Change #1165553 had a related patch set uploaded (by Hashar; author: Hashar):

[integration/config@master] jjb: allow missing source when fetching from instances

https://gerrit.wikimedia.org/r/1165553

Change #1165568 had a related patch set uploaded (by Hashar; author: Hashar):

[integration/config@master] jjb: publish-to-prometheus: skip on lack of .prom files

https://gerrit.wikimedia.org/r/1165568

The next error is that the code should exit early when there is no .prom file; I forgot to add exit 0 there. Fixed by https://gerrit.wikimedia.org/r/c/integration/config/+/1165568/
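A minimal sketch of that early exit, assuming the check looks like the snippet quoted earlier. The function name skip_or_publish, its directory argument, and the final message are hypothetical; the real script's publishing step is not shown in this task. The function body is a subshell, so exit 0 ends only the function here, whereas in a top-level script it would end the whole job step successfully.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper around the check from publish-to-prometheus.sh.
# The parentheses make the body run in a subshell, so "exit 0" and the
# shopt setting do not affect the caller.
skip_or_publish() (
    dir="$1"
    # Make a pattern matching no file expand to nothing instead of itself
    shopt -s nullglob
    PROM_FILES=("$dir"/*.prom)
    if [ "${#PROM_FILES[@]}" -eq 0 ]; then
        echo "Could not find any *.prom file."
        # Nothing to publish: leave successfully instead of falling through
        exit 0
    fi
    # Placeholder for the real publishing step (an assumption)
    echo "Found ${#PROM_FILES[@]} .prom file(s) to publish."
)
```

Called on a directory without .prom files, the function prints the "Could not find" message and returns status 0, so a job built on it would be skipped rather than marked as a failure.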

I have updated the publish-to-prometheus job but not publish-to-doc.

I have rebuilt https://integration.wikimedia.org/ci/job/selenium-daily-beta-AdvancedSearch/ and it passed.

Change #1165553 merged by jenkins-bot:

[integration/config@master] jjb: allow missing source when fetching from instances

https://gerrit.wikimedia.org/r/1165553

Change #1165568 merged by jenkins-bot:

[integration/config@master] jjb: publish-to-prometheus: skip on lack of .prom files

https://gerrit.wikimedia.org/r/1165568