Have PipelineBot link directly to build console output on test
Open, Needs Triage, Public

Description

As seen in https://gerrit.wikimedia.org/r/c/research/mwaddlink/+/650466/2#message-e26f791716aef9ab1ff210b1a3b07ed69efa65bd and requested by @Tgr, could we reduce the number of clicks needed to get to the deployment pipeline console output?

For example here https://gerrit.wikimedia.org/r/c/research/mwaddlink/+/650466/2#message-b306937be13e2656c85f53f37a57788ac09e1a11 we have "Main test build failed." and a link is provided to https://integration.wikimedia.org/ci/job/trigger-research-mwaddlink-pipeline-test/190/console

If I click on that I see:

11:08:06 Started by user unknown or anonymous
11:08:06 Running as SYSTEM
11:08:06 Building remotely on contint2001 (dockerPublish pipelinelib blubber productionAgents chartPromote train) in workspace /srv/jenkins-slave/workspace/trigger-research-mwaddlink-pipeline-test@3
11:08:07 Waiting for the completion of research-mwaddlink-pipeline-test
11:08:07 research-mwaddlink-pipeline-test #190 started.
11:09:24 research-mwaddlink-pipeline-test #190 completed. Result was FAILURE
11:09:24 Build step 'Trigger/call builds on other projects' marked build as failure
11:09:24 Finished: FAILURE

I then need to click on "research-mwaddlink-pipeline-test #190" which takes me to the build page https://integration.wikimedia.org/ci/job/research-mwaddlink-pipeline-test/190/

From there I can click on "Console Output" (https://integration.wikimedia.org/ci/job/research-mwaddlink-pipeline-test/190/console) and there I can see the actual build error.
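The two console URLs above differ only in the trigger- prefix on the job name, so the direct link could in principle be derived from the one PipelineBot already posts. A minimal sketch of that rewrite follows. It assumes the wrapper job is always named trigger- plus the downstream job name, and that both builds share the same number (true for #190 here, not guaranteed in general). The class and method names are hypothetical and not part of PipelineBot.

```java
// Sketch only: rewrites a trigger-* job URL into the downstream job's console URL.
// Assumption: wrapper job name == "trigger-" + downstream job name, same build number.
class ConsoleLink {
    static String downstreamConsoleUrl(String triggerUrl) {
        String marker = "/job/trigger-";
        int at = triggerUrl.indexOf(marker);
        if (at < 0) {
            throw new IllegalArgumentException("not a trigger-* job URL: " + triggerUrl);
        }
        // Drop the "trigger-" prefix from the job name, keep everything else.
        String url = triggerUrl.substring(0, at) + "/job/"
                + triggerUrl.substring(at + marker.length());
        // Make sure the result points at the console page.
        if (url.endsWith("/console")) {
            return url;
        }
        return url.endsWith("/") ? url + "console" : url + "/console";
    }

    public static void main(String[] args) {
        System.out.println(downstreamConsoleUrl(
            "https://integration.wikimedia.org/ci/job/trigger-research-mwaddlink-pipeline-test/190/console"));
    }
}
```

A more robust approach would read the downstream build number from the trigger build rather than assuming the two match.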

Event Timeline

My understanding is that this is a security measure to restrict how the pipeline jobs can be triggered, and intentionally isn't avoidable. In the future, with GitLab, the whole of CI will be replaced anyway, so…

thcipriani subscribed.

hrm, pipelinebot leaves a comment with a direct link to console output for image publishing success -- would it suffice to have it leave a comment with a link for test-steps as well? Maybe only in the case of test failure?

hashar subscribed.

It is a bit more complicated. The trigger-* jobs are a workaround for a defect in the stack: the Pipeline jobs are not recognized by the Jenkins Gearman plugin, which is the connector between Jenkins and the Zuul CI system. The plugin only recognizes normal/legacy kinds of jobs. Hence I went with trigger-* jobs, which are normal jobs and are thus recognized by the Gearman plugin; their only purpose is to then trigger the Pipeline job.

Last time I dug into it, some organization had forked the Gearman plugin and added support for registering pipeline jobs. They also worked on adding support for Java 11. I eventually want to upgrade our CI Jenkins to Java 11 and will thus have to upgrade the Gearman plugin, which in turn will let us get rid of the trigger-* jobs.

So essentially: yes that should be improved but there is a long tail of things to fix up first :)

I have managed to build a fork of the Gearman plugin and it seems to properly register and drive Pipeline based jobs :] T271683#6735980

It fails on Java 11 but that was already the case previously.

thcipriani renamed this task from Simplify access to deployment pipeline logs in Jenkins to Have PipelineBot link directly to build console output on test. Jan 26 2021, 5:13 PM

Change 666352 had a related patch set uploaded (by Hashar; owner: Hashar):
[integration/config@master] Test direct triggering of a Pipeline job

https://gerrit.wikimedia.org/r/666352

Change 666353 had a related patch set uploaded (by Hashar; owner: Hashar):
[integration/config@master] Remove trigger- jobs

https://gerrit.wikimedia.org/r/666353

Change 666353 abandoned by Hashar:
[integration/config@master] Remove trigger- jobs

Reason:
That can't work because the Pipeline jobs always run on master and then use one-off executors that are created dynamically. So it is much more complicated. I will write a report to the task eventually and most probably decline it.

https://gerrit.wikimedia.org/r/666353

Change 666352 abandoned by Hashar:
[integration/config@master] Test direct triggering of a Pipeline job

Reason:
That can't work because the Pipeline jobs always run on master and then use one-off executors that are created dynamically. So it is much more complicated. I will write a report to the task eventually and most probably decline it.

https://gerrit.wikimedia.org/r/666352

Thanks for investigating this and agreed that declining, especially with the GitLab migration coming up, makes sense.

In the meantime T209149: Have linters/tests results show up as comments in files on gerrit is an alternative that would help with most of the underlying problem (why are the linters failing and where exactly should it be fixed), and would be useful for the eventual GitLab migration as well.

The Pipeline jobs are instances of org.jenkinsci.plugins.workflow.job.WorkflowJob. When they get triggered by Jenkins itself (via the trigger-* jobs), Jenkins executes them on the master node on a flyweight executor, which is created dynamically even if the master node has no executors.

The pipeline jobs are definitely recognized by the Gearman plugin. But their assigned label is master and we have no executors on the master, so the jobs are never registered. We could add executors to it, which would register them as Gearman workers and expose the pipeline jobs to Zuul. But we would need as many executors on the master as the number of pipeline jobs we want to be able to run concurrently, and that seems very, very hacky. The actual tasks would still be executed on the actual nodes matching a label, though.
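To make the registration problem concrete, here is a toy model of it. It is not the Gearman plugin's code: the Node and Job classes, the "pipelinelib" label for the trigger job, the executor count of 4, and the "build:" function naming are all assumptions made for illustration. The only behaviour it models is the one described above: a job is advertised only by nodes that match its label and have at least one regular executor.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model (hypothetical, not plugin code) of label-based worker registration.
class GearmanRegistrationSketch {
    static final class Node {
        final String label;
        final int executors;
        Node(String label, int executors) {
            this.label = label;
            this.executors = executors;
        }
    }

    static final class Job {
        final String name;
        final String assignedLabel;
        Job(String name, String assignedLabel) {
            this.name = name;
            this.assignedLabel = assignedLabel;
        }
    }

    // A node with zero regular executors spawns no worker, so nothing on it is advertised.
    static List<String> registeredFunctions(List<Node> nodes, List<Job> jobs) {
        List<String> functions = new ArrayList<>();
        for (Node node : nodes) {
            if (node.executors == 0) {
                continue;
            }
            for (Job job : jobs) {
                if (job.assignedLabel.equals(node.label)) {
                    functions.add("build:" + job.name);
                }
            }
        }
        return functions;
    }

    public static void main(String[] args) {
        List<Node> nodes = List.of(
            new Node("master", 0),        // master has no executors
            new Node("pipelinelib", 4));  // hypothetical agent label and executor count
        List<Job> jobs = List.of(
            new Job("research-mwaddlink-pipeline-test", "master"),
            new Job("trigger-research-mwaddlink-pipeline-test", "pipelinelib"));
        // Only the trigger-* job ends up advertised to Zuul.
        System.out.println(registeredFunctions(nodes, jobs));
    }
}
```

In this model, giving the master a non-zero executor count makes the pipeline job appear in the list, which corresponds to the hacky workaround mentioned above.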

The Pipeline jobs are always assigned to the master:

jenkinsci/workflow-job-plugin/src/main/java/org/jenkinsci/plugins/workflow/job/WorkflowJob.java
@Override public Label getAssignedLabel() {
    Jenkins j = Jenkins.getInstanceOrNull();
    if (j == null) {
        return null;
    }
    return j.getSelfLabel();
}

jenkins/core/src/main/java/jenkins/model/Jenkins.java
@Override
public LabelAtom getSelfLabel() {
    return getLabelAtom("master");
}

That is as far as my investigation has reached :]

Besides that, if we had a ton of executors on the master, the pipeline jobs would be registered as Gearman functions. But that sounds complicated and merely a workaround :-\