Page MenuHomePhabricator

Upgrade Jenkins from 1.642.3 to 1.651.2
Closed, ResolvedPublic

Description

A new LTS version 1.651.2 got released and we should upgrade.

Changelog: https://jenkins.io/changelog-stable/

Upgrade process is detailed at: https://wikitech.wikimedia.org/wiki/Jenkins We merely use rerepro to update from http://pkg.jenkins-ci.org/debian-stable/

Details

Related Gerrit Patches:
operations/puppet : productionjenkins: allow unsafe parameters

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptApr 26 2016, 7:43 PM

I went through the changelog , that seems harmless.

Daniel Beck (upstream) wrote:

We will publish new Jenkins releases (mainline and 1.651.2 LTS) on Wednesday May 11. These releases will contain fixes for security issues found in current versions of Jenkins. The security advisory will be issued at the same time to provide further information.

Package is available on apt.wikimedia.org

hashar renamed this task from Upgrade Jenkins from 1.642.3 to 1.651.1 to Upgrade Jenkins from 1.642.3 to 1.651.2.May 12 2016, 2:49 PM
hashar updated the task description. (Show Details)
hashar added a comment.EditedMay 12 2016, 2:53 PM

1.651.1 should be fine.

The security release 1.651.2 make it so jobs to use a whitelisted set of parameters that can be passed to jobs. See upstream security note: https://wiki.jenkins-ci.org/display/JENKINS/Plugins+affected+by+fix+for+SECURITY-170

That surely break at least:

  • the Gearman plugin which recognizes OFFLINE_NODE_WHEN_COMPLETE
  • Zuul passing arbitrary parameters when triggering the German functions. Consisting of ZUUL_* parameters and whatever we define in integration/config.git:zuul/parameter_functions.py

That might have an impact on jobs triggered from a parent one such as castor-save or publish-on-gallium although both have predefined parameters.

TL;DR: the Gearman plugin properly put the slave offline on Jenkins 1.651.2 even if OFFLINE_NODE_WHEN_COMPLETE is stripped from the build parameters.


To verify whether OFFLINE_NODE_WHEN_COMPLETE is still recognized by the Gearman plugin, I have setup a basic install on my Mac.

Install Gearman / Jenkins:

brew install gearman
brew install homebrew/versions/jenkins-lts
gearmand -l stderr --verbose INFO
jenkins-lts

Browse the Jenkins interface http://localhost:8080/ . I have made the master to have 0 executor.

Created a slave with a command executed on master to spawn it:

java -jar /Users/hashar/.jenkins/war/WEB-INF/slave.jar

Created a job basic-job, that is restricted to run on slave. With a single parameter: DEFINED_PARAM. Add a shell build step that simply runs export.

I then use the gearman command like client to run the job. Our basic-job is exposed as the Gearman function build:basic-job and the Jenkins Gearman plugin expect the data blob to be a json encoded dict.

$ echo '{"DEFINED_PARAM":3,"UNDEFINED_PARAM":777,"OFFLINE_NODE_WHEN_COMPLETE":1}'|gearman -h 127.0.0.1 -p 4730 -v -f build:basic-job
Task created: H:aeriale.local:8
{
   "name" : "basic-job",
   "node_name" : "",
   "worker" : "slave_exec-0",
   "manager" : "X.Y.Z.D",
   "node_labels" : [
      "master"
   ],
   "url" : "http://localhost:8080/job/basic-job/7/",
   "number" : 7
}
%1 Complete
{
   "result" : "SUCCESS",
   "manager" : "X.Y.Z.D",
   "worker" : "slave_exec-0",
   "node_name" : "slave",
   "number" : 7,
   "url" : "http://localhost:8080/job/basic-job/7/",
   "name" : "basic-job",
   "node_labels" : [
      "slave"
   ]
}

The job build page in Jenkins shows solely the parameter that is defined in the job i.e: DEFINED_PARAM=3.

And the export command does have:

export DEFINED_PARAM="3"

Neither UNDEFINED_PARAM nor OFFLINE_NODE_WHEN_COMPLETE are exported as environment variable which is the whole point of 1.651.2 security release.


Now. slave slave went offline properly with the usual message which we see on the production ci-* nodes:

Disconnected by anonymous : Offline due to Gearman request

From Jenkins log:

Infos: ---- Worker slave_exec-0 received unique job assignment
Infos: ---- Worker slave_exec-0 executing function
Infos: ---- Worker slave_exec-0 scheduling basic-job build #7 on slave
with UUID C2034BB0-69E3-4DBC-85B2-93C1135C24CA and build params [
(TextParameterValue) DEFINED_PARAM='3',
(TextParameterValue) UNDEFINED_PARAM='777',
(TextParameterValue) OFFLINE_NODE_WHEN_COMPLETE='1'
]

Ie the Gearman plugin receives the pristine parameters and run the task:

submitTask
Infos: Session GearmanJobServerSession:139:GearmanNIOJobServerConnection:localhost/
127.0.0.1:4730 is now handling the task
127.0.0.1:4730 handling a REQ/WORK_DATA event
127.0.0.1:4730 handling a REQ/WORK_STATUS event

Then the Jenkins parameter stripper kicks in:

hudson.model.ParametersAction filter
Skipped parameter `UNDEFINED_PARAM` as it is undefined on `basic-job`.
Skipped parameter `OFFLINE_NODE_WHEN_COMPLETE` as it is undefined on `basic-job`.

With a helpful message:

Set -Dhudson.model.ParametersAction.keepUndefinedParameters=true to allow undefined parameters to be injected as environment variables or -Dhudson.model.ParametersAction.safeParameters=[comma-separated list] to whitelist specific parameter names, even though it represents a security breach

Anyway the job runs just fine and the slave is put offline:

Infos: basic-job #7 main build action completed: SUCCESS
Infos: ---- Worker slave_exec-0 setting node offline.
Infos: ---- hudson.plugins.gearman.ComputerListenerImpl: onTemporarilyOffline computer hudson.slaves.SlaveComputer@5d7cceeb
Infos: ---- Worker slave_exec-0 registering 0 functions

Zuul pass a bunch of parameters and we would need them to be either added to all jobs or whitelisted on the command line.

Meanwhile maybe we can set -Dhudson.model.ParametersAction.keepUndefinedParameters=true and see whether it collects missing parameters.

I have poked OpenStack infrastructure team (maintainers of Zuul/Nodepool/Gearman plugin) on http://lists.openstack.org/pipermail/openstack-infra/2016-May/004284.html

Hello,
Jenkins has released a security updated on Wednesday which causes it to
drop parameters passed to a job unless they are explicitly defined in
the job. The announce is at:
https://wiki.jenkins-ci.org/display/JENKINS/Plugins+affected+by+fix+for+SECURITY-170
That affects Zuul/Nodepooletc
Zuul pass a range of built-in parameters (eg: ZUUL_PROJECT) and can
inject user defined ones via the parameters functions. All of them ends
up being dropped and are no more known to the job.
A good news though is that the Gearman Jenkins plugin still recognizes
"OFFLINE_NODE_WHEN_COMPLETE" (which might itself be a bug/security
issue). So at least the slave is put offline.
I have documented my test extensively on:

https://phabricator.wikimedia.org/T133737#2290669

The easiest (and insecure) fix is to keep the old behaviour by passing
to Jenkins:

-Dhudson.model.ParametersAction.keepUndefinedParameters=true

If one assumes the Gearman requests are safe, the plugin might be able
to dynamically whitelist them so they get passed to the job as env
variables.
Alternatively, one would have to make sure the parameters coming from
Zuul are predefined in the job. It might be quite challenging to align
Zuul code, parameter functions and the JJB definitions.

hashar triaged this task as High priority.

James E. Blair (OpenStack) kind replied on http://lists.openstack.org/pipermail/openstack-infra/2016-May/004285.html

Yes, we assume the parameters passed in via gearman are safe, as they are provided either by zuul directly, or indirectly by custom functions in zuul's configuration managed by the zuul system administrator. So this was a feature in Jenkins on which we relied. I think it makes the most sense for the gearman plugin to be updated to autowhitelist them if that is possible. Is someone interested in working on that?
In the mean time, assuming that your system is entirely driven by Zuul+gearman and you do not have jobs that are triggered by other plugins where this behavior might not be desirable, I think the command line option you mentioned should be safe.
-Jim

On our setup jobs can be triggered manually but it requires the user to be in the wmf/nda/wmde LDAP groups.

We have some browsertests injecting env variable from repositories (via a yaml file in the repos), but they are run on changes that got merged.

So I am going to disable the Jenkins security feature by passing -Dhudson.model.ParametersAction.keepUndefinedParameters=true as documented on https://wiki.jenkins-ci.org/display/SECURITY/Jenkins+Security+Advisory+2016-05-11 and blog post https://jenkins.io/blog/2016/05/11/security-update/

Change 289165 had a related patch set uploaded (by Hashar):
jenkins: allow unsafe parameters

https://gerrit.wikimedia.org/r/289165

Change 289165 merged by Filippo Giunchedi:
jenkins: allow unsafe parameters

https://gerrit.wikimedia.org/r/289165

Mentioned in SAL [2016-05-17T13:32:56Z] <hashar> upgrading Jenkins T133737

hashar closed this task as Resolved.May 17 2016, 1:59 PM
hashar added a subscriber: fgiunchedi.

Jenkins is upgraded and the build parameters are kept properly (thank you @fgiunchedi for the quick merge).

I have filled T135506 to get rid of the workaround which depends on upgrading the Gearman plugin.