Cannot update CI Jenkins jobs
Closed, Resolved | Public | Bug Report

Description

SRE has decided to block traffic that lacks a user-agent (T400119). As a result, we can't update the CI Jenkins jobs anymore:

./jjb-update
...
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://integration.wikimedia.org/ci/crumbIssuer/api/json

jenkins.JenkinsException: Error in request. Possibly authentication failed [403]: Forbidden
Please set a user-agent and respect our robot policy https://w.wiki/4wJS. See also T400119.

Event Timeline

ssingh subscribed.

We are looking into this, thanks for flagging.

@hashar: As the message indicates, it would be preferable if we can set a user-agent as a start, even for an internal traffic requirement like this. Is that something that can be easily done in this case?

Reading into ./jjb-update and https://gerrit.wikimedia.org/r/plugins/gitiles/integration/config/+/refs/heads/master, it seems like this is being run locally and so will no longer work without a user-agent being set. I erroneously assumed that this was internal traffic. Given that that is not the case, please set a user-agent (see the policy and ask us if more clarification is required) and that should fix the issue.

Change #1182613 had a related patch set uploaded (by CDanis; author: CDanis):

[operations/debs/wmf-laptop@master] tunnelencabulator: add integration.wm.o

https://gerrit.wikimedia.org/r/1182613

The code base is legacy and I am not sure how to monkey patch it (python-jenkins:jenkins/__init__.py). It uses python requests; maybe it recognizes an environment variable of some sort, I haven't checked.

The requests are authenticated using basic auth and an Authorization header; maybe that is sufficient for our system to allow them. Then again, the basic auth might just be used to get a token that is then passed in subsequent requests as another header, so that might not work. Can integration.wikimedia.org/ci be opted out temporarily while I figure it out?

As one option for a workaround, running tunnelencabulator --tunnel-everything after applying the above patchset will allow you to bypass this temporarily.

Change #1182613 merged by CDanis:

[operations/debs/wmf-laptop@master] tunnelencabulator: add integration.wm.o

https://gerrit.wikimedia.org/r/1182613

The code base is legacy and I am not sure how to monkey patch it (python-jenkins:jenkins/__init__.py). It uses python requests; maybe it recognizes an environment variable of some sort, I haven't checked.

Thanks for sharing. I may be mistaken, given that I don't know the full extent of the code, but it seems like extending https://opendev.org/jjb/python-jenkins/src/branch/master/jenkins/__init__.py#L96 and setting:

DEFAULT_HEADERS = {'Content-Type': 'text/xml; charset=utf-8', 'User-Agent': 'jjb-update/1.0'}

should work for all invocations of the requests library?
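
To illustrate the idea of module-level default headers, here is a minimal sketch that uses only the standard library rather than python-jenkins itself. The helper name build_request is hypothetical, and the User-Agent value is the illustrative one from the suggestion above. It only shows that a default User-Agent placed in such a dictionary ends up on every request built through the helper; whether python-jenkins applies DEFAULT_HEADERS to every call has not been verified here.

```python
import urllib.request

# Mirrors the dictionary suggested above; the User-Agent value is illustrative.
DEFAULT_HEADERS = {
    "Content-Type": "text/xml; charset=utf-8",
    "User-Agent": "jjb-update/1.0",
}


def build_request(url, headers=None):
    """Build (but do not send) a request carrying the default headers.

    Hypothetical helper, not part of python-jenkins.
    """
    merged = dict(DEFAULT_HEADERS)
    merged.update(headers or {})  # per-call headers override the defaults
    return urllib.request.Request(url, headers=merged)


req = build_request("https://integration.wikimedia.org/ci/crumbIssuer/api/json")
# urllib normalizes header names with str.capitalize(), hence "User-agent".
print(req.get_header("User-agent"))
```

Nothing is sent over the network in this sketch; it only inspects the request object that would be sent.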

The requests are authenticated using basic auth and an Authorization header; maybe that is sufficient for our system to allow them. Then again, the basic auth might just be used to get a token that is then passed in subsequent requests as another header, so that might not work. Can integration.wikimedia.org/ci be opted out temporarily while I figure it out?

It's a bit of work to carve out exceptions, especially for locally-run things, so let's perhaps try the above and see where it leads us? I am sorry that this prevents you from running it.

I have filed this as Unbreak Now because it prevented me from deploying updates to jobs and I wasn't aware we would start blocking traffic. I did see some allusion to it but thought it was for the wiki traffic.

The easiest would be to opt out integration.wikimedia.org to restore the previous behavior, but that apparently involves hacking in the VCL and certainly adds an exemption.

Sukhbir mentioned to me some alternatives:

  1. tunnelencabulator: I have never heard of it before, but from the name I can imagine its purpose.
  2. running from the production infra: I think I have tried that at some point (running it from the deployment server) but the setup cost was not worth the effort.
  3. setting a user-agent: that is the low-hanging fruit and that is certainly a legitimate ask.

Do note I wasn't aware of the block and that comes as a surprise when I had planned other duties this week (unfortunately involving updating CI jobs).

I don't know yet how to monkey patch python-jenkins or Jenkins Job Builder. I will dig in the code in order to have a user agent set.

Change #1182623 had a related patch set uploaded (by Jforrester; author: Jforrester):

[integration/config@master] [WIP] Fix jenkins-jobs script to set a user agent

https://gerrit.wikimedia.org/r/1182623

Change #1182623 merged by jenkins-bot:

[integration/config@master] Fix jenkins-jobs script to set a user agent

https://gerrit.wikimedia.org/r/1182623

Should now be Resolved.

Thanks for fixing it. Sorry that this caught you by surprise -- I am all ears on behalf of Traffic on what we could have done to better communicate this.

Note for other stuff that you may own/be involved in: anything that is behind the CDN will get rate-limited, unless it is internal traffic or traffic from Wikimedia Enterprise. So if that's the case, please do consider updating the UA so you are not caught by surprise.

The script used to deploy CI jobs uses python-jenkins. We found out it reads the environment variable JENKINS_API_EXTRA_HEADERS which, as the name implies, adds extra headers to the requests. Thus James Forrester went with:

JENKINS_API_EXTRA_HEADERS=User-Agent:wikimedia-jjb-update/1.0

And Timo validated it. There is no need to update any upstream code (though surely python-jenkins should have a default user-agent).
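
As a rough sketch of how such a variable can be prepared and interpreted, the snippet below builds an environment containing the setting above and parses it back into headers. The variable name and the Header:value form come from the comment above; the helper names, the newline separator for multiple headers, and the parsing rules are assumptions for illustration, not the actual python-jenkins implementation.

```python
import os

ENV_VAR = "JENKINS_API_EXTRA_HEADERS"  # name taken from the comment above


def parse_extra_headers(raw):
    """Parse "Name:value" entries into a dict.

    Assumption: entries are separated by newlines; this is a sketch,
    not the code python-jenkins actually runs.
    """
    headers = {}
    for line in raw.split("\n"):
        if ":" not in line:
            continue  # skip blank or malformed entries
        name, value = line.split(":", 1)
        headers[name.strip()] = value.strip()
    return headers


def env_with_user_agent(user_agent, base_env=None):
    """Return a copy of the environment with a User-Agent entry added (hypothetical helper)."""
    env = dict(os.environ if base_env is None else base_env)
    entry = "User-Agent:" + user_agent
    existing = env.get(ENV_VAR, "")
    env[ENV_VAR] = existing + "\n" + entry if existing else entry
    return env


env = env_with_user_agent("wikimedia-jjb-update/1.0", base_env={})
print(parse_extra_headers(env[ENV_VAR]))
```

An environment built this way could then be passed to a child process, for example via subprocess.run(["./jjb-update"], env=env), assuming the script is run from the directory that contains it.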

Sorry that this caught you by surprise -- I am all ears on behalf of Traffic on what we could have done to better communicate this.

I kind of got caught off guard, but there are so many large changes happening all over the infrastructure that it is hard to keep up with all of them, and surely we don't want to over-communicate and overwhelm everyone.

The good points are:

  • the 403 message points to the phab task, saving precious time
  • the fix is easy: just set a user-agent. Where I freaked out is that I felt it would take a sizable number of hours to get it done with the legacy software we use; it turns out it was "just" about setting an env variable :-]
  • @CDanis and @ssingh reached out immediately 🏆

anything that is behind the CDN will get rate-limited, unless it is internal traffic or traffic from Wikimedia Enterprise.

Duly noted, thanks!

There would be a bunch of external requests for https://integration.wikimedia.org/zuul/ and especially https://integration.wikimedia.org/zuul/status.json, but those come from user browsers and would be considered "normal" traffic.

Thanks for the assistance and to @Jdforrester-WMF and @Krinkle for the fix and its validation.

I have sent a patch upstream to have a user-agent added: https://review.opendev.org/c/jjb/python-jenkins/+/958730 which would add python-jenkins/1.8.3 to all requests.

bd808 changed the subtype of this task from "Task" to "Bug Report". Aug 28 2025, 2:59 PM