Jenkins Web UI error - Backend fetch failed
Open, Normal, Public

Description

Viewing https://integration.wikimedia.org/ci/view/Default/job/beta-scap-eqiad/149900/console while it was in progress, I got this:
Request from Redacted IP via cp1051 cp1051, Varnish XID 136209466
Error: 503, Backend fetch failed at Fri, 07 Apr 2017 23:15:31 GMT

On contint1001:

[proxy_http:error] [pid 634] (20014)
   Internal error: [client 10.64.32.103:50115]
   AH01102: error reading status line from remote server localhost:8080,
      referer: https://integration.wikimedia.org/ci/view/Default/job/beta-scap-eqiad/149900/console

It is somewhat rare but annoying:

zgrep -c AH01102 /var/log/apache2/integration_error.log*|sort -t . -k 3 -rn
/var/log/apache2/integration_error.log.30.gz:20
/var/log/apache2/integration_error.log.29.gz:12
/var/log/apache2/integration_error.log.28.gz:10
/var/log/apache2/integration_error.log.27.gz:19
/var/log/apache2/integration_error.log.26.gz:20
/var/log/apache2/integration_error.log.25.gz:5
/var/log/apache2/integration_error.log.24.gz:3
/var/log/apache2/integration_error.log.23.gz:21
/var/log/apache2/integration_error.log.22.gz:15
/var/log/apache2/integration_error.log.21.gz:19
/var/log/apache2/integration_error.log.20.gz:28
/var/log/apache2/integration_error.log.19.gz:16
/var/log/apache2/integration_error.log.18.gz:13
/var/log/apache2/integration_error.log.17.gz:8
/var/log/apache2/integration_error.log.16.gz:27
/var/log/apache2/integration_error.log.15.gz:16
/var/log/apache2/integration_error.log.14.gz:21
/var/log/apache2/integration_error.log.13.gz:21
/var/log/apache2/integration_error.log.12.gz:9
/var/log/apache2/integration_error.log.11.gz:6
/var/log/apache2/integration_error.log.10.gz:1
/var/log/apache2/integration_error.log.9.gz:19
/var/log/apache2/integration_error.log.8.gz:18
/var/log/apache2/integration_error.log.7.gz:25
/var/log/apache2/integration_error.log.6.gz:48
/var/log/apache2/integration_error.log.5.gz:24
/var/log/apache2/integration_error.log.4.gz:3
/var/log/apache2/integration_error.log.3.gz:4
/var/log/apache2/integration_error.log.2.gz:16
/var/log/apache2/integration_error.log.1:30
/var/log/apache2/integration_error.log:0

Event Timeline

Zppix created this task. Apr 7 2017, 11:17 PM
Restricted Application added a subscriber: Aklapper. Apr 7 2017, 11:17 PM
Paladox added a subscriber: Paladox. Edited Apr 7 2017, 11:18 PM

This always happens. The workaround is to refresh the page. Varnish looks like the problem.

Paladox edited projects, added Operations; removed Jenkins. Apr 7 2017, 11:18 PM
Paladox added a project: Jenkins.
Zppix added a comment. Apr 7 2017, 11:19 PM

Instead of a constant workaround, why don't we fix it?

It would be good to fix that problem.

Dzahn added a subscriber: Dzahn. Apr 7 2017, 11:22 PM

"Varnish looks like the problem."

I don't think so. Varnish says "Backend fetch failed". The Backend is integration.wm.org.

greg renamed this task from Jenkins Web UI error to Jenkins Web UI error - Backend fetch failed. Apr 11 2017, 9:03 PM
greg edited projects, added Continuous-Integration-Infrastructure; removed Operations.
hashar updated the task description. Apr 12 2017, 7:46 AM
hashar added a subscriber: hashar. Apr 12 2017, 8:12 AM

Yup, that has happened for ages and, in my experience, solely when a console is emitting output. User requests are handled by the misc cache (Nginx/Varnish) -> Apache mod_proxy -> Jenkins web service.

From the contint1001 Apache log in /var/log/apache2/integration_error.log:

[Fri Apr 07 23:15:31.491798 2017] [proxy_http:error] [pid 634] (20014)
   Internal error: [client 10.64.32.103:50115]
   AH01102: error reading status line from remote server localhost:8080,
      referer: https://integration.wikimedia.org/ci/view/Default/job/beta-scap-eqiad/149900/console

With:

  • 10.64.32.103: the Varnish cache cp1051.eqiad.wmnet
  • localhost:8080: the Jenkins web service

In the Jenkins web service access log:

[07/Apr/2017:23:15:30 +0000] "POST /ci/view/Default/job/beta-scap-eqiad/149900/logText/progressiveHtml HTTP/1.1" 200 0
[07/Apr/2017:23:15:31 +0000] "POST /ci/view/Default/job/beta-scap-eqiad/149900/logText/progressiveHtml HTTP/1.1" 200 0

Maybe Jenkins is overloaded with requests. In particular, the proxy configuration does not have any specific settings besides:

ProxyPass           /ci http://localhost:8080/ci
ProxyPassReverse    /ci http://localhost:8080/ci
ProxyRequests       Off

Maybe Apache mod_proxy should pool/reuse connections. According to the ProxyPass documentation, additional settings can be passed to fine-tune the number of connections. There is a table with all supported settings.
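
As a rough sketch of what such tuning could look like, the fragment below adds two connection-pool parameters from the mod_proxy documentation to the existing ProxyPass line. The values are illustrative assumptions, not tested settings for contint1001.

```apache
# Hypothetical values, for illustration only.
# max: upper bound on pooled backend connections per Apache child process
# ttl: seconds after which an idle pooled connection is not reused
ProxyPass           /ci http://localhost:8080/ci max=20 ttl=60
ProxyPassReverse    /ci http://localhost:8080/ci
ProxyRequests       Off
```

The idea behind a short ttl is to keep Apache from reusing a connection that Jenkins may already have closed, which is a commonly reported cause of "error reading status line". Whether that is what happens here is not confirmed by the logs above.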

hashar triaged this task as Normal priority. Apr 12 2017, 8:29 AM

I don't really know what is happening. The lack of logs besides AH01102: error reading status line from remote server localhost:8080 doesn't give much explanation as to what is actually happening :-(

Zppix added a comment. Apr 16 2017, 4:07 PM

@hashar is there any further info I could provide from my end of things?

Jenkins uses a rather old version of Jetty (the HTTP server that serves the web UI), version 8. Jenkins 2.x+ uses 9, so there could be a fix there that resolves this.

Paladox added a comment. Edited Apr 29 2017, 12:43 PM

See https://bz.apache.org/bugzilla/show_bug.cgi?id=37770 and https://access.redhat.com/solutions/54579

We could try setting the timeout to 600 or 900?

Also, we have to set nocanon on the ProxyPass per the Jenkins docs.

This could be a keepalive problem?

http://apache-http-server.18135.x6.nabble.com/Proxy-Error-td4754490.html
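
For reference, the suggestions above might translate into something like the following fragment. It is only a sketch: the timeout value is one of the two figures floated above, nocanon and AllowEncodedSlashes follow the Jenkins reverse-proxy documentation, and the two environment variables are the mod_proxy_http workarounds usually cited for the keepalive race described in Apache bug 37770. None of this has been tried on contint1001.

```apache
# Illustrative only; values and the choice of workaround are assumptions.
# nocanon passes the URL path to Jenkins without canonicalising it;
# the Jenkins documentation pairs it with AllowEncodedSlashes NoDecode.
AllowEncodedSlashes NoDecode
ProxyPass           /ci http://localhost:8080/ci nocanon timeout=600
ProxyPassReverse    /ci http://localhost:8080/ci

# Do not reuse a pooled backend connection for the first request
# on a client connection.
SetEnv proxy-initial-not-pooled 1
# Alternative: close the backend connection after every request.
# SetEnv proxy-nokeepalive 1
```

Both variables give up some of the benefit of connection reuse towards the backend, so they are a trade-off rather than a free fix.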

Please, no random speculations :)

Restricted Application added a project: User-Zppix. Jun 12 2019, 11:52 PM