Webproxy on carbon unreachable from labs instances since Dec 24 roughly 1am
Closed, Invalid (Public)

Assigned To

Authored By

	hashar
	Dec 26 2015, 9:17 PM

Description

Since December 24th at roughly 1am, a number of CI jobs have been failing (T122449). The cause seems to be that the web proxy webproxy.eqiad.wmnet on port 8080 is no longer reachable from labs instances.

integration-slave-trusty-1011:~$ curl --verbose -4 --proxy webproxy.eqiad.wmnet:8080 https://meta.wikimedia.org/wiki/Main_Page
* Hostname was NOT found in DNS cache
*   Trying 208.80.154.10...

I can ping carbon just fine from labs.
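
Since ping succeeds while curl stalls at "Trying 208.80.154.10...", a plain TCP connect with a timeout may help separate "host is up" from "port 8080 is reachable". The following is only a minimal sketch: the helper name `tcp_reachable` and the 5 second default timeout are assumptions made for illustration, not part of any existing CI tooling.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port completes within `timeout` seconds.

    Hypothetical helper for illustration only. A successful ICMP ping says
    nothing about whether a given TCP port is open or filtered, so this
    attempts the same kind of connection curl would make to the proxy.
    """
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and DNS resolution failures.
        return False
```

Run from a labs instance, `tcp_reachable("webproxy.eqiad.wmnet", 8080)` would be expected to return False if the port is filtered or unreachable, and True from a host where the proxy works.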

It works fine from a production host such as gallium:

Looking at Icinga for Carbon:

Service	Status	Duration	Message
Squid	OK	93d 10h 25m 16s	TCP OK - 0.001 second response time on port 8080
Puppet	Warning	2d 17h 31m 19s	WARNING: Puppet is currently disabled, last run 2 days ago with 0 failures

The time at which Puppet was disabled seems to correlate with the start of the CI issues.

So it seems to me that carbon has some live hack (Puppet is disabled) that prevents its web proxy from being reachable from labs :-/

Related Objects

		Status	Subtype	Assigned	Task
		Resolved		hashar	T122449 Jenkins job mediawiki-extensions-qunit Karma timeout on odd-numbered build slaves
		Invalid		faidon	T122461 Webproxy on carbon unreachable from labs instances since Dec 24 roughly 1am

Event Timeline

hashar created this task. Dec 26 2015, 9:17 PM

hashar raised the priority of this task to High.

hashar updated the task description.

hashar added projects: Continuous-Integration-Infrastructure, SRE.

hashar added subscribers: Paladox, gerritbot, hashar and 2 others.

See T122368. Why do you need to use the webproxy? Labs instances have Internet connectivity via NAT so the webproxy shouldn't be needed.

We pointed the MediaWiki configuration on CI to a proxy because some hosts had no direct access to the Internet (prod slaves in 10.0.0.0/8). That is no longer the case, so I will just get rid of the proxy.

Maven was still being routed via the webproxy: T122594

hashar mentioned this in T122594: WDQS builds fail due to network issues. Jan 4 2016, 10:34 AM
