Since Friday Feb 6th 23:30 UTC, puppet runs on instances of the integration labs project have been failing. The instances' puppet agents point to integration-puppetmaster.eqiad.wmflabs but time out when connecting to it. The puppet agent local to integration-puppetmaster works fine, though.
On one of the instances:
Warning: Unable to fetch my node definition, but the agent run will continue:
Warning: Connection timed out - connect(2)
Info: Retrieving plugin ...
Error: Could not retrieve catalog from remote server: Connection timed out - connect(2)
Notice: Using cached catalog
Info: Applying configuration version '1423261026'
Additionally, the Jenkins master on gallium ( 220.127.116.11 ) has no longer been able to ssh to integration-slave1001.eqiad.wmflabs ( 10.68.16.60 ) since I rebooted the instance.
gallium:~$ telnet 10.68.16.60 22
Trying 10.68.16.60...
telnet: Unable to connect to remote host: Connection timed out
I can still ssh to it from the labs bastion, however. The instance has some iptables rules, but they allow connections from gallium on port 22 (ssh).
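To repeat the same reachability checks from several hosts (gallium, the bastion, another instance), something like the following could be used in place of telnet. It is only a minimal sketch: the helper names `can_connect` and `check_targets` are made up for illustration, and port 8140 is an assumption based on the default puppet master port, since the port actually in use is not confirmed here.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and name resolution failures.
        return False


def check_targets(targets, timeout=5.0):
    """Map each (host, port) pair to whether a TCP connection could be opened."""
    return {(host, port): can_connect(host, port, timeout) for host, port in targets}


# Example targets taken from this report.
# Port 8140 is an assumption (default puppet master port).
TARGETS = [
    ("10.68.16.60", 22),
    ("integration-puppetmaster.eqiad.wmflabs", 8140),
]
```

Running `check_targets(TARGETS)` on gallium and again on the bastion would show which source hosts can reach which ports. A timeout and a refused connection both come back as `False`, so telnet is still the better tool for telling those two cases apart.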