CI docker container fails to resolve DNS name: fatal: Unable to look up contint1001.wikimedia.org (port 9418) (Temporary failure in name resolution)
Closed, Resolved (Public)

Description

The CI Docker containers can no longer do DNS lookups. For example, they fail to git fetch from contint1001.wikimedia.org:

$ ssh integration-agent-docker-1001.integration.eqiad.wmflabs
$ sudo docker run --rm -it --entrypoint=bash docker-registry.wikimedia.org/releng/ci-src-setup-simple:0.2.1

$ cat /etc/resolv.conf 
## THIS FILE IS MANAGED BY PUPPET
##
## source: modules/base/resolv.conf.labs.erb
## from:   base::resolving

domain integration.eqiad.wmflabs
search integration.eqiad.wmflabs eqiad.wmflabs 
nameserver 208.80.154.143
nameserver 208.80.154.24
options timeout:2 ndots:1

$ cd /src
$ git init .
$ git fetch git://contint1001.wikimedia.org/mediawiki/core
fatal: Unable to look up contint1001.wikimedia.org (port 9418) (Temporary failure in name resolution)
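
The manual reproduction above could also be scripted. This is a minimal sketch, not part of the CI tooling: the helper names parse_nameservers and can_resolve are hypothetical, and it assumes Python 3 is available inside the container. It lists the nameservers configured in a resolv.conf-style text and attempts the same lookup that git performs for the git:// port (9418).

```python
import socket


def parse_nameservers(resolv_conf_text):
    """Return the nameserver addresses listed in resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        # Skip blank lines and comments ('#' or ';' start a comment).
        if not fields or fields[0].startswith(("#", ";")):
            continue
        if fields[0] == "nameserver" and len(fields) >= 2:
            servers.append(fields[1])
    return servers


def can_resolve(hostname, port=9418):
    """Return (True, None) if the name resolves, else (False, error text).

    9418 is the git:// port shown in the error message of this task.
    """
    try:
        socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
        return True, None
    except socket.gaierror as exc:
        # A resolver failure surfaces here as a gaierror.
        return False, str(exc)
```

Inside a container, calling parse_nameservers(open("/etc/resolv.conf").read()) followed by can_resolve("contint1001.wikimedia.org") would show whether the resolvers are configured as expected while the lookup itself still fails, which is the situation described in this task.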

Event Timeline

hashar triaged this task as Unbreak Now! priority. Sep 30 2019, 10:47 AM

Mentioned in SAL (#wikimedia-releng) [2019-09-30T10:59:39Z] <hashar> Restarted Docker on integration-agent-docker-1001 T234197

Mentioned in SAL (#wikimedia-operations) [2019-09-30T11:08:02Z] <hashar> Restarting Docker on CI agents to clear out some docker/iptables oddity # T234197

Mentioned in SAL (#wikimedia-releng) [2019-09-30T11:08:05Z] <hashar> Restarting Docker on CI agents to clear out some docker/iptables oddity # T234197

hashar claimed this task.

Somehow the ferm Debian package was installed on all WMCS instances. It is a utility that manages iptables rules. The package has been removed, but Docker still needed a restart, for a reason that has not been identified.

I have manually restarted Docker on all instances, so it should be fine now.
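
The reason Docker needed a restart is not established in this task. One check that might help narrow it down, offered as an assumption rather than a confirmed diagnosis: the Docker daemon declares its own iptables chains (such as one named DOCKER) when it starts, so comparing the output of `iptables -S` before and after a restart would show whether those chains had gone missing. The helpers defined_chains and docker_chain_missing below are hypothetical names for a sketch that parses such output.

```python
def defined_chains(iptables_rules_text):
    """Return the user-defined chains declared in `iptables -S` style output.

    In that output, each user-defined chain appears as a line `-N <chain>`.
    """
    chains = set()
    for line in iptables_rules_text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "-N":
            chains.add(fields[1])
    return chains


def docker_chain_missing(iptables_rules_text, chain="DOCKER"):
    """Return True if the given chain is not declared in the rules text.

    Assumption: a missing DOCKER chain hints that Docker's rules were
    removed; this task does not confirm that this is what happened.
    """
    return chain not in defined_chains(iptables_rules_text)
```

The text passed in would come from running `sudo iptables -S` (and `sudo iptables -t nat -S` for the NAT table) on the agent. A True result would only suggest that a Docker restart is needed to recreate the rules.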

Mentioned in SAL (#wikimedia-operations) [2019-09-30T11:20:47Z] <hashar> Restarting Docker on integration-agent-puppet-docker-1001 # T234197