
CI docker container fails to resolve DNS name: fatal: Unable to look up contint1001.wikimedia.org (port 9418) (Temporary failure in name resolution)
Closed, Resolved · Public

Description

The CI Docker containers can no longer do DNS lookups; for example, they fail to git fetch from contint1001.wikimedia.org.

$ ssh integration-agent-docker-1001.integration.eqiad.wmflabs
$ sudo docker run --rm -it --entrypoint=bash docker-registry.wikimedia.org/releng/ci-src-setup-simple:0.2.1

$ cat /etc/resolv.conf 
## THIS FILE IS MANAGED BY PUPPET
##
## source: modules/base/resolv.conf.labs.erb
## from:   base::resolving

domain integration.eqiad.wmflabs
search integration.eqiad.wmflabs eqiad.wmflabs 
nameserver 208.80.154.143
nameserver 208.80.154.24
options timeout:2 ndots:1

$ cd /src
$ git init .
$ git fetch git://contint1001.wikimedia.org/mediawiki/core
fatal: Unable to look up contint1001.wikimedia.org (port 9418) (Temporary failure in name resolution)
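
The manual check above can be narrowed down to the resolver itself. The following is a minimal sketch, not something that was run for this task: the function names `list_nameservers` and `can_resolve` are hypothetical helpers, and it assumes `awk` and `getent` are available in the container image.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of the CI images.

# Print the nameserver addresses configured in a resolv.conf-style file.
# Defaults to /etc/resolv.conf when no path is given.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Return 0 if the name resolves through the system resolver,
# non-zero otherwise.
can_resolve() {
    getent hosts "$1" >/dev/null 2>&1
}

# Example use inside the container (left commented out on purpose):
#   list_nameservers
#   can_resolve contint1001.wikimedia.org || echo "DNS lookup failed"
```

If `list_nameservers` prints the expected addresses but `can_resolve` fails, the resolver configuration is probably fine and the problem is more likely on the network path between the container and the nameservers.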

Event Timeline

hashar created this task. Sep 30 2019, 10:47 AM
Restricted Application added a subscriber: Aklapper. Sep 30 2019, 10:47 AM
hashar triaged this task as Unbreak Now! priority. Sep 30 2019, 10:47 AM
Restricted Application added a subscriber: Liuxinyu970226. Sep 30 2019, 10:47 AM
hashar updated the task description. Sep 30 2019, 10:51 AM
jbond added a subscriber: jbond. Sep 30 2019, 10:54 AM

Mentioned in SAL (#wikimedia-releng) [2019-09-30T10:59:39Z] <hashar> Restarted Docker on integration-agent-docker-1001 T234197

Mentioned in SAL (#wikimedia-operations) [2019-09-30T11:08:02Z] <hashar> Restarting Docker on CI agents to clear out some docker/iptables oddity # T234197

Mentioned in SAL (#wikimedia-releng) [2019-09-30T11:08:05Z] <hashar> Restarting Docker on CI agents to clear out some docker/iptables oddity # T234197

hashar closed this task as Resolved. Sep 30 2019, 11:15 AM
hashar claimed this task.

Somehow the ferm Debian package was installed on all WMCS instances. It is a utility that manages iptables rules. The package has since been removed, but Docker still needed a restart, for a reason that remains unknown.

I have manually restarted Docker on all instances, so it should be fine now.
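
For future occurrences, one way to spot this kind of state is to check whether Docker's iptables chains are still present. This is only a sketch under an assumption the task does not confirm, namely that ferm dropped the rules Docker had installed. With its default settings Docker declares a `DOCKER` chain in the `nat` table; the helper name `has_docker_nat_chain` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper for illustration.
# Reads iptables-save output on stdin and exits 0 if the nat table
# declares a DOCKER chain, 1 otherwise.
has_docker_nat_chain() {
    awk '
        /^\*/ { table = substr($0, 2) }                  # "*nat", "*filter", ... opens a table
        table == "nat" && $1 == ":DOCKER" { found = 1 }  # chain declaration line
        END { exit !found }
    '
}

# Example use on an agent (left commented out on purpose):
#   sudo iptables-save | has_docker_nat_chain || echo "DOCKER nat chain missing"
```

If the chain is missing, restarting the Docker daemon (for example with `systemctl restart docker` on a systemd host) makes it recreate its rules, which would be consistent with the restart having fixed the agents.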

Mentioned in SAL (#wikimedia-operations) [2019-09-30T11:20:47Z] <hashar> Restarting Docker on integration-agent-puppet-docker-1001 # T234197