tools-docker-builder-05.tools.eqiad.wmflabs firewall rules don't allow ssh from tools-clushmaster-01.eqiad.wmflabs
Closed, Resolved (Public)

Description

Today I ran clush -w @all 'sudo puppet agent --test' from tools-clushmaster-01.eqiad.wmflabs, and the output showed errors for some nodes.

In the case of tools-docker-builder-05.tools.eqiad.wmflabs, the output was:

[...]
tools-checker-01.tools.eqiad.wmflabs: Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
tools-docker-builder-05.tools.eqiad.wmflabs: ssh: connect to host tools-docker-builder-05.tools.eqiad.wmflabs port 22: Connection timed out
clush: tools-docker-builder-05.tools.eqiad.wmflabs: exited with exit code 255
tools-exec-1403.tools.eqiad.wmflabs: Info: Caching catalog for tools-exec-1403.tools.eqiad.wmflabs
[...]
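
With many nodes in a single clush run, a failure like this is easy to miss in the scrollback. A minimal sketch for pulling the failing node names out of a saved copy of the output, assuming the "clush: <node>: exited with exit code <N>" line format shown above. The function name failed_nodes and the log file name are hypothetical:

```shell
# failed_nodes: read clush output on stdin and print each node that clush
# reported as exiting with a non-zero code, one node per line.
# Assumes the "clush: <node>: exited with exit code <N>" format seen above.
failed_nodes() {
    sed -n 's/^clush: \([^:]*\): exited with exit code [1-9][0-9]*.*$/\1/p' | sort -u
}

# Hypothetical usage against a saved log:
#   clush -w @all 'sudo puppet agent --test' 2>&1 | tee clush-run.log
#   failed_nodes < clush-run.log
```

This only reports that a node failed, not why, so the matching ssh or puppet error line still needs to be read for each node it lists.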

@chasemp suggested that this is probably just a dead node.

Related to this issue (same clush run): T179387, T179386

Event Timeline

I can ssh into tools-docker-builder-05.tools.eqiad.wmflabs. I forced a puppet run there and it was a no-op.

tools-docker-builder-05.tools:~
bd808$ sudo -i puppet agent --test --verbose
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Info: Caching catalog for tools-docker-builder-05.tools.eqiad.wmflabs
Info: Applying configuration version '1509461821'
Notice: Finished catalog run in 7.07 seconds

The iptables firewall on tools-docker-builder-05 is restricting access to port 22:

$ sudo iptables -L -n|grep 22
ACCEPT     tcp  --  10.68.17.232         0.0.0.0/0            tcp dpt:22
ACCEPT     tcp  --  10.68.18.65          0.0.0.0/0            tcp dpt:22
ACCEPT     tcp  --  10.68.18.66          0.0.0.0/0            tcp dpt:22
ACCEPT     tcp  --  10.68.18.68          0.0.0.0/0            tcp dpt:22
ACCEPT     tcp  --  10.68.21.162         0.0.0.0/0            tcp dpt:22
ACCEPT     tcp  --  10.68.17.221         0.0.0.0/0            tcp dpt:22
ACCEPT     tcp  --  10.68.22.61          0.0.0.0/0            tcp dpt:22
ACCEPT     tcp  --  10.68.18.245         0.0.0.0/0            tcp dpt:22

I think these are the rules generated by Ferm::Rule['bastion-ssh'] in ::base::firewall.
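
Note that grep 22 matches any line containing "22", including addresses such as 10.68.22.61, so it can pick up rules that have nothing to do with ssh. A rough sketch that matches on the port field instead, assuming the iptables -L -n column layout shown above. ssh_allowed_sources is a hypothetical helper, and 192.0.2.10 is a placeholder for the clushmaster's address, which is not given in this task:

```shell
# ssh_allowed_sources: read `iptables -L -n` output on stdin and print the
# source column of every tcp ACCEPT rule whose last field is exactly "dpt:22".
ssh_allowed_sources() {
    awk '$1 == "ACCEPT" && $2 == "tcp" && $NF == "dpt:22" { print $4 }'
}

# Hypothetical usage: is a given source address in the ssh allow list?
# 192.0.2.10 is a placeholder; substitute the real source address.
#   sudo iptables -L -n | ssh_allowed_sources | grep -Fxq 192.0.2.10 \
#       && echo "listed" || echo "not listed"
```

The exact-match check only works for rules with a single source address, as in the listing above. A source given as a CIDR range would need a proper subnet comparison, and an address missing from the list is only blocked if the chain's policy or a later rule drops the traffic.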

bd808 renamed this task from "puppet agent issue with tools-docker-builder-05.tools.eqiad.wmflabs" to "tools-docker-builder-05.tools.eqiad.wmflabs firewall rules don't allow ssh from tools-clushmaster-01.eqiad.wmflabs". (Nov 3 2017, 10:06 PM)

Today I did a puppet run and saw the same error for another instance:

tools-services-01.tools.eqiad.wmflabs: ssh: connect to host tools-services-01.tools.eqiad.wmflabs port 22: Connection timed out

I will take a look.

aborrero triaged this task as Medium priority. (Jan 19 2018, 12:46 PM)
Stashbot added a subscriber: Stashbot.

Mentioned in SAL (#wikimedia-cloud) [2018-01-19T12:56:55Z] <arturo> the puppet status across the fleet seems good, only minor things like T185314 , T179388 and T179386

The port 22 ACCEPT rules listed earlier for tools-docker-builder-05 are no longer present on this server:

aborrero@tools-docker-builder-05:~$ sudo iptables-save
# Generated by iptables-save v1.4.21 on Fri Jan 26 17:48:50 2018
*nat
:PREROUTING ACCEPT [92421:9516896]
:INPUT ACCEPT [58845:7152936]
:OUTPUT ACCEPT [116012:9579714]
:POSTROUTING ACCEPT [116012:9579714]
:DOCKER - [0:0]
-A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER
-A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER
-A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
-A DOCKER -i docker0 -j RETURN
COMMIT
# Completed on Fri Jan 26 17:48:50 2018
# Generated by iptables-save v1.4.21 on Fri Jan 26 17:48:50 2018
*filter
:INPUT ACCEPT [1636254:709317301]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [2056413:823585058]
:DOCKER - [0:0]
:DOCKER-ISOLATION - [0:0]
-A FORWARD -j DOCKER-ISOLATION
-A FORWARD -o docker0 -j DOCKER
-A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT
-A DOCKER-ISOLATION -j RETURN
COMMIT
# Completed on Fri Jan 26 17:48:50 2018

The remaining problem when connecting to tools-docker-builder-05 from tools-clushmaster-01 now seems to be related to public key authentication:

aborrero@tools-clushmaster-01:~$ ssh tools-docker-builder-05.eqiad.wmflabs
Permission denied (publickey,hostbased).

But first, it would be good to know whether this is in fact a dead node.
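
The two ssh failures seen in this task point at different layers: "Connection timed out" means nothing answered on port 22 (packets dropped by a firewall, or the host is down), while "Permission denied" means sshd answered and rejected the credentials. A small sketch that sorts ssh error text into those buckets. classify_ssh_error is a hypothetical helper, and the host name in the usage comment is a placeholder:

```shell
# classify_ssh_error: read ssh's error output on stdin and print one of
# "timeout", "refused", "auth" or "unknown", based on the first match.
classify_ssh_error() {
    awk '
        /Connection timed out/ { print "timeout"; found = 1; exit }
        /Connection refused/   { print "refused"; found = 1; exit }
        /Permission denied/    { print "auth";    found = 1; exit }
        END { if (!found) print "unknown" }
    '
}

# Hypothetical usage; BatchMode avoids password prompts and ConnectTimeout
# bounds the wait. "some-host" is a placeholder.
#   ssh -o BatchMode=yes -o ConnectTimeout=10 some-host true 2>&1 | classify_ssh_error
```

An "auth" result implies an sshd is running at the address that was contacted, which is useful evidence when deciding whether a node is dead.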

As of today, this issue is no longer present.