
Cannot ssh to beta cluster instance deployment-apertium01
Closed, Resolved, Public

Description

SSH connection to deployment-apertium01.deployment-prep.eqiad.wmflabs times out :(

Event Timeline

hashar raised the priority of this task to Needs Triage.
hashar updated the task description.
hashar added subscribers: Aklapper, Luke081515, hashar and 2 others.

I have soft rebooted the instance via Horizon.

The instance is unreachable by SSH. OpenSSH did start but port 22 does not respond. Maybe the instance is firewalled :-/
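A minimal sketch for checking whether port 22 answers at all, assuming bash and coreutils `timeout` are available on the machine you test from. `port_open` is a hypothetical helper name, not something used in this task.

```shell
# Hypothetical helper: succeed if a TCP connection to host:port opens within 3 seconds.
port_open() {
    # bash's /dev/tcp pseudo-device opens a TCP socket;
    # timeout(1) kills an attempt that hangs and exits with status 124.
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Example:
#   port_open deployment-apertium01.deployment-prep.eqiad.wmflabs 22 \
#       || echo "port 22 not reachable (exit status $?)"
```

An exit status of 124 means the attempt timed out, which is what a firewall silently dropping packets usually looks like. A closed port with no firewall in the way tends to fail immediately instead.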

Maybe the best option would be to rebuild the apertium service on the beta cluster?

Looks like there are iptables rules in place preventing login. Running the following from deployment-salt allows logins, but I'm guessing only until the next puppet run...

sudo salt 'deployment-apertium01*' cmd.run 'iptables -I INPUT -p tcp --dport 22 -s 10.68.17.232 -j ACCEPT'

hashar claimed this task.

TL;DR: ferm was still present with outdated conf files. Removed ferm.

Ah I tried accessing with salt yesterday but it did not work for some reason :(

I had puppet disabled when I changed the beta cluster puppet master. A catalog from Thu Jul 23 08:10:48 2015 UTC failed with:

Error 403 on SERVER: Forbidden request: deployment-apertium01.deployment-prep.eqiad.wmflabs(10.68.16.79) access to /file_metadata/plugins

The next run with a catalog from Mon Jul 27 21:10:42 2015 UTC passed just fine.

Maybe the ferm rules were outdated / not taken into account. Puppet is running fine now and I managed to log in to the instance.

I have rebooted the instance. The ferm rule does not allow ssh from bastion.wmflabs.org (bastion-01, 10.68.17.232) although it is listed in manifests/network.pp.

Stopped ferm to regain access. /etc/ferm/conf.d/00_defs is missing the bastion IP. I deleted the file and ran puppet again but it was not regenerated.
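A quick way to confirm that an address is absent from the ferm definitions, sketched under the assumption that the file lists plain IP addresses. `defs_has_ip` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the ferm definitions file mentions the given IP.
defs_has_ip() {
    defs_file=$1
    ip=$2
    # -F: literal match (dots are not regex wildcards)
    # -w: whole-word match, so 10.68.17.2 does not match 10.68.17.232
    # -q: no output, only the exit status
    grep -qwF -- "$ip" "$defs_file"
}

# Example:
#   defs_has_ip /etc/ferm/conf.d/00_defs 10.68.17.232 || echo "bastion IP missing"
```

A missing file also yields a non-zero status, so the check fails safe.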

It seems ferm used to be applied on the instance, but base::firewall was carelessly removed from the puppet definition. The end result is that ferm is still present, with a configuration that is never updated. I have uninstalled it:

dpkg --purge ferm
rm -fR /etc/ferm /var/cache/ferm

It is all good now.
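To double-check a cleanup like the purge above, a small sketch that reports any of the given paths that still exist. `leftovers` is a hypothetical helper name, not part of dpkg or ferm.

```shell
# Hypothetical helper: print each given path that still exists.
# Returns 0 if none exist, 1 if at least one is left over.
leftovers() {
    found=0
    for path in "$@"; do
        if [ -e "$path" ]; then
            echo "still present: $path"
            found=1
        fi
    done
    return "$found"
}

# Example:
#   leftovers /etc/ferm /var/cache/ferm && echo "ferm directories are gone"
```

On a Debian-based instance, `dpkg -s ferm` exiting non-zero would additionally indicate that the package itself is no longer installed.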