
deployment-imagescaler01 can't reach puppetmaster.thumbor.eqiad.wmflabs
Closed, Duplicate (Public)

Description

I had set up this custom puppet master to test Thumbor-related puppet changes. It was definitely working the last time I used it for that, but according to @hashar the host can no longer be reached.

root@deployment-imagescaler01:~# puppet agent -tv
...
Warning: Unable to fetch my node definition, but the agent run will continue:
Warning: Connection refused - connect(2) for "puppetmaster.thumbor.eqiad.wmflabs" port 8140
# nc puppetmaster.thumbor.eqiad.wmflabs 8140
puppetmaster.thumbor.eqiad.wmflabs [10.68.22.53] 8140 (?) : Connection refused

Event Timeline

hashar updated the task description.

Either the puppet master is down on puppetmaster.thumbor.eqiad.wmflabs or some firewall rule prevents access to it:

# nc puppetmaster.thumbor.eqiad.wmflabs 8140
puppetmaster.thumbor.eqiad.wmflabs [10.68.22.53] 8140 (?) : Connection refused

Note that the firewall rule could be either in ferm (check with iptables --list -n) or in the labs security rules, which IIRC prevent access from instances outside the project. In the latter case, a rule would need to be added in Horizon.
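
The two cases can often be told apart from the client side: "Connection refused" usually means nothing is listening on the port (or a rule actively rejects the connection), whereas a rule that silently drops packets tends to show up as a timeout. A minimal sketch, assuming bash with /dev/tcp support and the coreutils timeout command; probe_port is a hypothetical helper name, not an existing tool:

```shell
#!/usr/bin/env bash
# probe_port HOST PORT [SECONDS]
# Hypothetical helper: prints "open", "timeout", or "refused-or-error".
probe_port() {
    local host=$1 port=$2 secs=${3:-3} status=0
    # Open a TCP connection via bash's /dev/tcp pseudo-device in a child
    # shell, so the descriptor is closed again when the child exits.
    timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" \
        2>/dev/null || status=$?
    case $status in
        0)   echo "open" ;;
        124) echo "timeout" ;;           # timeout(1) gave up waiting
        *)   echo "refused-or-error" ;;  # refused, DNS failure, etc.
    esac
}

# Example with the host and port from this task:
# probe_port puppetmaster.thumbor.eqiad.wmflabs 8140
```

The nc output in this task already reports "Connection refused", which points more towards nothing listening on port 8140 than towards a rule dropping packets.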

The puppet master appears to be shut down at the moment.

The machine uses the role::puppetmaster::standalone role: https://horizon.wikimedia.org/project/instances/f7a60657-ce9a-436f-8961-5550ef03c4ef/

Apache startup appears to be failing, which must be where the issue started:

gilles@puppetmaster:~$ sudo puppet agent -t
Warning: Setting templatedir is deprecated. See http://links.puppetlabs.com/env-settings-deprecations
   (at /usr/lib/ruby/vendor_ruby/puppet/settings.rb:1139:in `issue_deprecation_warning')
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Info: Caching catalog for puppetmaster.thumbor.eqiad.wmflabs
Info: Applying configuration version '1489406169'
Notice: /Stage[main]/Ssh::Server/File[/etc/ssh/userkeys/root.d]: Not removing directory; use 'force' to override
Notice: /Stage[main]/Ssh::Server/File[/etc/ssh/userkeys/root.d]/ensure: removed
Error: Could not start Service[apache2]: Execution of '/usr/sbin/service apache2 start' returned 1: Job for apache2.service failed. See 'systemctl status apache2.service' and 'journalctl -xn' for details.
Error: /Stage[main]/Apache/Service[apache2]/ensure: change from stopped to running failed: Could not start Service[apache2]: Execution of '/usr/sbin/service apache2 start' returned 1: Job for apache2.
Mar 13 11:59:29 puppetmaster apache2[11688]: AH00526: Syntax error on line 8 of /etc/apache2/sites-enabled/50-puppetmaster-wikimedia-org.conf:
Mar 13 11:59:29 puppetmaster apache2[11688]: Invalid command 'SSLOpenSSLConfCmd', perhaps misspelled or defined by a module not included in the server configuration
Mar 13 11:59:29 puppetmaster apache2[11688]: Action 'configtest' failed.
Mar 13 11:59:29 puppetmaster apache2[11688]: The Apache error log may have more information.
Mar 13 11:59:29 puppetmaster systemd[1]: apache2.service: control process exited, code=exited status=1

https://httpd.apache.org/docs/trunk/mod/mod_ssl.html#sslopensslconfcmd

Available in httpd 2.4.8 and later, if using OpenSSL 1.0.2 or later

gilles@puppetmaster:~$ dpkg -l apache2 openssl
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                             Version               Architecture          Description
+++-================================-=====================-=====================-=====================================================================
ii  apache2                          2.4.10-10+deb8u8      amd64                 Apache HTTP Server
ii  openssl                          1.0.2j-1~wmf1         amd64                 Secure Sockets Layer toolkit - cryptographic utility
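
Comparing the installed versions against the minimums quoted from the mod_ssl documentation can be scripted. A rough sketch, assuming GNU sort with -V (version sort); version_ge is a hypothetical helper, and on Debian dpkg --compare-versions is the native tool for full package version strings:

```shell
#!/usr/bin/env bash
# version_ge A B: succeeds if version A >= version B.
# Hypothetical helper built on GNU sort -V; it compares upstream version
# numbers only, not full Debian package revisions.
version_ge() {
    # If B sorts first (or the two are equal), then A >= B.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Upstream versions taken from the dpkg -l output above.
apache_version="2.4.10"
openssl_version="1.0.2j"

if version_ge "$apache_version" "2.4.8" && version_ge "$openssl_version" "1.0.2"; then
    echo "documented minimums met"
else
    echo "documented minimums not met"
fi
```

Both installed versions are at or above the stated minimums, so the version numbers alone do not explain the 'Invalid command' error. A check like this only rules out the obvious case; it cannot tell whether the installed apache2 build actually provides the directive.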

Updating apache2 as indicated in that ticket fixed the issue. With Apache restored on puppetmaster.thumbor, puppet ran fine on deployment-imagescaler01.