Puppet failing with certificate errors on deployment-prep
Closed, Resolved · Public

Description

root@deployment-salt:/var/lib/git/operations/puppet (production) # puppet agent -tv
Warning: Unable to fetch my node definition, but the agent run will continue:
Warning: SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed: [self signed certificate in certificate chain for /CN=Puppet CA: i-0000015c.eqiad.wmflabs]
Info: Retrieving plugin
Error: /File[/var/lib/puppet/lib]: Failed to generate additional resources using 'eval_generate': SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed: [self signed certificate in certificate chain for /CN=Puppet CA: i-0000015c.eqiad.wmflabs]
Error: /File[/var/lib/puppet/lib]: Could not evaluate: SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed: [self signed certificate in certificate chain for /CN=Puppet CA: i-0000015c.eqiad.wmflabs] Could not retrieve file metadata for puppet://deployment-salt.eqiad.wmflabs/plugins: SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed: [self signed certificate in certificate chain for /CN=Puppet CA: i-0000015c.eqiad.wmflabs]
Info: Loading facts in /etc/puppet/modules/base/lib/facter/ec2id.rb
Info: Loading facts in /etc/puppet/modules/base/lib/facter/physicalcorecount.rb
Info: Loading facts in /etc/puppet/modules/base/lib/facter/initsystem.rb
Info: Loading facts in /etc/puppet/modules/base/lib/facter/lldp.rb
Info: Loading facts in /etc/puppet/modules/apt/lib/facter/apt.rb
Info: Loading facts in /etc/puppet/modules/stdlib/lib/facter/pe_version.rb
Info: Loading facts in /etc/puppet/modules/stdlib/lib/facter/root_home.rb
Info: Loading facts in /etc/puppet/modules/stdlib/lib/facter/puppet_vardir.rb
Info: Loading facts in /etc/puppet/modules/ganeti/lib/facter/ganeti.rb
Info: Loading facts in /etc/puppet/modules/puppet_statsd/lib/facter/puppet_config_dir.rb
Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
Info: Loading facts in /var/lib/puppet/lib/facter/puppet_config_dir.rb
Info: Loading facts in /var/lib/puppet/lib/facter/ec2id.rb
Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
Info: Loading facts in /var/lib/puppet/lib/facter/ganeti.rb
Info: Loading facts in /var/lib/puppet/lib/facter/apt.rb
Info: Loading facts in /var/lib/puppet/lib/facter/physicalcorecount.rb
Info: Loading facts in /var/lib/puppet/lib/facter/initsystem.rb
Info: Loading facts in /var/lib/puppet/lib/facter/lldp.rb
Error: Could not retrieve catalog from remote server: SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed: [self signed certificate in certificate chain for /CN=Puppet CA: i-0000015c.eqiad.wmflabs]
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run
Error: Could not send report: SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed: [self signed certificate in certificate chain for /CN=Puppet CA: i-0000015c.eqiad.wmflabs]
root@deployment-salt:/var/lib/git/operations/puppet (production) #

Event Timeline

yuvipanda raised the priority of this task from to Needs Triage.
yuvipanda updated the task description.
yuvipanda subscribed.
greg triaged this task as Unbreak Now! priority. Apr 9 2015, 5:41 PM
greg added a project: Puppet.
greg set Security to None.
hashar subscribed.

The puppet failures were due to the hostname of the puppetmaster changing. That causes puppetmaster self to no longer recognize the host as being the master, so it alters puppet.conf and removes the [master] section. The puppetmaster process is still running though, and it ends up using the SSL cert of the client, which is configured in the [main] section.

Notice: /Stage[main]/Puppet::Self::Config/File[/etc/puppet/puppet.conf.d/10-self.conf]/content:
--- /etc/puppet/puppet.conf.d/10-self.conf 2015-03-13 19:55:22.551467834 +0000
+++ /tmp/puppet-file20150409-10851-13ijz3i-0 2015-04-09 12:51:24.041629301 +0000
@@ -3,7 +3,7 @@
 [main]
 logdir = /var/log/puppet
 vardir = /var/lib/puppet
-ssldir = /var/lib/puppet/server/ssl
+ssldir = /var/lib/puppet/client/ssl
 rundir = /var/run/puppet
 factpath = $vardir/lib/facter

@@ -15,23 +15,5 @@
 postrun_command = /etc/puppet/etckeeper-commit-post
 pluginsync = true
 report = true
-certname = i-0000015c.eqiad.wmflabs
+certname = i-0000015c.deployment-prep.eqiad.wmflabs

-[master]
-bindaddress = 10.68.16.99
-ca_md = sha1
-certname = i-0000015c.eqiad.wmflabs
-thin_storeconfigs = true
-templatedir = /etc/puppet/templates
-modulepath = /etc/puppet/private/modules:/etc/puppet/modules
-
-# SSL
-ssldir = /var/lib/puppet/server/ssl/
-ssl_client_header = SSL_CLIENT_S_DN
-ssl_client_verify_header = SSL_CLIENT_VERIFY
-hostcert = /var/lib/puppet/server/ssl/certs/deployment-salt.eqiad.wmflabs.pem
-hostprivkey = /var/lib/puppet/server/ssl/private_keys/deployment-salt.eqiad.wmflabs.pem
-
-dbadapter = sqlite3
-external_nodes = /usr/local/bin/ldap-yaml-enc.py
-node_terminus = exec
accurately describe the diff that happened.

Integration had the exact same issue: T95273.

hashar claimed this task.

OK, solved! This was the exact same issue as on the integration and staging projects. Changing the hostname causes the puppetmaster to be reverted to a simple client, which breaks everything.

The fix is to reverse-patch puppet.conf. In particular, the [master] section should have ssldir = /var/lib/puppet/server/ssl (note: /server/).
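For anyone who wants to check a host for this regression before restarting anything, here is a rough sketch in Python. It only looks for the two symptoms described in this task: a missing [master] section, or a [master] ssldir that is not the server one. The function name check_self_hosted_master is made up for illustration, and treating puppet.conf as plain INI via configparser is an assumption that holds for the snippet pasted above but may not hold for every puppet.conf.

```python
import configparser

# Path taken from the fix described in this task.
SERVER_SSLDIR = "/var/lib/puppet/server/ssl"


def check_self_hosted_master(conf_text):
    """Return a list of problems found in a puppet.conf-style text.

    Hypothetical helper: it approximates puppet.conf as INI, which is an
    assumption, not how Puppet itself parses the file.
    """
    # interpolation=None leaves values such as "$vardir/lib/facter" untouched;
    # strict=False tolerates repeated keys or sections.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)

    problems = []
    if not parser.has_section("master"):
        # This is the state the diff above left the host in.
        problems.append("missing [master] section")
        return problems

    # Ignore a trailing slash so ".../server/ssl/" and ".../server/ssl" match.
    ssldir = parser.get("master", "ssldir", fallback="").rstrip("/")
    if ssldir != SERVER_SSLDIR:
        problems.append(
            "[master] ssldir is %r, expected %r" % (ssldir, SERVER_SSLDIR)
        )
    return problems


if __name__ == "__main__":
    # Minimal sample mirroring the broken config from the diff above.
    sample = (
        "[main]\n"
        "ssldir = /var/lib/puppet/client/ssl\n"
        "certname = i-0000015c.deployment-prep.eqiad.wmflabs\n"
    )
    for problem in check_self_hosted_master(sample):
        print(problem)
```

An empty list means neither symptom was found. It does not prove the running puppetmaster process has picked up the corrected config, so a restart is still needed after fixing the file.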

The puppetmaster was deadlocked somehow, so I had to kill -9 it.

Everything went back to normal once it restarted with the proper conf.

Restricted Application added subscribers: Jay8g, TerraCodes.