
Set up and use exported resources for Tool Labs's shared knowledge
Closed, Resolved · Public

Description

The Labs puppetmaster has never supported exported resources, and supporting them is technically impossible in a multi-project environment. This led to the current setup where, for example, each host writes its ssh host key into a file on NFS, grid hosts identify themselves in the same way, etc.

However, all Toolforge instances are now served by a project-specific puppetmaster. If PuppetDB is set up on this puppetmaster, then (AFAIUI) exported resources should work and could be used, for example, for:

  • Sharing ssh host keys natively,
  • using simple puppetry to identify submit hosts, execution nodes, etc., and
  • sharing credentials for Kubernetes, i.e. no cherry-picks on labs/private necessary.

At the moment, role::puppetmaster::standalone does not seem to have an easy option to enable PuppetDB, so this needs to be done first.
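
The ssh host key case from the list above could look roughly like the following. This is only a minimal sketch under stated assumptions: the class names `example::hostkey_export` and `example::hostkey_collect` and the tag `example_shared_hostkey` are hypothetical and not part of operations/puppet, and it assumes the legacy `sshrsakey` fact is available on the exporting host. It only works once the puppetmaster stores exported resources in PuppetDB.

```puppet
# Sketch only: hypothetical class names and tag, not taken from operations/puppet.

# Applied on every host that should publish its key.
class example::hostkey_export {
    # '@@' marks the resource as exported: it is stored in PuppetDB
    # rather than applied on this host.
    @@sshkey { $::fqdn:
        ensure       => present,
        type         => 'ssh-rsa',
        key          => $::sshrsakey,      # assumes the legacy sshrsakey fact
        host_aliases => [ $::hostname, $::ipaddress ],
        tag          => 'example_shared_hostkey',
    }
}

# Applied on every host that should trust the published keys.
class example::hostkey_collect {
    # '<<| |>>' collects matching exported resources from all hosts
    # and realizes them here, which writes them into ssh_known_hosts.
    Sshkey <<| tag == 'example_shared_hostkey' |>>
}
```

The same export-and-collect pattern would presumably apply to the other two items, with a different resource type in place of `sshkey`.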

Event Timeline

Restricted Application added a subscriber: Aklapper.
scfc triaged this task as Low priority. Dec 18 2016, 12:24 AM
scfc added a parent task: T152866: Make clush safer.
scfc moved this task from Backlog to Ready to be worked on on the Toolforge board.

Change 329382 had a related patch set uploaded (by Tim Landscheidt):
Tools: Use exported resources for ssh host keys

https://gerrit.wikimedia.org/r/329382

This is what I've got in deployment-prep:

Puppet patch
commit d5bb8d6e1228b565bf92c810378e2bc6b2c1fc9a
Author: Alex Monk <krenair@deployment-puppetmaster02.deployment-prep.eqiad.wmflabs>
Date:   Sun Jan 22 00:16:00 2017 +0000

    puppetdb for deployment-prep

diff --git a/modules/role/files/puppetdb/tuning.conf b/modules/role/files/puppetdb/tuning.conf
index 8c8431c..ee48f6e 100644
--- a/modules/role/files/puppetdb/tuning.conf
+++ b/modules/role/files/puppetdb/tuning.conf
@@ -3,5 +3,5 @@ checkpoint_completion_target = 0.9
 effective_cache_size = 8GB
 work_mem = 192MB
 wal_buffers = 8MB
-shared_buffers = 7680MB
+shared_buffers = 768MB
 max_connections = 120
diff --git a/modules/role/manifests/puppetmaster/standalone.pp b/modules/role/manifests/puppetmaster/standalone.pp
index 7cab601..cb39e1b 100644
--- a/modules/role/manifests/puppetmaster/standalone.pp
+++ b/modules/role/manifests/puppetmaster/standalone.pp
@@ -68,6 +68,12 @@ class role::puppetmaster::standalone(
         group  => 'root',
     }
 
+    class { '::role::puppetmaster::common':
+        base_config => {
+            ca        => true,
+            ca_server => 'deployment-puppetmaster02.deployment-prep.eqiad.wmflabs',
+        }
+    }
     $config = {
         'node_terminus'     => 'exec',
         'external_nodes'    => '/usr/local/bin/puppet-enc',
@@ -82,7 +88,7 @@ class role::puppetmaster::standalone(
         include_conftool    => false,
         prevent_cherrypicks => $prevent_cherrypicks,
         extra_auth_rules    => $extra_auth_rules,
-        config              => $config,
+        config              => merge($config, $::role::puppetmaster::common::config),
     }
 
     # Update git checkout
diff --git a/modules/ssh/manifests/client.pp b/modules/ssh/manifests/client.pp
index 8cf8808..ec742f8 100644
--- a/modules/ssh/manifests/client.pp
+++ b/modules/ssh/manifests/client.pp
@@ -4,7 +4,7 @@ class ssh::client {
     }
 
     # no exported resources on Labs == no sshknowngen
-    if $::realm == 'production' {
+    if $::realm == 'production' or $::labsproject == 'deployment-prep' {
         if $::use_puppetdb {
             file { '/etc/ssh/ssh_known_hosts':
                 content => template('ssh/known_hosts.erb'),
diff --git a/modules/ssh/manifests/server.pp b/modules/ssh/manifests/server.pp
index 5d95f9c..20e1a8e 100644
--- a/modules/ssh/manifests/server.pp
+++ b/modules/ssh/manifests/server.pp
@@ -72,11 +72,17 @@ class ssh::server (
         err("No valid SSH host key found for ${::fqdn}")
     }
 
+    if $::ipaddress6 != undef {
+        $aliases = [ $::hostname, $::ipaddress, $::ipaddress6 ]
+    } else {
+        $aliases = [ $::hostname, $::ipaddress ]
+    }
+
     debug("Storing ${type} SSH hostkey for ${::fqdn}")
     @@sshkey { $::fqdn:
         ensure       => present,
         type         => $type,
         key          => $key,
-        host_aliases => [ $::hostname, $::ipaddress, $::ipaddress6 ],
+        host_aliases => $aliases,
     }
 }

As well as the hiera data at the bottom of Hiera:Deployment-prep on wikitech.

Obviously, this needs cleaning up for use elsewhere.

  • I prefer your approach to ssh::client. Let's split that and get it reviewed first?
  • I have ssh::server handling whether to add an IPv6 address as an alias or not, whereas you figure it out in the template. I prefer my solution. (Ultimately I think everything still works if we don't change this, but...)
  • We need to figure out how to handle role::puppetmaster::standalone properly.
  • You seem to lack a change to your PuppetDB PostgreSQL tuning.conf. I don't think we want to be creating large instances for this. We obviously need to do this properly with a template + hiera.

Change 333471 had a related patch set uploaded (by Alex Monk):
Allow use of PuppetDB in labs for sshknowngen

https://gerrit.wikimedia.org/r/333471

Change 333472 had a related patch set uploaded (by Alex Monk):
ssh: Don't add IPv6 address as an alias in exported resource if it's undefined

https://gerrit.wikimedia.org/r/333472

Change 333473 had a related patch set uploaded (by Alex Monk):
puppetdb: Allow tuning.conf to have a different shared_buffers value

https://gerrit.wikimedia.org/r/333473

With those three patches, we're left with the role::puppetmaster::standalone thing.

Change 333473 merged by Alexandros Kosiaris:
puppetdb: Allow tuning.conf to have a different shared_buffers value

https://gerrit.wikimedia.org/r/333473

Change 333472 merged by Filippo Giunchedi:
ssh: Don't add IPv6 address as an alias in exported resource if it's undefined

https://gerrit.wikimedia.org/r/333472

Change 333471 merged by Alexandros Kosiaris:
[operations/puppet@production] Allow use of PuppetDB in labs for ssh_known_hosts

https://gerrit.wikimedia.org/r/333471

Is this done or has tools' puppetmaster not been set up for puppetdb stuff?

It would appear that this is not set up on the tools puppetmaster at this time:

Jun 27 18:01:06 tools-puppetmaster-01 puppet-master[19573]: You cannot collect exported resources without storeconfigs being set; the export is ignored at /etc/puppet/modules/monitoring/

puppetdb-terminus appears to be installed. I don't see it configured, though.
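
For reference, the "storeconfigs" message above points at two puppet.conf settings that the PuppetDB documentation asks for on the master. A minimal sketch, written as a hash in the style of `$config` in role::puppetmaster::standalone; the variable name `$puppetdb_config` is hypothetical, and how such a hash would actually be merged into the tools puppetmaster's configuration is an assumption, not something confirmed in this task:

```puppet
# Sketch only: $puppetdb_config is a hypothetical name.
# These two keys correspond to puppet.conf settings on the master;
# without them, exported resources are ignored, as in the log line above.
$puppetdb_config = {
    'storeconfigs'         => true,
    'storeconfigs_backend' => 'puppetdb',
}
```

The terminus also needs to be told where PuppetDB lives (a puppetdb.conf next to puppet.conf), which is not shown here.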

This is now enabled on toolsbeta, btw.

How's it going there? Are we still interested in using it in the tools project?

This works in toolsbeta now. It handles ssh knownhosts using puppetdb. We just never deployed the new puppetdb and puppetmaster fully. It was started and then abandoned. I'd love to have this in tools for moving more resources away from NFS dependency, but there hasn't been time.

I figured I'd clean up puppet problems in the toolsbeta project before going near tools proper.

In particular, toolsbeta-puppetdb-01 itself had a puppet error. I set profile::puppetdb::filter_job_id: true and profile::puppetdb::microservice::enabled: False (required hiera keys for puppetdb hosts now), ran into Nrpe::Monitor_service[uwsgi-puppetdb-microservice]: expects a value for parameter 'notes_url' at /etc/puppet/modules/uwsgi/manifests/app.pp:77 and cherry-picked https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/529590/ to deal with that. Now we've just got ferm complaining about AAAA records (T153468: Ferm's upstream Net::DNS Perl library questionable handling of NOERROR responses without records causing puppet errors when we try to @resolve AAAA in labs, sigh), and this stuff that doesn't look relevant in labs:

Error: Could not disable prometheus-node-exporter-ipmitool-sensor.timer: 
Error: /Stage[main]/Prometheus::Node_exporter/Service[prometheus-node-exporter-ipmitool-sensor.timer]/enable: change from false to mask failed: Could not disable prometheus-node-exporter-ipmitool-sensor.timer: 
Error: Could not disable prometheus-node-exporter-smartmon.timer: 
Error: /Stage[main]/Prometheus::Node_exporter/Service[prometheus-node-exporter-smartmon.timer]/enable: change from false to mask failed: Could not disable prometheus-node-exporter-smartmon.timer:

(on a complete tangent I also re-made certs for the toolsbeta-paws* hosts as those seemed completely broken, no puppet runs since April)

Sorted puppet out on toolsbeta-puppetdb-01 (puppetmaster::servers entry had no loadfactor).

I've created tools-puppetdb-01 (buster) in the tools project. I see the tools-puppetmaster is jessie so we may need to replace it, we shouldn't make new jessie hosts :)

Based on T243226#5843560 (can't use a stretch puppetmaster with a buster puppetdb) our jessie puppetmaster in tools will be useless, we'll need a new (buster) puppetmaster, and we'll need one soon anyway given the upcoming removal of jessie support in general, so I guess I'll make a task for that.

taavi subscribed.

(resetting assignee based on subtask)

Change 779051 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] P:toolforge: use puppetdb for grid hba data

https://gerrit.wikimedia.org/r/779051

Change 779051 merged by Arturo Borrero Gonzalez:

[operations/puppet@production] P:toolforge: use puppetdb for grid hba data

https://gerrit.wikimedia.org/r/779051

taavi claimed this task.