
Document all uses of the puppetCA certificate
Open, Medium, Public

Description

As we are due to update the Puppet CA, we need to know every location where the certificate is currently used. The following list is a good starting point but may not be complete:

  • keys created using cergen
  • users of base::expose_puppet_certs
  • the few users that either reference puppet_ssldir() or hardcode the /var/lib/puppet/ssl directory
  • Any helm charts that need our CA cert: eventgate
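A search like the one described above could be sketched as follows. This is only an illustration: the pattern list is taken from the bullets in this description, while the function name `find_references` and the idea of scanning a local checkout are assumptions, not something used on this task.

```python
import os

# Strings taken from the task description; extend as needed.
PATTERNS = (
    "base::expose_puppet_certs",
    "puppet_ssldir(",
    "/var/lib/puppet/ssl",
    "Puppet_Internal_CA.crt",
)


def find_references(root, patterns=PATTERNS):
    """Return sorted (path, line_number, pattern) tuples for every match under root."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into version-control metadata.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # errors="replace" keeps binary files from aborting the scan.
                with open(path, encoding="utf-8", errors="replace") as fh:
                    for lineno, line in enumerate(fh, start=1):
                        for pattern in patterns:
                            if pattern in line:
                                hits.append((path, lineno, pattern))
            except OSError:
                # Unreadable file (permissions, broken symlink): skip it.
                continue
    return sorted(hits)
```

Calling `find_references()` on a local checkout of the puppet or deployment-charts repository would give a first list of candidates to review by hand; it only finds literal string matches, so paths built dynamically would be missed.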

The following services make reference to Puppet_Internal_CA.crt

Details

Related Gerrit Patches:
operations/deployment-charts : master : eventgate-logging values: update puppet CA
operations/puppet : production : profile::base: reorder class

Event Timeline

jbond triaged this task as Medium priority. Nov 4 2019, 2:36 PM
jbond created this task.

base::expose_puppet_certs: this can be ignored, as it relates to the client key pairs and not the CA public certificate, which is the file that is changing. Further, I think keys created using cergen can also be largely ignored. We don't much care about certs that have already been generated; we just need to ensure that anything validating these certificates also has the new public CA, i.e. references to {/usr/local/share/ca-certificates/,/etc/ssl/certs/}Puppet_Internal_CA.crt
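One way to confirm that a host has the new public CA in both of those locations is to compare certificate fingerprints. The following is a minimal sketch, assuming each file holds a single PEM certificate and that the expected SHA-256 fingerprint of the new CA is known from elsewhere; the helper names are hypothetical.

```python
import hashlib
import ssl

# The two locations named in the comment above.
CA_PATHS = (
    "/usr/local/share/ca-certificates/Puppet_Internal_CA.crt",
    "/etc/ssl/certs/Puppet_Internal_CA.crt",
)


def ca_fingerprint(path):
    """Return the SHA-256 hex digest of the DER form of a single-certificate PEM file."""
    with open(path, encoding="ascii") as fh:
        pem = fh.read().strip()
    # Assumption: exactly one certificate per file (not a bundle).
    der = ssl.PEM_cert_to_DER_cert(pem)
    return hashlib.sha256(der).hexdigest()


def matches_expected(paths, expected_fingerprint):
    """Map each path to True if its certificate matches the expected fingerprint."""
    return {path: ca_fingerprint(path) == expected_fingerprint for path in paths}
```

Running `matches_expected(CA_PATHS, expected)` on a host reports which copies have been updated. A service that loaded the old CA at startup still needs a restart even when the file on disk is current.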

jbond updated the task description. Nov 4 2019, 3:58 PM
Krenair added a subscriber: Krenair. Nov 4 2019, 6:30 PM

Acme-chief nginx config probably?

Joe added a comment. Nov 5 2019, 9:49 AM

As far as etcd is concerned, a rolling restart should be enough to ensure the new CA is picked up. I will take care of that.

jbond added a comment. Nov 5 2019, 9:59 AM

As far as etcd is concerned, a rolling restart should be enough to ensure the new CA is picked up. I will take care of that.

Thanks, I'll ping T237362 when the CR is merged.

jbond updated the task description. Nov 5 2019, 9:59 AM

Change 548706 had a related patch set uploaded (by Jbond; owner: John Bond):
[operations/puppet@production] profile::base: reorder class

https://gerrit.wikimedia.org/r/548706

jcrespo moved this task from Triage to Backlog on the DBA board. Nov 5 2019, 3:55 PM
jbond moved this task from Unsorted 💣 to Active 🚁 on the User-jbond board. Nov 6 2019, 3:06 PM
crusnov added a subscriber: crusnov. Nov 6 2019, 5:27 PM
jbond added a subscriber: Eevans. Nov 6 2019, 5:31 PM

@Eevans moritz mentioned there may be some Cassandra considerations to take into account, and that you could enlighten me as to what they are :)

CDanis updated the task description. Nov 6 2019, 6:09 PM
CDanis updated the task description.
Joe added a comment. Nov 7 2019, 9:48 AM

@Eevans moritz mentioned there may be some Cassandra considerations to take into account, and that you could enlighten me as to what they are :)

Cassandra uses its own CA, which is per-installation. It has no relationship with the main CA we use for puppet. One could say we should maybe check the expiration of those CAs as well, but I think it's already covered by monitoring.

Change 548706 merged by Jbond:
[operations/puppet@production] profile::base: reorder class

https://gerrit.wikimedia.org/r/548706

Joe added a comment. Nov 7 2019, 10:16 AM

The calico/node service and the kube-controller-manager service will need to be restarted on the Kubernetes workers and masters, respectively.

Ottomata updated the task description. Mon, Nov 18, 7:04 PM
Ottomata updated the task description.
Ottomata updated the task description. Mon, Nov 18, 7:06 PM
Ottomata updated the task description.
jbond updated the task description. Wed, Dec 4, 6:04 PM

Change 554584 had a related patch set uploaded (by Jbond; owner: John Bond):
[operations/deployment-charts@master] eventgate-logging values: update puppet CA

https://gerrit.wikimedia.org/r/554584

Change 554584 merged by Ottomata:
[operations/deployment-charts@master] eventgate-logging values: update puppet CA

https://gerrit.wikimedia.org/r/554584

jbond updated the task description. Tue, Dec 10, 12:13 PM
jbond updated the task description. Tue, Dec 10, 12:16 PM
jbond updated the task description. Tue, Dec 10, 12:35 PM
jbond updated the task description. Tue, Dec 10, 12:50 PM
jbond updated the task description. Tue, Dec 10, 12:56 PM
jbond updated the task description.
jbond removed a project: DBA.
jbond updated the task description. Edited Tue, Dec 10, 1:28 PM
  • trafficserver restarted on all cache-ats nodes
  • puppetdb and postgresql restarted on puppetdb[12]002
  • postgresql restarted on netbox[12]001
  • stunnel4 service restarted on all services using rsync wrapped with stunnel
  • rsyslog restarted everywhere that has kafka shipping
  • uwsgi-netbox restarted on netbox servers; other references are from systemd timers and need no action
  • varnishkafka-webrequest and varnishkafka-eventlogging restarted on affected servers
  • no restart required
    • mysqld_exporter_config.py is run by puppet
    • backup_mariadb.py is run by cron
    • check_mariadb.py is run by icinga
    • debmonitor::client is run either as an apt hook or via cron
jbond updated the task description. Tue, Dec 10, 2:11 PM
jbond added a comment. Tue, Dec 10, 2:38 PM

I have now updated the relevant CA files in the private repo