
Replace tools-puppetmaster-01 (jessie) with a buster puppetmaster
Closed, Resolved (Public)

Description

Will be required for use with the new puppetdb instance, and will be needed anyway given T236565: "tools" Cloud VPS project jessie deprecation

Event Timeline

Created tools-puppetmaster-02 in the puppetmaster security group and set up with role::puppetmaster::standalone (ensuring autosigning is off) - will need to copy files (CA, operations/puppet.git cherry-picks, labs/private.git cherry-picks, /root and /home) across before changing any hieradata to use the new instance.

Unmounted NFS stuff, copied /root (to /root/tools-puppetmaster-01-root.tgz), /home (to /root/tools-puppetmaster-01-homes.tgz), labs/private.git cherry-picks, and CA (and deleted the temporary local copy from my laptop). Turns out it has no operations/puppet.git cherry-picks (yay).
Edit: And did some faffing around to get the new master a cert of its own signed by the CA properly
Anything else that needs to be done before picking a guinea pig instance to try moving to it?
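For reference, the "copy a directory into a tarball" step described above could be sketched roughly as follows. This is only an illustration using Python's standard `tarfile` module, not the commands that were actually run; the function name `archive_directory` is hypothetical, and the paths in the usage comment are simply the ones mentioned in this comment.

```python
import tarfile
from pathlib import Path


def archive_directory(source_dir, archive_path):
    """Pack source_dir into a gzip-compressed tarball and return its member names.

    Hypothetical helper for illustration only.
    """
    source = Path(source_dir)
    # Store entries relative to the directory name (e.g. "root/...")
    # rather than with the full absolute path.
    with tarfile.open(str(archive_path), "w:gz") as tar:
        tar.add(str(source), arcname=source.name)
    # Re-open the archive to confirm that it is readable before relying on it.
    with tarfile.open(str(archive_path), "r:gz") as tar:
        return tar.getnames()


# Example (assumed invocation, run as root on the old puppetmaster):
#   archive_directory("/root", "/tmp/tools-puppetmaster-01-root.tgz")
#   archive_directory("/home", "/tmp/tools-puppetmaster-01-homes.tgz")
```

Reading the archive back before deleting any temporary copy is a cheap sanity check that the tarball is not truncated.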

That sounds like everything I can think of. tools-package-builder-01 sounds like a great guinea pig instance to me. Thanks!!

I tried it on tools-package-builder-02 and nothing changed. How do we want to roll this out: just update project hieradata and run puppet a couple of times everywhere? Or choose a few instance name prefixes and just move those over to start with?

Since this is set on the project overall, it's probably just as well to do it that way. However, we should test one gridengine node first. Puppet may fail because it is still using the old hba mechanism, which clashes with the other way of collecting host keys from production. Maybe let's check a grid node and just put it on project puppet.

Mentioned in SAL (#wikimedia-cloud) [2020-02-19T15:36:52Z] <bstorm_> setting 'puppetmaster: tools-puppetmaster-02.tools.eqiad.wmflabs' on tools-sgeexec-0942 to test new puppetmaster on grid T245365

Oh! It doesn't seem to have storeconfigs or puppetdb enabled yet. In that case, yeah, let's just enable it across the project :)

Mentioned in SAL (#wikimedia-cloud) [2020-02-19T22:05:02Z] <Krenair> Project-wide hiera change to swap puppetmaster to tools-puppetmaster-02 T245365

Mentioned in SAL (#wikimedia-cloud) [2020-02-20T00:04:40Z] <Krenair> Shut off tools-puppetmaster-01 - to be deleted in one week T245365

This happened uneventfully as far as I can tell. The only change I saw (I supervised it on a few random hosts) I traced back to a difference in how stretch/buster's build of ruby serialises YAML: somewhere between ruby 2.1.5p273 (2014-11-13) [x86_64-linux-gnu] and ruby 2.3.3p222 (2016-11-21) [x86_64-linux-gnu] it started adding quotes around some strings.
Also cleaned up some other random things I found lying around - puppet being disabled on -prometheus-04, old role::puppet::self configs on -elastic-0[34], and the old broken root ssh userkeys directory thing also on -elastic-0[34].
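When reviewing puppet diffs after a change like this, it can help to separate quoting-only YAML changes from real ones. Below is a minimal sketch of such a check. It is a purely textual heuristic on simple `key: value` lines, not a YAML parser, and the function names and sample lines are hypothetical (the task does not say which strings gained quotes).

```python
import re

# Matches simple "key: value" lines, optionally as a list item ("- key: value").
_KEY_VALUE = re.compile(r"^(?P<prefix>\s*(?:-\s+)?[\w.-]+:\s+)(?P<value>.*?)\s*$")


def strip_quotes(value):
    """Remove one pair of matching single or double quotes, if present."""
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
        return value[1:-1]
    return value


def quoting_only_change(old_line, new_line):
    """Return True if the two lines differ only by quotes around the value.

    Textual heuristic: in YAML, quoting can change a value's type
    (e.g. true vs 'true'), so a real comparison should parse both sides.
    """
    old = _KEY_VALUE.match(old_line)
    new = _KEY_VALUE.match(new_line)
    if old is None or new is None:
        return False
    if old["value"] == new["value"]:
        return False  # identical values: not a change at all
    return (
        old["prefix"] == new["prefix"]
        and strip_quotes(old["value"]) == strip_quotes(new["value"])
    )


# Hypothetical sample lines:
#   quoting_only_change("  ensure: present", "  ensure: 'present'")  is True
#   quoting_only_change("  ensure: present", "  ensure: absent")     is False
```

Lines that this check flags can be skimmed quickly, leaving attention for the changes that alter actual values.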

Also cleaned up some other random things I found lying around - [...] and the old broken root ssh userkeys directory thing also on -elastic-0[34].

Also tidied this up on other random hosts I found it on - mostly tools but some other random Cloud VPS things.

Also updated wikitech docs for 3 of the puppetmaster migrations I've done recently including this one.

Will leave this open until we've deleted the old tools-puppetmaster on Wednesday/Thursday.

bd808 triaged this task as High priority. (Feb 25 2020, 5:06 PM)
bd808 moved this task from Inbox to Doing on the cloud-services-team (Kanban) board.