|Resolved||jbond||T184561 Modernize Puppet Configuration Management (2017-18 Q3 Goal)|
|Resolved||jbond||T184564 Plan Puppet 5 upgrade|
|Resolved||jbond||T219803 upgrade facter and puppet across the fleet|
|Declined||jbond||T221343 puppet fails to run in cp1008 under certain conditions|
|Open||None||T222160 facter3: use structured facts|
|Resolved||jbond||T222326 cronspam from smart-data-dump due to facter bug|
|Resolved||jbond||T222356 facter3: Unable to parse routing table|
|Resolved||jbond||T223938 facter 3: add timeout to custom facts external calls|
|Resolved||jbond||T227587 upgrade puppet master servers|
|Resolved||jbond||T227779 Hiera incompatible with newer versions of puppet|
On paper, an upgrade to Puppet 5 is very similar in process and approach to our Puppet 4 upgrade, where we upgraded the Puppet masters first, then PuppetDB. The plan looks like:
- Address deprecation warnings from Puppet 4 masters. This is already ongoing, but it will be a big help to have it done before formally beginning the Puppet 5 upgrade
- Testing/prep work performed outside production (lab instances, etc.) to hone the upgrade process for Puppet agents, masters and compilers
- Puppet 5 packages for supported OSes are backported/built/uploaded
- Puppet master “Puppetization” is updated to support Puppet 5 masters that happily co-exist with Puppet 4 masters
- Puppet compiler “Puppetization” is updated to support Puppet 5 compilers that happily co-exist with Puppet 4 compilers
- A single production Puppet master (e.g. rhodium) is depooled and upgraded (via Puppet)
- Catalog compilation/diff loop begins. This is expected to be a large portion of the work, as we have many site-specific customizations that must be verified to work correctly under the new version
- Mass catalog compilation and diff is performed between v4 and v5 masters (server elnath is pre-configured to perform this type of catalog diffing)
- Issues are identified and fixed (non-backwards-compatible patches are applied locally, and only if necessary; backwards compatibility is strongly preferred)
- Refresh facts, refresh host list, and repeat until all catalogs compile and any changes are understood/explained (during the previous upgrade there were minor changes applied, e.g. randomized cron times)
- One Puppet compiler (e.g. compiler03) is de-pooled in Jenkins and upgraded to Puppet 5
- One physical site (e.g. codfw) is de-pooled (via DNS) and upgraded to Puppet 5
- Non-backwards-compatible code changes (if any) are applied locally to upgraded masters/compiler
- Hosts are cut over (via Puppet) to the upgraded site canary style (small number from each role)
- Puppet service is failed over (de-pool live site, re-pool upgraded site) for production cut-over to Puppet 5 master
- The upgraded compiler (compiler03) is re-pooled, and the other compiler (compiler02) is de-pooled in Jenkins
- Non-backwards-compatible code changes (if any) are merged
- Ensure that the Puppet 5 masters have been stable for at least 24 hours before proceeding (to preserve a rollback path to the Puppet 4 masters).
- Upgrade remaining Puppet 4 masters and compiler to Puppet 5 and re-pool
- Upgrade Puppet agents in groups according to site and OS version
- Puppet 5 - A Puppet 5.4 package exists for Buster. In time Puppet 5 should become available in stretch-backports at which point we could work on a backport for jessie (and trusty?). This is similar to the approach we took when upgrading to Puppet 4 with the addition of backporting to jessie ourselves.
- PuppetDB 5 - The Debian PuppetDB package is still experimental and it’s unclear if/when this will change (also this package is version 4.4). Puppetlabs has a PuppetDB 5.2 package for Stretch and, although the Puppetlabs packages leave a lot to be desired, we have successfully used (slightly customized) Puppetlabs packages for PuppetDB v2 and v4. So this should be a viable option for us when the time comes to upgrade PuppetDB as well.
- Facter - A Facter upgrade is not required for Puppet 5. Facter 2 is supported.
- Hiera - The Puppet docs list hiera >= 3.2.1 as a requirement for Puppet 5. However, basic testing of Puppet 5 with the available Debian version 3.2.0 works (fwiw 3.2.0 is the version in buster and sid as well). This seems like an oddly specific version requirement, but I'm including it just in case it bites us.
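To make the Hiera version gap concrete, here is a minimal sketch that compares the Debian-packaged version against the documented minimum using Ruby's standard `Gem::Version`. The helper name `hiera_meets_requirement?` is made up for illustration; only the two version numbers come from the note above.

```ruby
require 'rubygems' # provides Gem::Version (loaded by default in modern Ruby)

# Minimum Hiera version listed in the Puppet docs for Puppet 5.
REQUIRED_HIERA = Gem::Version.new('3.2.1')

# Hypothetical helper: true when `installed` satisfies the documented minimum.
def hiera_meets_requirement?(installed)
  Gem::Version.new(installed) >= REQUIRED_HIERA
end

puts hiera_meets_requirement?('3.2.0') # Debian's packaged version => false
puts hiera_meets_requirement?('3.2.1') # documented minimum        => true
```

The check only confirms that 3.2.0 falls short of the documented requirement by one patch release; it says nothing about whether that difference matters in practice, which is what the basic testing above suggests it does not.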
Note: this does not consider support for servermon as I understand it will be retired in the near future in favor of puppetboard.
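The "canary style" cut-over in the plan (a small number of hosts from each role) could be selected along these lines. This is only a sketch: the host names, role names, and the `pick_canaries` helper are hypothetical, and a real inventory would presumably come from PuppetDB rather than a literal hash.

```ruby
# Hypothetical inventory: host name => role (names are made up).
HOSTS = {
  'web1002' => 'webserver',
  'web1001' => 'webserver',
  'db1001'  => 'database',
  'db1002'  => 'database',
  'mon1001' => 'monitoring'
}.freeze

# Pick up to `per_role` hosts from each role. Hosts are sorted first so the
# selection is deterministic and can be repeated across runs.
def pick_canaries(hosts_by_name, per_role: 1)
  hosts_by_name
    .group_by { |_host, role| role }
    .transform_values { |pairs| pairs.map(&:first).sort.first(per_role) }
end

pick_canaries(HOSTS, per_role: 1).each do |role, hosts|
  puts "#{role}: #{hosts.join(', ')}"
end
```

Sorting before taking the first few hosts is a deliberate simplification; picking at random would spread risk differently but makes reruns harder to compare.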
In terms of code, what would the changes required be? What are these deprecation warnings that you mentioned above? Are we tracking fixes for these somewhere and are we making sure new ones don't crop up?
I don't have a list of code changes offhand. Going through the process of compiling/diffing all hosts against a Puppet 5 master is a really good indicator of what will break and require changes. I've built a Puppet 5 compiler in labs today and will try to gather some more specifics.
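As a rough illustration of what the compile/diff step reports, the sketch below compares two catalogs for one host. It is not the tooling we actually run: real catalogs are larger JSON documents, and here each catalog is simplified (as an assumption) to a hash of resource reference => parameters. The resource names and the `diff_catalogs` helper are hypothetical.

```ruby
# Compare two simplified catalogs, each of the form
#   { "Type[title]" => { "param" => "value" } }
# and report which resources were removed, added, or changed.
def diff_catalogs(old_catalog, new_catalog)
  old_refs = old_catalog.keys
  new_refs = new_catalog.keys
  changed = (old_refs & new_refs).reject do |ref|
    old_catalog[ref] == new_catalog[ref]
  end
  {
    removed: (old_refs - new_refs).sort,
    added: (new_refs - old_refs).sort,
    changed: changed.sort
  }
end

# Made-up catalogs for a single host, compiled by a v4 and a v5 master.
v4 = {
  'File[/etc/motd]' => { 'mode' => '0644' },
  'Cron[cleanup]'   => { 'minute' => '12' },
  'Package[foo]'    => { 'ensure' => 'present' }
}
v5 = {
  'File[/etc/motd]' => { 'mode' => '0644' },
  'Cron[cleanup]'   => { 'minute' => '47' },
  'Service[bar]'    => { 'ensure' => 'running' }
}

p diff_catalogs(v4, v5)
```

A changed cron minute like the one here is the kind of difference that needs to be understood and explained rather than necessarily fixed.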
What are these deprecation warnings that you mentioned above?
Details are logged in the Puppet compiler error output and Puppet master logs. Here's a list of the warnings: P6943
Are we tracking fixes for these somewhere and are we making sure new ones don't crop up?
We currently have PuppetSyntax.fail_on_deprecation_notices = false. It looks like work was done in T154915 to enable this, but we should re-evaluate now that we've upgraded to Puppet 4. I think we would want to knock down the existing warnings before enabling it, to avoid making a habit of ignoring Jenkins.
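For reference, flipping the setting would look roughly like the Rakefile fragment below. This is a config sketch, not our actual Rakefile: it assumes the puppet-syntax gem is installed and that its rake tasks are loaded the usual way.

```ruby
# Rakefile fragment (assumes the puppet-syntax gem is available).
require 'puppet-syntax/tasks/puppet-syntax'

# Currently set to false; setting it to true makes the syntax check fail
# when Puppet emits deprecation notices, instead of only logging them.
PuppetSyntax.fail_on_deprecation_notices = true
```

With this enabled, any remaining deprecation notice would fail the syntax job, which is why the existing warnings need to be cleared first.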