
Puppet cleanup around OpenStack
Closed, Resolved (Public)

Description

Following the discussion at this morning's retro, we resolved to try to clean up and perhaps simplify the openstack puppet system we have in place.

Event Timeline

Bstorm triaged this task as Medium priority.
Bstorm created this task.

I'm leaving this more or less for @aborrero and @Andrew to edit :)

I'm unassigning myself because I'm not quite sure what this means :)

The deployment-specific profiles in the openstack setup are a mess, but otherwise I think I'm not all that dissatisfied with our puppet code. Chase rewrote all of them not very long ago, so I'm skeptical that another rewrite would help much. I'm open to airing of specific grievances though!

Quick summary without further research:

  • The first one, which bites us from time to time, is the clientpackages stuff. The problem there is that it is used by every VM and also by HW servers, so a refactor there should be done with extra care. I already tried (and reverted) a couple of times :-P I will try again soon.
  • The second one is hiera inconsistencies: deployment-specific profiles using :profile::openstack:::base:: hiera keys, etc. The physical path in the puppet tree in which they live (usually hieradata/*/profile/*) also caused some issues in the past.
  • Lack of guidelines or rules. We (mostly me) constantly introduce inconsistencies because we don't have clear rules on how to do certain things. A small and quick example of this is templates living in modules/openstack/templates/$openstack/$component vs puppet classes living in modules/openstack/manifests/$component/$openstack. The two hierarchies order their levels differently, which leads to [human] errors.
  • There are several files and templates namespaced to a concrete openstack version (modules/{templates,files}/mitaka/*), but they probably don't require this. Bootstrapping another openstack release means duplicating a lot of lines of code.
  • We can't reimage some server roles in one or two puppet agent runs. This has probably improved recently, but last time I checked we required something like 3 or 4 puppet agent runs before puppet could set up everything. Several missing relationships, etc.
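
To make the layout mismatch from the third point concrete, here is a minimal sketch in Python. The helper names `template_dir` and `manifest_dir` are hypothetical and exist only for illustration; they are not part of the puppet repository. The example uses `mitaka` as the release and `nova` as the component and simply builds the two paths as described above.

```python
from pathlib import PurePosixPath

# Root of the openstack puppet module, as named in the paths quoted above.
MODULE_ROOT = PurePosixPath("modules/openstack")


def template_dir(version: str, component: str) -> PurePosixPath:
    """Templates are laid out as templates/$openstack/$component."""
    return MODULE_ROOT / "templates" / version / component


def manifest_dir(version: str, component: str) -> PurePosixPath:
    """Classes are laid out as manifests/$component/$openstack (levels swapped)."""
    return MODULE_ROOT / "manifests" / component / version


if __name__ == "__main__":
    # Same release and component, but the two trees nest them in opposite order.
    print(template_dir("mitaka", "nova"))
    print(manifest_dir("mitaka", "nova"))
```

Anyone moving between the two trees has to remember to swap the last two path levels, which is the kind of slip a written guideline would prevent.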

The puppet code is working, but given how important the code is, I believe we should address all these issues (even if they are small).

Change 506135 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] openstack: clientpackages: split profile for VM instances

https://gerrit.wikimedia.org/r/506135

Change 506135 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] openstack: clientpackages: split profile for VM instances

https://gerrit.wikimedia.org/r/506135

Change 506146 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] openstack: clientpackages: vms: fix assert for realm

https://gerrit.wikimedia.org/r/506146

Mentioned in SAL (#wikimedia-cloud) [2019-04-24T12:54:29Z] <arturo> T220051 puppet broken in every VM in Cloud VPS, fixing right now

Change 506146 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] openstack: clientpackages: vms: fix assert for realm

https://gerrit.wikimedia.org/r/506146

Change 506952 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] openstack: clientpackages: decouple configuration

https://gerrit.wikimedia.org/r/506952

Change 506952 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] openstack: clientpackages: decouple configuration

https://gerrit.wikimedia.org/r/506952

Change 514445 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] openstack: remove unused common class

https://gerrit.wikimedia.org/r/514445

Change 514445 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] openstack: remove unused common class

https://gerrit.wikimedia.org/r/514445

Change 530332 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] openstack: cleanup nova-network version of nova

https://gerrit.wikimedia.org/r/530332

Change 530332 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] openstack: cleanup nova-network version of nova

https://gerrit.wikimedia.org/r/530332

I'm not actively working on this right now. Un-claiming it for now.

aborrero claimed this task.