Page MenuHomePhabricator

Improve early detection of hiera data type errors
Open, LowPublic

Description

At present we do not detect hiera yaml syntax errors or data type errors in CI or prior to patch submission. This gap has been a long source of frustration for users. Recently @brouberol ran into this issue and proposed this patch, which enables running our rspec compile test against our hiera data. The purpose of this task is to decide on our short term and long term approach to closing this gap and reducing this source of errors.

Possible routes:

  1. Run compile tests against our hiera data, patch
    1. Ensures our hiera data is consumable by our puppet code
    2. Ties our module specs to data outside of the module
  2. Run compile tests against in module data, i.e. move our data into the module, patch
    1. Ensures our hiera data is consumable by our puppet code
    2. Places our specific data in modules, making them less generic
  3. Generate JSON schemas based on our Puppet types and validate our hiera data using those schemas, patch
    1. Ensures our hiera data is consumable by our puppet code
    2. Allows us to keep our specific hiera data out of our modules
    3. At present creating the JSON schema is done manually, but it appears possible to generate the schemas programatically from the Puppet types

Related Objects

Event Timeline

Change 979469 had a related patch set uploaded (by JHathaway; author: JHathaway):

[operations/puppet@production] apt_repo: validate preseed data with a JSON Schema

https://gerrit.wikimedia.org/r/979469

Change 979470 had a related patch set uploaded (by JHathaway; author: Brouberol):

[operations/puppet@production] apt_repo: move hiera data into module, to allow for validation

https://gerrit.wikimedia.org/r/979470

Thank you Jesse for putting this all together based on my patch!

I think the 3rd route will bring a non-trivial amount of complexity: we'll need to keep the JSON schema and puppet types in sync, which we might forget to do, thus adding friction to the development experience. Also, the JSON schema and puppet types would play a similar role there, and as there are ways to catch these errors with puppet types only (as demonstrated by the routes 1 and 2), I'd rather we focus on picking one of these two.

I'm way less versed in Puppet than you, meaning I don't have a particular inclination between these.

Thanks again!

Change 979889 had a related patch set uploaded (by Brouberol; author: Brouberol):

[operations/puppet@production] Test breaking YAML syntax to see if it passes CI

https://gerrit.wikimedia.org/r/979889

A thought: what currently happens if we introduce a single change to a hiera YAML file turning it syntactically invalid? To test this, I opened https://gerrit.wikimedia.org/r/c/operations/puppet/+/979889 and found out that we do have yaml parsing checks in place 👍

Change 979889 abandoned by Brouberol:

[operations/puppet@production] Test breaking YAML syntax to see if it passes CI

Reason:

This was just an experiment

https://gerrit.wikimedia.org/r/979889

A thought: what currently happens if we introduce a single change to a hiera YAML file turning it syntactically invalid? To test this, I opened https://gerrit.wikimedia.org/r/c/operations/puppet/+/979889 and found out that we do have yaml parsing checks in place 👍

ah, I didn't realize that! That is good at least!

jhathaway renamed this task from Improve early detection of hiera yaml syntax or data type errors to Improve early detection of hiera data type errors.Dec 4 2023, 2:54 PM
jhathaway updated the task description. (Show Details)

Change 979469 abandoned by JHathaway:

[operations/puppet@production] apt_repo: validate preseed data with a JSON Schema

Reason:

proof of concept

https://gerrit.wikimedia.org/r/979469

@brouberol after feedback from @jbond, my inclination is to move the data into the module, i.e. option (2)

Change 979470 merged by JHathaway:

[operations/puppet@production] apt_repo: move hiera data into module, to allow for validation

https://gerrit.wikimedia.org/r/979470

jhathaway triaged this task as Low priority.