Page MenuHomePhabricator

grain-ensure erroneous mismatch with (bool)True vs (str)true
Closed, DeclinedPublic

Description

Our custom utility /usr/local/sbin/grain-ensure fails to recognize boolean values properly. An example chain is:

salt::grain { 'deployment_server':
    grain   => 'deployment_server',
    value   => true,
    replace => true,
}

Which makes puppet to forge an unless condition with:

/usr/local/sbin/grain-ensure contains deloyment_server true

true here is considered as being of type str. When grain-ensure look up the grain value via the salt python module, it is given a boolean true, that mismatch and thus the gran is always refreshed.

We need our script ./modules/salt/files/grain-ensure.py to normalize the values?

Example puppet run on deployment-tin.deployment-prep.eqiad.wmflabs show that all grains having a boolean values are constantly refreshed:

Debug: Exec[ensure_deployment_repo_group_wikidev](provider=posix): Executing check '/usr/local/sbin/grain-ensure contains deployment_repo_group wikidev'
Debug: Executing '/usr/local/sbin/grain-ensure contains deployment_repo_group wikidev'
Debug: Exec[ensure_php_hhvm](provider=posix): Executing check '/usr/local/sbin/grain-ensure contains php hhvm'
Debug: Executing '/usr/local/sbin/grain-ensure contains php hhvm'
Debug: Exec[ensure_labsproject_deployment-prep](provider=posix): Executing check '/usr/local/sbin/grain-ensure contains labsproject deployment-prep'
Debug: Executing '/usr/local/sbin/grain-ensure contains labsproject deployment-prep'
Debug: Exec[ensure_deployment_repo_user_trebuchet](provider=posix): Executing check '/usr/local/sbin/grain-ensure contains deployment_repo_user trebuchet'
Debug: Executing '/usr/local/sbin/grain-ensure contains deployment_repo_user trebuchet'
Debug: Executing '/bin/systemctl is-active salt-minion'
Debug: Exec[ensure_trebuchet_master_deployment-mira.deployment-prep.eqiad.wmflabs](provider=posix): Executing check '/usr/local/sbin/grain-ensure contains trebuchet_master deployment-mira.deployment-prep.eqiad.wmflabs'
Debug: Executing '/usr/local/sbin/grain-ensure contains trebuchet_master deployment-mira.deployment-prep.eqiad.wmflabs'
Debug: Exec[ensure_deployment_server_true](provider=posix): Executing check '/usr/local/sbin/grain-ensure contains deployment_server true'
Debug: Executing '/usr/local/sbin/grain-ensure contains deployment_server true'
Debug: Exec[ensure_deployment_server_true](provider=posix): Executing '/usr/local/sbin/grain-ensure set deployment_server true'
Debug: Executing '/usr/local/sbin/grain-ensure set deployment_server true'

Event Timeline

Looks like the main reason we have grain-ensure.py is to execute salt commands without a master (file_config = local). Nowadays the command supports --local which prevent it from reaching out to the salt master.

Demo
Stop the salt master and the command stall:

# salt-call -l debug grains.get deployment_server
[DEBUG   ] Reading configuration from /etc/salt/minion
[DEBUG   ] Configuration file path: /etc/salt/minion
[DEBUG   ] Reading configuration from /etc/salt/minion

^C

Pass --local and it works fine:

# salt-call --local -l debug grains.get deployment_server
[DEBUG   ] Reading configuration from /etc/salt/minion
[DEBUG   ] Configuration file path: /etc/salt/minion
[DEBUG   ] Reading configuration from /etc/salt/minion
[DEBUG   ] Mako not available
[DEBUG   ] Reading configuration from /etc/salt/minion
[DEBUG   ] Reading configuration from /etc/salt/minion
[DEBUG   ] Reading configuration from /etc/salt/minion
[DEBUG   ] Reading configuration from /etc/salt/minion
local:
    True

Looked again at this one, the root cause is salt grains.set uses yaml to save the grain value and that is processed via YAML. Hence the string true becomes a boolean true in YAML. Later on grain-ensure tries to compare a string with a boolean which is a mismatch.

Change 348928 had a related patch set uploaded (by Hashar):
[operations/puppet@production] salt: fix grain-ensure comparison

https://gerrit.wikimedia.org/r/348928

https://gerrit.wikimedia.org/r/#/c/348928/ changes grain-ensure so it normalizes the value given to the command line the same way salt does. I have cherry picked it on the beta puppet master and that seems to have fixed it on deployment-tin.

MoritzMuehlenhoff subscribed.

Salt is being removed.

Change 348928 abandoned by Hashar:
salt: fix grain-ensure comparison

Reason:
Salt is gone

https://gerrit.wikimedia.org/r/348928