We are getting a lot of cron spam from the "smart-data-dump" cron job.
The script is ./modules/smart/files/smart-data-dump; it was added in T86552.
Only cp* hosts were affected at first, and only they are affected now.
The root cause is a bug in facter, which the script calls in:
raw_output = subprocess.check_output(['/usr/bin/facter', '--puppet', '--json', fact_name])
When running facter with -d to get debug output, it can be seen that it runs ip route show and tries to parse the output:
DEBUG leatherman.execution:93 - executing command: /sbin/ip route show
It then throws a lot of warnings like:
[cp3041:~] $ sudo facter --puppet --json raid | grep WARN
WARN puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '2620:0:861:107:10:64:48:101 via fe80::1 dev eno1 metric 1024 mtu lock 1450 pref medium'
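One version-independent way to keep these warnings out of cron mail would be to capture facter's stderr in the script and drop only the known routing-table lines. The following is a minimal sketch, not the script's current code: it assumes the warnings are written to stderr, and the names ROUTE_WARNING, drop_route_warnings and get_fact_json are hypothetical.

```python
import subprocess
import sys

# Marker text copied from the warning shown above.
ROUTE_WARNING = 'Could not process routing table entry'


def drop_route_warnings(stderr_text):
    """Return stderr_text without the known routing-table warning lines."""
    kept = [line for line in stderr_text.splitlines()
            if ROUTE_WARNING not in line]
    return '\n'.join(kept)


def get_fact_json(fact_name):
    """Hypothetical wrapper: run facter and re-emit only unexpected stderr."""
    proc = subprocess.run(
        ['/usr/bin/facter', '--puppet', '--json', fact_name],
        stdout=subprocess.PIPE, stderr=subprocess.PIPE,
        universal_newlines=True, check=True)
    remaining = drop_route_warnings(proc.stderr)
    if remaining:
        # Anything that is not the known warning still reaches cron.
        print(remaining, file=sys.stderr)
    return proc.stdout
```

Other warnings and errors would still be printed, so real problems would not be hidden along with the routing-table noise.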
This is very similar, but not identical, to the upstream bug https://tickets.puppetlabs.com/browse/FACT-1394
In that bug the parsing fails when there is "linkdown" in the ip route output, but that isn't the case for us.
An attempt was made to add -l error to the facter command, setting the log level to error so that the warnings are suppressed and the cron spam stops:
https://gerrit.wikimedia.org/r/c/operations/puppet/+/507634
While this worked fine on the cp* servers displaying the issue, after merging it caused even more, and new, cron spam on non-cp* hosts.
The reason was differing facter versions. For some reason cp* hosts appear to have facter 3.x while most other hosts have facter 2.x, even when both are on stretch.
In facter 2.x the "-l error" option does not exist, which led to the new spam from these hosts.
So that change was reverted and we are now back to the original state: cp* hosts are affected but others are not.
The cp* hosts also show the same warnings on the console on each puppet run.
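The version mismatch described above suggests that "-l error" could be added only where facter supports it. A minimal sketch, assuming that `facter --version` prints a dotted version string whose first component is the major version; the function names and the example version strings in the comments are illustrative and not taken from the existing script:

```python
import subprocess

FACTER = '/usr/bin/facter'


def facter_major_version(version_output):
    """Parse the major version from version output such as '2.4.6'."""
    return int(version_output.strip().split('.')[0])


def build_facter_command(fact_name, major_version):
    """Build the facter call, adding '-l error' only for facter 3.x and later."""
    cmd = [FACTER, '--puppet', '--json']
    if major_version >= 3:
        # Per the report above, facter 2.x does not have this option.
        cmd += ['-l', 'error']
    cmd.append(fact_name)
    return cmd


def detect_major_version():
    """Ask the installed facter for its version (not run at import time)."""
    out = subprocess.check_output([FACTER, '--version'])
    return facter_major_version(out.decode('utf-8'))
```

The script would then call subprocess.check_output(build_facter_command(fact_name, detect_major_version())), which leaves the command on facter 2.x hosts unchanged.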