Since Jan 19th at 17:00 UTC, Puppet has been failing on integration instances:
[17:07:41] <shinken-wm> PROBLEM - Puppet run on integration-slave-trusty-1011 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[17:12:00] <shinken-wm> PROBLEM - Puppet run on integration-slave-precise-1012 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[17:12:42] <shinken-wm> PROBLEM - Puppet run on integration-slave-precise-1011 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[17:15:50] <shinken-wm> PROBLEM - Puppet run on integration-slave-jessie-1001 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[17:16:04] <shinken-wm> PROBLEM - Puppet run on integration-slave-precise-1002 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[17:16:04] <shinken-wm> PROBLEM - Puppet run on integration-slave-trusty-1003 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[17:18:50] <shinken-wm> PROBLEM - Puppet run on integration-slave-jessie-1002 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[17:19:45] <shinken-wm> PROBLEM - Puppet run on integration-slave-trusty-1001 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[17:24:25] <shinken-wm> PROBLEM - Puppet run on integration-slave-trusty-1004 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
On integration-slave-jessie-1001.integration.eqiad.wmflabs.:
Notice: /Stage[main]/Role::Labs::Nfsclient/Labstore::Nfs_mount[home-on-labstoresvc]/Exec[cleanup-/home] returns: umount: /home: not mounted
Error: /usr/local/sbin/nfs-mount-manager umount /home returned 32 instead of one of [0]
Error: /Stage[main]/Role::Labs::Nfsclient/Labstore::Nfs_mount[home-on-labstoresvc]/Exec[cleanup-/home]/returns: change from notrun to 0 failed: /usr/local/sbin/nfs-mount-manager umount /home returned 32 instead of one of [0]
Notice: /Stage[main]/Role::Labs::Nfsclient/Labstore::Nfs_mount[home-on-labstoresvc]/Mount[/home]: Dependency Exec[cleanup-/home] has failures: true
Note that https://wikitech.wikimedia.org/wiki/Hiera:Integration has:
nfs_mounts:
  project: false
  home: false
  scratch: false
  dumps: false
The puppetmaster had a stale puppet.git repo, and I rebased it shortly before this started happening. I don't know which change in Puppet triggered it, but it seems to me that Exec[cleanup-/home] should be skipped when /home is not an NFS mount?
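
To make the proposed guard concrete, here is a minimal sketch of the check the cleanup step could perform before attempting an unmount. It is not how nfs-mount-manager or the Puppet manifest actually works; the function name `is_nfs_mounted` is hypothetical, and the input is assumed to follow the usual /proc/mounts layout (device, mount point, filesystem type, options).

```python
def is_nfs_mounted(path, mounts_text):
    """Return True if `path` is listed in mounts_text as an NFS mount point.

    Hypothetical helper: mounts_text is assumed to use the /proc/mounts
    layout, i.e. "device mount_point fs_type options dump pass" per line.
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        _device, mount_point, fs_type = fields[:3]
        # "nfs" and "nfs4" both count as NFS here.
        if mount_point == path and fs_type.startswith("nfs"):
            return True
    return False


# Illustrative mount table with no NFS /home, as on the failing instances.
sample_mounts = "/dev/vda1 / ext4 rw,relatime 0 0\n"

if is_nfs_mounted("/home", sample_mounts):
    print("would run: nfs-mount-manager umount /home")
else:
    print("skipping cleanup: /home is not an NFS mount")
```

With a guard like this, the cleanup would be a no-op on instances where /home is not an NFS mount, instead of failing with exit code 32.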