
Puppet broken on deployment-ms-be03
Closed, Duplicate, Public

Description

PROBLEM - Puppet errors on deployment-ms-be03 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]

The last Puppet run was at Thu Mar 22 12:18:13 UTC 2018 (5802 minutes ago).
maurelio@deployment-ms-be03:~$ puppet status
{
  "is_alive": true,
  "version": "4.8.2"
}

Event Timeline

Could you please run puppet to find out what the error is?

I ran puppet agent -tv, but I guess I did that in the wrong folder, as it started to create files under ~/maurelio/.puppet -- I shall find the right place to run this.

maurelio@deployment-ms-be03:~$ sudo puppet agent -tv
Info: Using configured environment 'future'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Error: Could not retrieve catalog from remote server: Error 500 on SERVER: Server Error: Evaluation Error: Error while evaluating a Resource Statement, Evaluation Error: Error while evaluating a Function Call, unable to init lv-a for swift at /etc/puppet/modules/swift/manifests/init_device.pp:3:9 at /etc/puppet/modules/role/manifests/swift/storage.pp:23 on node deployment-ms-be03.deployment-prep.eqiad.wmflabs
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run

@MarcoAurelio if you look at the code:

define swift::init_device($partition_nr='1') {
    if ($title !~ /^\/dev\/([hvs]d[a-z]+|md[0-9]+)$/) {
        fail("unable to init ${title} for swift")
    }

This means you have called swift::init_device with the title lv-a, which doesn't work because this resource needs a device path as its title.
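To make the failure concrete, here is a minimal sketch that applies the same pattern to a few candidate titles. It is written in Python rather than Puppet purely for illustration; the helper name is_accepted is hypothetical, and Python's anchors differ slightly from Ruby's, so treat it as an approximation of the check rather than the check itself.

```python
import re

# Same pattern as the swift::init_device check quoted above,
# translated from Puppet syntax (the escaped \/ becomes a plain /).
DEVICE_RE = re.compile(r"^/dev/([hvs]d[a-z]+|md[0-9]+)$")


def is_accepted(title: str) -> bool:
    """Hypothetical helper: True if the define would not call fail()."""
    return DEVICE_RE.match(title) is not None


# Candidate titles taken from this task.
for candidate in ["lv-a", "/dev/swift/lv-a1", "/dev/sdj", "/dev/vdb", "/dev/md72"]:
    verdict = "ok" if is_accepted(candidate) else "rejected"
    print(f"{candidate}: {verdict}")
```

With this pattern, both lv-a and /dev/swift/lv-a1 are rejected, while /dev/sdj, /dev/vdb and /dev/md72 pass.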

@Joe Thanks for having a look. I however don't know what to do to fix this. Any tips? Thanks!

The title for swift::init_device comes from a hiera lookup (hiera key swift_storage_drives). The OpenStack browser shows this key is set to the value 'lv-a' on deployment-ms-be03.deployment-prep.eqiad.wmflabs. I couldn't find the key in git or on wikitech, so by process of elimination it has to live somewhere in Horizon. That value needs to change: by the regex above it should be '/dev/sdj' or '/dev/vdb' or '/dev/md72' or anything like that. Judging from profile::swift::storage::labs, the logical volume is created at /dev/vd/lv-a1 and symlinked from /dev/swift/lv-a1. So as far as I can see, we need to:

  • Change the hiera in horizon to use /dev/swift/lv-a1
  • Change the regex quoted by Joe to allow /dev/swift/lv-a1 as a path

Related: T184236 and the attached patches.

I've been looking into https://horizon.wikimedia.org/project/puppet/ and apparently all I can do there is view the settings, not change them. I am also unfamiliar with Puppet, and the missing docs make the job harder. I'm giving way to more experienced people.

Actually this is a duplicate. After https://gerrit.wikimedia.org/r/#/c/361648/ the "/dev/swift/" part will be implicit, as will the trailing "1", and after https://gerrit.wikimedia.org/r/#/c/402758/ lv-[a-z] will be a permitted value for the device name. Merging both should magically fix this host as well, with no hiera changes needed. Both patches are on T184236.
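As a rough model of what the two patches are described as doing, the sketch below validates a short device name and then builds the full path from it. This is an assumption based only on the summary above, not on the actual Puppet code in the Gerrit changes; the function name swift_partition_path and the combined pattern are hypothetical.

```python
import re

# ASSUMPTION: hypothetical model of the post-patch behaviour described above.
# The real Puppet code in the two Gerrit changes may differ.
# Short device names only: the old alternatives plus lv-[a-z].
NAME_RE = re.compile(r"^([hvs]d[a-z]+|md[0-9]+|lv-[a-z])$")


def swift_partition_path(name: str, partition_nr: str = "1") -> str:
    """Validate a short device name, then add the implicit prefix and suffix."""
    if NAME_RE.match(name) is None:
        raise ValueError(f"unable to init {name} for swift")
    # The "/dev/swift/" prefix and the trailing partition number are implicit.
    return f"/dev/swift/{name}{partition_nr}"


print(swift_partition_path("lv-a"))
```

Under these assumptions the existing hiera value lv-a resolves to /dev/swift/lv-a1, which is why no hiera change would be needed on this host.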