Puppet fails on Beta Cluster because Safe_service_restart is expecting lvs_services set though `has_lvs: false`
Closed, Resolved, Public

Description

jforrester@deployment-parsoid11:~$ sudo puppet agent --test --noop
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Error: Could not retrieve catalog from remote server: Error 500 on SERVER: Server Error: Evaluation Error: Error while evaluating a Resource Statement, Conftool::Scripts::Safe_service_restart[php7.2-fpm]:
  has no parameter named 'lvs_services'
  expects a value for parameter 'services' (file: /etc/puppet/modules/profile/manifests/mediawiki/php/restarts.pp, line: 38) on node deployment-parsoid11.deployment-prep.eqiad.wmflabs
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run

Event Timeline

Krenair added a subscriber: Krenair. Mar 7 2020, 4:17 AM

Reading the error more closely, it seems that Safe_service_restart is *not* expecting lvs_services; it's expecting services.

Change 577725 had a related patch set uploaded (by Alex Monk; owner: Alex Monk):
[operations/puppet@production] Fix incorrect name of safe_service_restart parameter

https://gerrit.wikimedia.org/r/577725
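
For readers without access to the patch, the shape of the fix implied by the error message and the patch title is a rename of the argument at the call site. The following is only a rough sketch, not the contents of the actual change: `$restart_services` is a hypothetical placeholder, and the other arguments passed in restarts.pp are omitted because they are not shown in this task.

```puppet
# Sketch only, not the real patch.
# Before (rejected by the catalog compiler, since the defined type has no
# parameter named 'lvs_services' and requires 'services'):
#
#   conftool::scripts::safe_service_restart { 'php7.2-fpm':
#     lvs_services => $restart_services,
#   }
#
# After: the same value passed under the parameter name the defined type
# actually declares. $restart_services is a hypothetical variable name.
conftool::scripts::safe_service_restart { 'php7.2-fpm':
  services => $restart_services,
}
```

Both lines of the original error (the unknown `lvs_services` and the missing `services`) come from the same mistake, which is why a single rename would clear them together.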

Cherry-picked the above; your instance's Puppet catalog now compiles successfully. I've left a copy of the output of the first two runs at /root/first-successful-puppet-run-20200307 and /root/second-successful-puppet-run-20200307 in case you're interested in what it did.

Remaining error in there is about failing to start php7.2-fpm - it seems to be trying to allocate 4GB of RAM, in an instance that has that much total. Sounds like we need to tweak profile::mediawiki::apc_shm_size - or replace it with a bigger instance.

Change 577725 merged by Dzahn:
[operations/puppet@production] Fix incorrect name of safe_service_restart parameter

https://gerrit.wikimedia.org/r/577725

Dzahn added a comment. Mar 10 2020, 8:16 PM

> Remaining error in there is about failing to start php7.2-fpm - it seems to be trying to allocate 4GB of RAM, in an instance that has that much total. Sounds like we need to tweak profile::mediawiki::apc_shm_size - or replace it with a bigger instance.

Let's just set it to 128M first and see what happens. It's the default in a bunch of places, e.g.:

hieradata/regex.yaml: profile::mediawiki::apc_shm_size: 128M

and

hieradata/cloud/eqiad1/deployment-prep/common.yaml:profile::mediawiki::apc_shm_size: 128M

and others

Krenair closed this task as Resolved. Mar 10 2020, 9:04 PM
Krenair claimed this task.

@Jdforrester-WMF See above for pointers on fixing the remaining issue; this task as named is fixed now. Feel free to ask for help in another task, on IRC, or wherever if you have problems with the rest.