[Cloud VPS alert][cloudinfra-codfw1dev] Puppet failure on (
Closed, Resolved | Public


From email:

Date: Thu, 25 Nov 2021 08:15:05 +0000
From: root <>
Subject: [Cloud VPS alert][cloudinfra-codfw1dev] Puppet failure on (

Puppet is having issues on the " (" instance in project
cloudinfra-codfw1dev in Wikimedia Cloud VPS.

Puppet is running with failures.

Working Puppet runs are needed to maintain instance security and logins.
As long as Puppet continues to fail, this system is in danger of becoming

You are receiving this email because you are listed as member for the
project that contains this instance.  Please take steps to repair
this instance or contact a Cloud VPS admin for assistance.

If your host is expected to fail puppet runs and you want to disable this
alert, you can create a file under /.no-puppet-checks, that will skip the checks.

You might find some help here:

For further support, visit #wikimedia-cloud on or

Some extra info follows:
---- Last run summary:
  changes:
    total: 1
  events:
    failure: 1
    success: 1
    total: 2
  resources:
    changed: 1
    corrective_change: 1
    failed: 1
    failed_to_restart: 0
    out_of_sync: 2
    restarted: 0
    scheduled: 0
    skipped: 0
    total: 443
  time:
    augeas: 0.008286048
    catalog_application: 5.071433247998357
    config_retrieval: 2.4902771930210292
    convert_catalog: 0.3685375089989975
    exec: 0.12376264499999999
    fact_generation: 0.32436568301636726
    file: 2.551211253
    file_line: 0.002842567
    filebucket: 4.852e-05
    group: 0.000927392
    last_run: 1637826820
    node_retrieval: 0.2779838900314644
    notify: 0.004758526
    package: 0.559228629
    plugin_sync: 0.6879859289620072
    schedule: 0.000465333
    service: 0.9200921579999999
    tidy: 0.000304051
    total: 9.284562957
    transaction_evaluation: 4.9448587680235505
    user: 0.00102175
  version:
    config: '(1b14071c2e) Manuel Arostegui - db1128: Move it to test-s1'
    puppet: 5.5.22

---- Failed resources if any:

  * Service[systemd-timesyncd]

---- Exceptions that happened when running the script if any:
  No exceptions happened.

This has been failing for a while.
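
For reference, a rough sketch of how the counters in the summary above could be turned into a pass/fail decision. It assumes the summary has already been loaded into a dict, with the counters grouped under `resources` and `events` as in Puppet's `last_run_summary.yaml`. The function name `failure_reasons` is made up for illustration and this is not the actual alert script.

```python
from typing import Any, Dict, List


def failure_reasons(summary: Dict[str, Any]) -> List[str]:
    """Return human-readable reasons why a Puppet run counts as failed.

    `summary` is assumed to be the parsed last run summary. Missing
    sections or counters are treated as zero.
    """
    reasons: List[str] = []
    resources = summary.get("resources", {})
    events = summary.get("events", {})

    failed_resources = resources.get("failed", 0)
    if failed_resources > 0:
        reasons.append(f"{failed_resources} failed resource(s)")

    failed_events = events.get("failure", 0)
    if failed_events > 0:
        reasons.append(f"{failed_events} failure event(s)")

    return reasons


# Counters copied from the email above (only the ones the check reads).
summary = {
    "events": {"failure": 1, "success": 1, "total": 2},
    "resources": {"changed": 1, "failed": 1, "out_of_sync": 2, "total": 443},
}

for reason in failure_reasons(summary):
    print(reason)
```

An empty list means the run looks clean by these two counters. Which resource failed (here `Service[systemd-timesyncd]`) is not in the summary and would have to come from the full run report.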

Event Timeline

dcaro triaged this task as High priority. Thu, Nov 25, 9:22 AM
dcaro created this task.
dcaro moved this task from To refine to Today on the User-dcaro board.
dcaro changed the task status from Open to In Progress. Thu, Nov 25, 9:40 AM
dcaro moved this task from Today to Doing on the User-dcaro board.

Change 741849 had a related patch set uploaded (by David Caro; author: David Caro):

[operations/puppet@production] timesyncd: add package requirement

These machines are not supposed to have timesyncd present (they run ntpd to act as an NTP server). I see the hiera key

profile::systemd::timesyncd::ensure: absent

in Horizon. Is that not working for some reason?

So, that host uses the project internal puppetmaster:

root@ntp-1:~# puppet config --section agent print server
2021-11-25 10:32:03.546464 WARN  puppetlabs.facter - locale environment variables were bad; continuing with LANG=C LC_ALL=C

That server's enc seems not to report that value:

root@cloudinfra-internal-puppetmaster-01:~# puppet config --section master print external_nodes

root@cloudinfra-internal-puppetmaster-01:~# /usr/local/bin/puppet-enc
classes: ['role::wmcs::services::ntp']
parameters: {}
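
The check being done by eye here can be sketched as below: take the ENC output and look for the hiera key among its `parameters`. This is only an illustration. The helpers `parse_flow_enc` and `enc_has_parameter` are hypothetical, and the parser only handles one-line `key: value` output like the two lines above, where a real tool would use a proper YAML parser.

```python
import ast
from typing import Any, Dict


def parse_flow_enc(output: str) -> Dict[str, Any]:
    """Parse ENC output made of one-line `key: <literal>` entries.

    Only covers the simple flow-style form shown above, where each value
    is also a valid Python literal (a quoted-string list or a dict).
    """
    parsed: Dict[str, Any] = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(": ")
        parsed[key.strip()] = ast.literal_eval(value.strip())
    return parsed


def enc_has_parameter(enc: Dict[str, Any], key: str) -> bool:
    """Return True if the ENC output carries `key` under `parameters`."""
    return key in enc.get("parameters", {})


enc_output = "classes: ['role::wmcs::services::ntp']\nparameters: {}\n"
enc = parse_flow_enc(enc_output)
print(enc_has_parameter(enc, "profile::systemd::timesyncd::ensure"))
```

With the output above `parameters` is empty, so the lookup finds nothing, which matches the ENC not reporting the value.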

Just noticed it's using codfw1dev, so labtest, xd

Change 741849 merged by David Caro:

[operations/puppet@production] timesyncd: handle bullseye ntp hosts

Change 742107 had a related patch set uploaded (by David Caro; author: David Caro):

[operations/puppet@production] timsyncd: Flip the handling service condition

Change 742107 merged by David Caro:

[operations/puppet@production] timsyncd: Flip the handling service condition

dcaro moved this task from Doing to Done on the User-dcaro board.