
No Puppet resources found on instance deployment-docker-mobileapps02 on project deployment-prep
Closed, ResolvedPublic

Description

Common information

  • summary: No Puppet resources found on instance deployment-docker-mobileapps02 on project deployment-prep
  • alertname: PuppetAgentNoResources
  • instance: deployment-docker-mobileapps02
  • job: node
  • project: deployment-prep
  • severity: warning

Event Timeline

bd808 subscribed.
bd808@deployment-docker-mobileapps02:~$ sudo -i puppet agent -tv
Info: Using environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Info: Caching catalog for deployment-docker-mobileapps02.deployment-prep.eqiad1.wikimedia.cloud
Error: Failed to apply catalog: No space left on device @ io_write - /var/lib/puppet/client_data/catalog/deployment-docker-mobileapps02.deployment-prep.eqiad1.wikimedia.cloud.json20251112-2618934-4w3ub
bd808@deployment-docker-mobileapps02:~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
udev            3.9G     0  3.9G   0% /dev
tmpfs           796M   31M  765M   4% /run
/dev/sda1        20G   20G     0 100% /
tmpfs           3.9G     0  3.9G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
/dev/sda15      124M   12M  113M  10% /boot/efi
tmpfs           796M     0  796M   0% /run/user/0
tmpfs           796M     0  796M   0% /run/user/24346
tmpfs           796M     0  796M   0% /run/user/3518
bd808@deployment-docker-mobileapps02:/$ sudo du -sh * | sort -h | tail -5
du: cannot access 'proc/2619410/task/2619410/fd/4': No such file or directory
du: cannot access 'proc/2619410/task/2619410/fdinfo/4': No such file or directory
du: cannot access 'proc/2619410/fd/3': No such file or directory
du: cannot access 'proc/2619410/fdinfo/3': No such file or directory
33M     run
82M     boot
2.2G    usr
4.2G    home
14G     var
bd808@deployment-docker-mobileapps02:/var$ sudo du -sh * | sort -h | tail -5
120K    spool
2.6M    backups
365M    cache
4.4G    log
8.8G    lib
bd808@deployment-docker-mobileapps02:/var/log$ sudo du -sh * | sort -h | tail -5
14M     syslog.6.gz
312M    syslog.1
836M    syslog
1.3G    account
1.9G    journal
bd808@deployment-docker-mobileapps02:/var/lib$ sudo du -sh * | sort -h | tail -5
5.3M    puppet
14M     sss
29M     dpkg
157M    apt
8.6G    docker
root@deployment-docker-mobileapps02:/var/lib/docker# du -sh * | sort -h | tail -5
52K     network
92K     buildkit
2.4M    image
1.1G    overlay2
7.5G    containers

After drilling down a few more levels, I got to:

root@deployment-docker-mobileapps02:/var/lib/docker/containers/9f525917a53cde2e946978d06199eddcb39c2b3a5382bd6a20d869cb9e384efe# du -sh * | sort -h | tail -5
4.0K    hosts
4.0K    mounts
4.0K    resolv.conf
4.0K    resolv.conf.hash
7.5G    9f525917a53cde2e946978d06199eddcb39c2b3a5382bd6a20d869cb9e384efe-json.log

That is 7.5G of logs from a single Docker container.
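
The repeated `du` drill-down above can be collapsed into a single search. This is a minimal sketch, assuming GNU find; `largest_files` is a hypothetical helper name used only for illustration.

```shell
# Hypothetical helper: print the N largest regular files under a directory
# as "<bytes><TAB><path>", with the largest on the last line.
largest_files() {
    dir="${1:-/var/lib/docker/containers}"
    count="${2:-5}"
    # -xdev keeps find on one filesystem, so other mounts below the
    # starting directory are not descended into.
    find "$dir" -xdev -type f -printf '%s\t%p\n' 2>/dev/null \
        | sort -n \
        | tail -n "$count"
}

# Example (run as root): largest_files /var/lib/docker/containers 5
```

The sizes printed are file sizes in bytes, not the human-readable block usage that `du -sh` reports, so the numbers will not match the transcript exactly.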

Let's start by truncating the log:

root@deployment-docker-mobileapps02:/var/lib/docker/containers/9f525917a53cde2e946978d06199eddcb39c2b3a5382bd6a20d869cb9e384efe# truncate -s 0 9f525917a53cde2e946978d06199eddcb39c2b3a5382bd6a20d869cb9e384efe-json.log
root@deployment-docker-mobileapps02:/var/lib/docker/containers/9f525917a53cde2e946978d06199eddcb39c2b3a5382bd6a20d869cb9e384efe# df -h
Filesystem      Size  Used Avail Use% Mounted on
udev            3.9G     0  3.9G   0% /dev
tmpfs           796M  644K  795M   1% /run
/dev/sda1        20G   13G  6.7G  65% /
tmpfs           3.9G     0  3.9G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
/dev/sda15      124M   12M  113M  10% /boot/efi
tmpfs           796M     0  796M   0% /run/user/0
tmpfs           796M     0  796M   0% /run/user/24346
overlay          20G   13G  6.7G  65% /var/lib/docker/overlay2/f940e1999bcc005af42d28150a613480f6f92d4383ff4ac88cc1934d522da249/merged
tmpfs           796M     0  796M   0% /run/user/3518
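
Truncating in place, rather than deleting, matters here: dockerd keeps the log file open, and a deleted file that is still open does not release its blocks until the process closes it. The same step could be applied to every oversized container log at once. This is a sketch only, assuming GNU find and coreutils `truncate`; `truncate_big_logs` is a hypothetical name, and the call discards the log contents.

```shell
# Hypothetical helper: empty every *-json.log under a directory that is
# larger than the given size (GNU find -size syntax, e.g. 1G or 500M).
truncate_big_logs() {
    dir="$1"
    min_size="$2"
    # truncate -s 0 keeps the same inode, so a process holding the file
    # open keeps writing to the (now empty) file.
    find "$dir" -type f -name '*-json.log' -size "+$min_size" \
        -exec truncate -s 0 {} +
}

# Example (run as root): truncate_big_logs /var/lib/docker/containers 1G
```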
bd808 changed the task status from Open to In Progress. Wed, Nov 12, 11:33 PM
bd808 claimed this task.

The profile::docker::engine::settings hiera variable can be used to configure Docker to rotate logs, among other things. Deployment-prep does not currently have a global config to do that, but it should.

https://gerrit.wikimedia.org/r/plugins/gitiles/cloud/instance-puppet/+/8a1d22ecbf504f8b4fd1c6fd89847e70bab009fa%5E%21/#F0

diff --git a/deployment-prep/deployment-docker-mobileapps02.deployment-prep.eqiad1.wikimedia.cloud.yaml b/deployment-prep/deployment-docker-mobileapps02.deployment-prep.eqiad1.wikimedia.cloud.yaml
index b383826..e5a2f64 100644
--- a/deployment-prep/deployment-docker-mobileapps02.deployment-prep.eqiad1.wikimedia.cloud.yaml
+++ b/deployment-prep/deployment-docker-mobileapps02.deployment-prep.eqiad1.wikimedia.cloud.yaml

@@ -1,6 +1,10 @@
 profile::docker::engine::declare_service: true
 profile::docker::engine::packagename: docker.io
-profile::docker::engine::settings: {}
+profile::docker::engine::settings:
+  log-driver: json-file
+  log-opts:
+    max-file: 2
+    max-size: 50m
 profile::docker::engine::version: 18.09.1+dfsg1-7.1+deb10u2
 profile::docker::runner::service_defs:
   mediawiki-services-mobileapps:
root@deployment-docker-mobileapps02:~# puppet agent -tv
Info: Using environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Info: Caching catalog for deployment-docker-mobileapps02.deployment-prep.eqiad1.wikimedia.cloud
Info: Applying configuration version '(5d9f11b253) gitpuppet - puppetserver: Generalize git-rebase fix to work for labs/private'
Notice: /Stage[main]/Docker::Configuration/File[/etc/docker/daemon.json]/content:
--- /etc/docker/daemon.json     2024-05-21 11:46:08.372337600 +0000
+++ /tmp/puppet-file20251113-2620501-fr9hyt     2025-11-13 00:02:27.661783012 +0000
@@ -1,3 +1,8 @@
 {
+  "log-driver": "json-file",
+  "log-opts": {
+    "max-file": 2,
+    "max-size": "50m"
+  },
   "storage-driver": "overlay2"
 }

Notice: /Stage[main]/Docker::Configuration/File[/etc/docker/daemon.json]/content: content changed '{sha256}62acce55f885bdc2f634c83d3e3b14e9838f8115957d171aabf5574ab8c807a9' to '{sha256}ca36fda9316458ca1555de39c7a12356679db1f5266f18e94acc74b1f893b544'
Notice: Applied catalog in 8.04 seconds

Mentioned in SAL (#wikimedia-releng) [2025-11-13T00:04:14Z] <bd808> Reboot deployment-docker-mobileapps02 (T409979)

The 2 in the max-file value needs to be quoted, because Docker expects every log-opts value to be a string:

Nov 13 00:04:27 deployment-docker-mobileapps02 dockerd[581]: unable to configure the Docker daemon with file /etc/docker/daemon.json: json: cannot unmarshal number into Go struct field Config.log-opts of type string
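
A quick check before restarting the daemon could catch this class of mistake. This is a rough sketch, not a JSON validator: it only greps for the two keys used in this task, and `check_log_opts` is a hypothetical name.

```shell
# Hypothetical helper: fail if max-file or max-size in daemon.json has an
# unquoted numeric value. Heuristic only; it does not parse the JSON.
check_log_opts() {
    file="${1:-/etc/docker/daemon.json}"
    # A digit directly after the colon means the value is a bare number.
    if grep -nE '"max-(file|size)"[[:space:]]*:[[:space:]]*[0-9]' "$file"; then
        echo "unquoted log-opts value in $file" >&2
        return 1
    fi
}

# Example: check_log_opts /etc/docker/daemon.json && echo "log-opts look fine"
```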

https://gerrit.wikimedia.org/r/plugins/gitiles/cloud/instance-puppet/+/d0d61dc325831793a2234b586d4d53556b410675%5E%21/#F0

diff --git a/deployment-prep/deployment-docker-mobileapps02.deployment-prep.eqiad1.wikimedia.cloud.yaml b/deployment-prep/deployment-docker-mobileapps02.deployment-prep.eqiad1.wikimedia.cloud.yaml
index e5a2f64..434eb0b 100644
--- a/deployment-prep/deployment-docker-mobileapps02.deployment-prep.eqiad1.wikimedia.cloud.yaml
+++ b/deployment-prep/deployment-docker-mobileapps02.deployment-prep.eqiad1.wikimedia.cloud.yaml

@@ -3,7 +3,7 @@
 profile::docker::engine::settings:
   log-driver: json-file
   log-opts:
-    max-file: 2
+    max-file: '2'
     max-size: 50m
 profile::docker::engine::version: 18.09.1+dfsg1-7.1+deb10u2
 profile::docker::runner::service_defs: