
Purge labs graphite metrics of Docker ephemeral partitions
Closed, ResolvedPublic

Description

When a container starts, Docker mounts a filesystem under /var/lib/docker/devicemapper/mnt/. Docker creates these mounts on the fly and they are all rather short-lived. Diamond collects metrics for them (T177052), and those metrics should all be purged from the labs graphite at some point.

The metrics to purge on labs are of the form:

labs
*.*.diskspace._var_lib_docker*

Or:
./*/*/diskspace/_var_lib_docker*

Event Timeline

This first needs Diamond to stop collecting the ephemeral mounts, which is tracked in T177052.

hashar triaged this task as Medium priority. Edited Dec 1 2017, 8:44 AM

diamond: skip DiskSpace for Docker containers https://gerrit.wikimedia.org/r/#/c/393215/ got merged. I have rebased puppet on the CI puppetmaster and the patch got applied. From now on Diamond should no longer report metrics to *.*.diskspace._var_lib_docker_devicemapper_mnt_*, so we can purge them from the Graphite servers.

Mentioned in SAL (#wikimedia-operations) [2017-12-01T10:56:55Z] <godog> delete docker diskspace metrics from labs - T181476

I've deleted the metrics, but some instances are recreating them:

/var/log/carbon/carbon-cache@h/creates.log:01/12/2017 11:01:05 :: creating database file /srv/carbon/whisper/tools/tools-paws-worker-1007/diskspace/_var_lib_docker/byte_percentfree.wsp (archive=[(60, 10080), (300, 4032), (900, 2880), (3600, 8760), (86400, 1825)] xff=0.01 agg=None)
/var/log/carbon/carbon-cache@h/creates.log:01/12/2017 11:01:06 :: creating database file /srv/carbon/whisper/tools/tools-paws-worker-1002/diskspace/_var_lib_docker/byte_percentfree.wsp (archive=[(60, 10080), (300, 4032), (900, 2880), (3600, 8760), (86400, 1825)] xff=0.01 agg=None)
/var/log/carbon/carbon-cache@h/creates.log:01/12/2017 11:01:06 :: creating database file /srv/carbon/whisper/tools/tools-paws-worker-1002/diskspace/_var_lib_docker/inodes_used.wsp (archive=[(60, 10080), (300, 4032), (900, 2880), (3600, 8760), (86400, 1825)] xff=0.01 agg=None)
/var/log/carbon/carbon-cache@h/creates.log:01/12/2017 11:01:08 :: creating database file /srv/carbon/whisper/tools/tools-paws-worker-1007/diskspace/_var_lib_docker/byte_avail.wsp (archive=[(60, 10080), (300, 4032), (900, 2880), (3600, 8760), (86400, 1825)] xff=0.01 agg=None)
/var/log/carbon/carbon-cache@h/creates.log:01/12/2017 11:01:22 :: creating database file /srv/carbon/whisper/tools/tools-paws-worker-1013/diskspace/_var_lib_docker/inodes_free.wsp (archive=[(60, 10080), (300, 4032), (900, 2880), (3600, 8760), (86400, 1825)] xff=0.01 agg=None)
/var/log/carbon/carbon-cache@h/creates.log:01/12/2017 11:01:22 :: creating database file /srv/carbon/whisper/tools/tools-paws-worker-1005/diskspace/_var_lib_docker/byte_free.wsp (archive=[(60, 10080), (300, 4032), (900, 2880), (3600, 8760), (86400, 1825)] xff=0.01 agg=None)
/var/log/carbon/carbon-cache@h/creates.log:01/12/2017 11:01:39 :: creating database file /srv/carbon/whisper/tools/tools-paws-worker-1019/diskspace/_var_lib_docker/inodes_used.wsp (archive=[(60, 10080), (300, 4032), (900, 2880), (3600, 8760), (86400, 1825)] xff=0.01 agg=None)
/var/log/carbon/carbon-cache@h/creates.log:01/12/2017 11:01:40 :: creating database file /srv/carbon/whisper/tools/tools-paws-worker-1001/diskspace/_var_lib_docker/byte_percentfree.wsp (archive=[(60, 10080), (300, 4032), (900, 2880), (3600, 8760), (86400, 1825)] xff=0.01 agg=None)
/var/log/carbon/carbon-cache@h/creates.log:01/12/2017 11:01:43 :: creating database file /srv/carbon/whisper/tools/tools-paws-worker-1017/diskspace/_var_lib_docker/inodes_free.wsp (archive=[(60, 10080), (300, 4032), (900, 2880), (3600, 8760), (86400, 1825)] xff=0.01 agg=None)
/var/log/carbon/carbon-cache@h/creates.log:01/12/2017 11:01:44 :: creating database file /srv/carbon/whisper/tools/tools-paws-worker-1017/diskspace/_var_lib_docker/byte_free.wsp (archive=[(60, 10080), (300, 4032), (900, 2880), (3600, 8760), (86400, 1825)] xff=0.01 agg=None)
/var/log/carbon/carbon-cache@h/creates.log:01/12/2017 11:01:44 :: creating database file /srv/carbon/whisper/tools/tools-paws-worker-1017/diskspace/_var_lib_docker/inodes_used.wsp (archive=[(60, 10080), (300, 4032), (900, 2880), (3600, 8760), (86400, 1825)] xff=0.01 agg=None)
/var/log/carbon/carbon-cache@h/creates.log:01/12/2017 11:01:49 :: creating database file /srv/carbon/whisper/tools/tools-paws-master-01/diskspace/_var_lib_docker/byte_percentfree.wsp (archive=[(60, 10080), (300, 4032), (900, 2880), (3600, 8760), (86400, 1825)] xff=0.01 agg=None)
/var/log/carbon/carbon-cache@h/creates.log:01/12/2017 11:01:49 :: creating database file /srv/carbon/whisper/tools/tools-paws-master-01/diskspace/_var_lib_docker/inodes_free.wsp (archive=[(60, 10080), (300, 4032), (900, 2880), (3600, 8760), (86400, 1825)] xff=0.01 agg=None)
/var/log/carbon/carbon-cache@h/creates.log:01/12/2017 11:01:49 :: creating database file /srv/carbon/whisper/tools/tools-paws-master-01/diskspace/_var_lib_docker/inodes_percentfree.wsp (archive=[(60, 10080), (300, 4032), (900, 2880), (3600, 8760), (86400, 1825)] xff=0.01 agg=None)
/var/log/carbon/carbon-cache@h/creates.log:01/12/2017 11:01:53 :: creating database file /srv/carbon/whisper/tools/tools-paws-worker-1006/diskspace/_var_lib_docker/inodes_percentfree.wsp (archive=[(60, 10080), (300, 4032), (900, 2880), (3600, 8760), (86400, 1825)] xff=0.01 agg=None)
/var/log/carbon/carbon-cache@h/creates.log:01/12/2017 11:01:53 :: creating database file /srv/carbon/whisper/tools/tools-paws-worker-1006/diskspace/_var_lib_docker/inodes_used.wsp (archive=[(60, 10080), (300, 4032), (900, 2880), (3600, 8760), (86400, 1825)] xff=0.01 agg=None)
/var/log/carbon/carbon-cache@h/creates.log:01/12/2017 11:02:03 :: creating database file /srv/carbon/whisper/tools/tools-paws-worker-1002/diskspace/_var_lib_docker/inodes_avail.wsp (archive=[(60, 10080), (300, 4032), (900, 2880), (3600, 8760), (86400, 1825)] xff=0.01 agg=None)

Ah sorry, I forgot Toolforge has some Docker hosts as well :( So I guess either the tools puppet master has not been rebased yet or the tools PAWS instances have not run puppet yet.

@fgiunchedi can you try deleting /srv/carbon/whisper/tools/tools-paws-*/diskspace again and see whether they are being created? If so I guess we will need someone to look at the tools instances and check whether the Diamond configuration got properly applied.

@hashar tried just now and yes, they are being recreated.

I have checked tools-paws-worker-1002.tools.eqiad.wmflabs and puppet had just run.

The Diamond configuration file for DiskSpace is from Dec 1 07:43:

/etc/diamond/collectors/DiskSpaceCollector.conf
# DiskSpace Diamond Collector configuration
# This file is managed by Puppet.
enabled = true
exclude_filters = ^/var/lib/docker/,^/run/docker/
filesystems = ext2,ext3,ext4,xfs,fuse.fuse_dfs,fat32,fat16,btrfs

If I run it manually with diamond --foreground --log-stdout:

1512472610.43	[DiskSpaceCollector:10471:DEBUG]	Ignoring /var/lib/docker/overlay2 since it is in the exclude_filter list.

Looking again at the carbon creation log, those are metrics for /var/lib/docker itself, and I guess we want that one to be collected while the subdirectories are ignored (e.g. /var/lib/docker/devicemapper* or /var/lib/docker/overlay2/*).
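The difference comes down to the trailing slash in the exclude patterns. Here is a minimal Python sketch, assuming the collector ORs the exclude_filters entries into a single regular expression and searches it against each mount point. That is an assumption about Diamond's internals, not something checked against its source. `is_excluded` is a hypothetical helper, and the devicemapper suffix `abc` is made up for illustration.

```python
import re

# Values copied from DiskSpaceCollector.conf above.
EXCLUDE_FILTERS = ["^/var/lib/docker/", "^/run/docker/"]

# Assumption: the filters are OR-ed together into one pattern.
EXCLUDE_RE = re.compile("|".join(EXCLUDE_FILTERS))


def is_excluded(mount_point: str) -> bool:
    """Return True when the mount point matches any exclude filter."""
    return EXCLUDE_RE.search(mount_point) is not None


if __name__ == "__main__":
    mounts = [
        "/var/lib/docker",                       # no trailing slash: kept
        "/var/lib/docker/overlay2",              # sub mount: excluded
        "/var/lib/docker/devicemapper/mnt/abc",  # hypothetical sub mount
    ]
    for mount in mounts:
        print(mount, "excluded" if is_excluded(mount) else "collected")
```

Because each filter ends with a slash, the bare /var/lib/docker mount point never matches, which is consistent with carbon still receiving _var_lib_docker metrics.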

So I think the confusion is in my original request which should have mentioned the sub partitions instead:

- *.*.diskspace._var_lib_docker*
+ *.*.diskspace._var_lib_docker_*
                              ^^^
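To make the difference between the two patterns concrete, here is a small Python sketch. It assumes Graphite-style globbing where `*` stays within one dot-separated node, approximated here with `fnmatchcase` applied node by node. `matches` is a hypothetical helper and the mount suffix `abc` is invented for illustration.

```python
from fnmatch import fnmatchcase

OLD_PATTERN = "*.*.diskspace._var_lib_docker*"
NEW_PATTERN = "*.*.diskspace._var_lib_docker_*"


def matches(pattern: str, metric: str) -> bool:
    """Match a glob node by node against the leading nodes of a metric."""
    pattern_nodes = pattern.split(".")
    metric_nodes = metric.split(".")
    if len(metric_nodes) < len(pattern_nodes):
        return False
    return all(fnmatchcase(m, p) for p, m in zip(pattern_nodes, metric_nodes))


if __name__ == "__main__":
    metrics = [
        # The parent mount, which should keep being collected.
        "tools.tools-paws-worker-1007.diskspace._var_lib_docker.byte_free",
        # An ephemeral sub mount (suffix is hypothetical).
        "packaging.builder05.diskspace._var_lib_docker_devicemapper_mnt_abc.inodes_free",
    ]
    for metric in metrics:
        print(metric, matches(OLD_PATTERN, metric), matches(NEW_PATTERN, metric))
```

The old pattern catches both metrics, while the new one only catches the sub mount because it requires an underscore after `_var_lib_docker`.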

Searching on https://graphite-labs.wikimedia.org/ for *.*.diskspace._var_lib_docker_*.inodes_free yields a few mounts solely for builder05.packaging.eqiad.wmflabs.

I don't have access to that instance though.

fgiunchedi claimed this task.

@hashar got it! I've deleted the remaining metrics from builder05/builder08 and we should be good now

labmon1001:/srv/carbon/whisper$ sudo rm -rf ./packaging/builder08/diskspace/_srv_docker_devicemapper_mnt_* packaging/builder05/diskspace/_var_lib_docker_devicemapper_mnt_*
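For a dry run before a removal like the one above, the matching directories can be listed first. This is a minimal Python sketch, assuming the whisper tree layout seen in the carbon log (project/instance/diskspace/mount). `find_ephemeral_dirs` and `DEFAULT_PATTERNS` are hypothetical names. The two globs mirror the rm command, and nothing is deleted.

```python
from pathlib import Path

# Globs relative to the whisper root, mirroring the rm command above.
DEFAULT_PATTERNS = (
    "*/*/diskspace/_var_lib_docker_devicemapper_mnt_*",
    "*/*/diskspace/_srv_docker_devicemapper_mnt_*",
)


def find_ephemeral_dirs(root, patterns=DEFAULT_PATTERNS):
    """Return the sorted directories under root that match any pattern.

    Nothing is removed; this only reports what a purge would touch.
    """
    root = Path(root)
    found = set()
    for pattern in patterns:
        found.update(p for p in root.glob(pattern) if p.is_dir())
    return sorted(found)


if __name__ == "__main__":
    # Assumed whisper root, taken from the shell prompt above.
    for path in find_ephemeral_dirs("/srv/carbon/whisper"):
        print(path)
```

The parent _var_lib_docker directory is not matched by either glob, so it would be left alone.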

Tentatively resolving!