
Grafana reports ALL docker mounts in a spammy way
Closed, Resolved · Public

Description

When a container starts, Docker mounts a filesystem under /var/lib/docker/devicemapper/mnt/. Docker creates these mounts on the fly and they are all rather short-lived.

Diamond eventually collects the disk space usage for those mounts and reports it to Graphite.

Shinken monitors the Graphite data and ends up emitting an alarm because the metrics are outdated (no valid datapoints found).

The resulting metrics can be seen at https://grafana-labs.wikimedia.org/dashboard/db/labs-project-board?panelId=18&fullscreen&orgId=1&var-project=integration&var-server=integration-slave-docker-1001&var-server=integration-slave-docker-1002&var-server=integration-slave-docker-1003&var-server=integration-slave-docker-1004&from=now-90d&to=now

  1. I expect these entries for the Docker hosts to fill up Graphite: a new metric or two will be added for every job run.
  2. They make the disk usage dashboard pretty much unusable.

There is a similar issue in production with the Icinga check_disk probe (T178454).

Event Timeline

Addshore created this task. Sep 29 2017, 9:48 AM
Restricted Application added a subscriber: Aklapper. Sep 29 2017, 9:48 AM
hashar added subscribers: fgiunchedi, hashar. Sep 29 2017, 12:16 PM

For T1075 (Audit groups of metrics in Graphite that allocate a lot of disk space), @fgiunchedi made a tweak to disable reporting disk I/O from partitions:

modules/diamond/manifests/init.pp
diamond::collector { 'DiskSpace':
    settings => {
        filesystems => 'ext2,ext3,ext4,xfs,fuse.fuse_dfs,fat32,fat16,btrfs',
    },
}

I guess the DiskSpace collector can be tweaked in a similar way? It has these settings:

filesystems

filesystems to examine
Default: 'ext2, ext3, ext4, xfs, glusterfs, nfs, nfs4, ntfs, hfs, fat32, fat16, btrfs'

exclude_filters

A list of regex patterns. Any filesystem matching any of these patterns will be excluded from disk space metrics collection.
Example:

exclude_filters = ^/boot, ^/mnt

Default: '^/export/home'

So I guess we could exclude .*docker/devicemapper/.*

That sounds like a good solution.
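
To check what the proposed pattern would and would not exclude, here is a minimal Python sketch. It assumes Diamond applies each exclude_filters pattern as an unanchored regex search against the mount point path, which this task does not confirm. The mount points and the filter_mounts helper are hypothetical and used only for illustration.

```python
import re

# Hypothetical sample mount points; the container mount name is made up.
MOUNT_POINTS = [
    "/",
    "/srv",
    "/var/lib/docker",
    "/var/lib/docker/devicemapper/mnt/0123abcd",
]

# The pattern proposed in this task.
EXCLUDE_FILTERS = [r".*docker/devicemapper/.*"]


def filter_mounts(mount_points, exclude_filters):
    """Return the mount points that match none of the exclude patterns.

    Assumption: patterns are OR-ed together and applied with re.search,
    so an unanchored pattern may match anywhere in the path.
    """
    if not exclude_filters:
        return list(mount_points)
    exclude_re = re.compile("|".join(exclude_filters))
    return [m for m in mount_points if not exclude_re.search(m)]


kept = filter_mounts(MOUNT_POINTS, EXCLUDE_FILTERS)
print(kept)
```

Under that assumption, the per-container mounts are dropped while /var/lib/docker itself is still reported, because the pattern requires the devicemapper/ path segment.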

Addshore moved this task from Unsorted 💣 to Watching 👀 on the User-Addshore board.
Krinkle moved this task from Inbox to Grafana on the Graphite board. Nov 21 2017, 10:53 PM

Change 393215 had a related patch set uploaded (by Hashar; owner: Hashar):
[operations/puppet@production] diamond: skip DiskSpace for Docker containers

https://gerrit.wikimedia.org/r/393215

Change 393215 merged by Alexandros Kosiaris:
[operations/puppet@production] diamond: skip DiskSpace for Docker containers

https://gerrit.wikimedia.org/r/393215

hashar closed this task as Resolved. Dec 1 2017, 8:34 AM
hashar claimed this task.
hashar reopened this task as Open. Dec 7 2017, 12:12 PM
hashar closed this task as Resolved.