We have an alert about an oversized Ceph MDS cache:
btullis@cephosd1001:~$ sudo ceph status
  cluster:
    id:     6d4278e1-ea45-4d29-86fe-85b44c150813
    health: HEALTH_WARN
            1 MDSs report oversized cache

  services:
    mon: 5 daemons, quorum cephosd1001,cephosd1002,cephosd1003,cephosd1004,cephosd1005 (age 11w)
    mgr: cephosd1001(active, since 11w), standbys: cephosd1002, cephosd1003, cephosd1004, cephosd1005
    mds: 3/3 daemons up, 2 standby
    osd: 100 osds: 100 up (since 11w), 100 in (since 15M)
    rgw: 5 daemons active (5 hosts, 1 zones)

  data:
    volumes: 3/3 healthy
    pools:   17 pools, 4481 pgs
    objects: 15.93M objects, 39 TiB
    usage:   145 TiB used, 1005 TiB / 1.1 PiB avail
    pgs:     4476 active+clean
             5    active+clean+scrubbing+deep

  io:
    client: 91 MiB/s rd, 76 MiB/s wr, 96 op/s rd, 1.47k op/s wr

This is likely caused by the recent increase in usage of the dumps volume, since it correlated with the start of a dumps v1 run. The warning fires when an MDS's cache grows beyond mds_cache_memory_limit multiplied by mds_health_cache_threshold (1.5 by default), often because clients hold more capabilities than the MDS can recall. We can investigate this.
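A way to narrow this down (a sketch; mds.<name> is a placeholder for whichever daemon the health detail output names, and the ceph daemon commands must be run on the host where that MDS is running):

# Which MDS is over the limit, and by how much:
sudo ceph health detail

# The configured cache limit for MDS daemons (the Ceph default is 4 GiB):
sudo ceph config get mds mds_cache_memory_limit

# Actual cache usage of the flagged MDS, via its admin socket:
sudo ceph daemon mds.<name> cache status

# Client sessions, to see which clients hold the most capabilities (num_caps):
sudo ceph daemon mds.<name> session ls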
Documentation on cache configuration is here: https://docs.ceph.com/en/reef/cephfs/cache-configuration/
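Per those docs, if the working set has legitimately grown with the dumps v1 run, the options are roughly to raise the limit or to ask the MDS to trim. A hedged sketch (the 8 GiB value is illustrative only, not a sized recommendation):

# Raise the cache limit for all MDS daemons to 8 GiB (the value is in bytes):
sudo ceph config set mds mds_cache_memory_limit 8589934592

# Or ask the flagged MDS to trim its cache and recall client state
# (mds.<name> is again a placeholder):
sudo ceph tell mds.<name> cache drop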
