
Perform cleanups to reclaim space from recent topology changes
Closed, Resolved (Public)

Description

Recent changes to content topology have left the nodes in eqiad with "extra" data, data which is no longer associated with them according to the current partitioning. The prescribed solution is a nodetool cleanup. Normally I'd recommend waiting until the final topology is in place to avoid double-handling, but space on these nodes is becoming quite tight.

I propose we initiate concurrent cleanups, one per rack (to limit any additional compaction-induced latency impact).

Host                         Rack  Sequence  Status
restbase1002.eqiad.wmnet     a     1         Cancelled (see: https://phabricator.wikimedia.org/T121535#1887497)
restbase1001.eqiad.wmnet     a     2         Started ~2015-12-17 18:06:00 (https://tools.wmflabs.org/sal/log/AVGxFO_i1oXzWjit6-Du)
restbase1007-a.eqiad.wmnet   a     3         Freshly bootstrapped
restbase1003.eqiad.wmnet     b     1         Complete
restbase1004.eqiad.wmnet     b     2         Decommissioning (hands-off)
restbase1008-a.eqiad.wmnet   b     -         Freshly bootstrapped
restbase1005.eqiad.wmnet     d     1         Complete; Complete
restbase1006.eqiad.wmnet     d     2         Complete
restbase1009-a.eqiad.wmnet   d     3         Complete
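
The "one cleanup per rack" plan can be sketched roughly as follows. This is only an illustration, not the procedure that was actually used: the `cleanup_rack` function and the `RUN` variable are hypothetical names, and `RUN` defaults to `echo ssh` so the sketch prints the commands instead of executing them.

```shell
#!/bin/sh
# Hypothetical runner prefix. The default is a dry run that only prints the
# commands; set RUN=ssh to actually execute them on the remote hosts.
RUN="${RUN:-echo ssh}"

# Run `nodetool cleanup` on each host of one rack, strictly one at a time,
# stopping at the first failure so a problem is noticed early.
cleanup_rack() {
    for host in "$@"; do
        $RUN "$host" nodetool cleanup || return 1
    done
}

# Example usage (commented out): one background job per rack, hosts in
# sequence order, then wait for every rack to finish.
#   cleanup_rack restbase1002.eqiad.wmnet restbase1001.eqiad.wmnet &
#   cleanup_rack restbase1003.eqiad.wmnet &
#   cleanup_rack restbase1005.eqiad.wmnet restbase1006.eqiad.wmnet restbase1009-a.eqiad.wmnet &
#   wait
```

Running hosts sequentially within a rack while racks proceed in parallel keeps at most one cleanup per rack active at any time, which is the latency safeguard described above.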

Edit:

Resequenced 1002 above 1001, as disk usage is higher there.

Event Timeline

Eevans claimed this task.
Eevans raised the priority of this task to High.
Eevans updated the task description.
Eevans added subscribers: Joe, fgiunchedi, GWicke and 4 others.
Eevans set Security to None.

Space is getting quite tight here. For example, with 1003 cleaned, there is 1.3T of free space, but after the current stream from 1004 completes, that will be reduced to ~300G. For 1008-a (1008 has 1T less disk, but 128 fewer tokens), there will be ~200G after the decommission of 1004 (and there is nothing to be gained from a cleanup).

There isn't much space to be gained from clearing old snapshots, but the ones I have looked at appear to be quite old and, AFAIK, not needed. If no one has any objections, I will clear them.

Objections @fgiunchedi, @GWicke, @mobrovac?

restbase1002.eqiad.wmnet: Total TrueDiskSpaceUsed: 45.14 GB
restbase1003.eqiad.wmnet: Total TrueDiskSpaceUsed: 53.41 GB
restbase1005.eqiad.wmnet: Total TrueDiskSpaceUsed: 52.56 GB
restbase1006.eqiad.wmnet: Total TrueDiskSpaceUsed: 57.19 GB
restbase2003.codfw.wmnet: Total TrueDiskSpaceUsed: 7.9 GB
restbase2004.codfw.wmnet: Total TrueDiskSpaceUsed: 8.29 GB
restbase2005.codfw.wmnet: Total TrueDiskSpaceUsed: 8.15 GB
restbase2006.codfw.wmnet: Total TrueDiskSpaceUsed: 8 GB

There isn't much space to be gained from clearing old snapshots, but the ones I have looked at appear to be quite old and, AFAIK, not needed. If no one has any objections, I will clear them.

+1 from me.

I wonder where these snapshots come from. We don't have any regular snapshotting set up, so these should be triggered manually. It would be good if we made sure to delete snapshots not long after creating them.

I went ahead and cleared snapshots across the cluster.
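
For reference, a minimal sketch of inspecting and then clearing snapshots on a single node. It assumes a nodetool version in which `clearsnapshot` with no arguments removes all snapshots on the node; later releases may require an explicit flag, so check `nodetool help clearsnapshot` first. The `clear_snapshots` function and the `NODETOOL` variable are hypothetical names, and the default is a dry run that only prints the commands.

```shell
#!/bin/sh
# Hypothetical command prefix. The default only prints the commands;
# set NODETOOL=nodetool to run them for real.
NODETOOL="${NODETOOL:-echo nodetool}"

clear_snapshots() {
    # Report existing snapshots and their disk usage before deleting anything.
    $NODETOOL listsnapshots || return 1
    # Assumption: with no arguments this removes every snapshot on the node.
    $NODETOOL clearsnapshot
}
```

Listing first leaves a record of what was removed, which helps when tracing where manually triggered snapshots came from.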

For the record, if disk space is tight from compactions or cleanups, the best way to temporarily free up space is to run nodetool stop -- COMPACTION to abort currently ongoing regular compactions, and nodetool stop -- CLEANUP to abort currently running cleanups. After running these commands, autocompaction will restart by default. In the very worst case we could also disable that, but it'll have performance implications if compactions are disabled for long.
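
The two stop commands above can be wrapped in a small helper, sketched here under the assumption that it runs directly on the affected node. The `abort_compactions` function and the `NODETOOL` variable are hypothetical names, and the default is a dry run that only prints the commands.

```shell
#!/bin/sh
# Hypothetical command prefix. The default only prints the commands;
# set NODETOOL=nodetool to run them for real.
NODETOOL="${NODETOOL:-echo nodetool}"

abort_compactions() {
    # Abort in-flight regular compactions first, then in-flight cleanups.
    # The space held by their partially written output is released.
    $NODETOOL stop -- COMPACTION &&
    $NODETOOL stop -- CLEANUP
}
```

This does not touch autocompaction, so new compactions will be scheduled again afterwards. Disabling it (`nodetool disableautocompaction`, re-enabled with `nodetool enableautocompaction`) is the last-resort option mentioned above.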

In Cassandra 2.2 there is also an ability to kill specific compactions, which would make playing this particular game a bit more efficient. However, not playing it in the first place by having enough disk space would be much preferable, in my opinion. See T121575.

I had cancelled the cleanup on 1002:

restbase1002:~$ nodetool compactionstats -H
pending tasks: 2
   compaction type                               keyspace   table   completed     total    unit   progress
        Compaction   local_group_wikipedia_T_parsoid_html    data      2.1 GB   3.73 GB   bytes     56.34%
           Cleanup   local_group_wikipedia_T_parsoid_html    data     1.32 TB   1.86 TB   bytes     71.04%
Active compaction remaining time :   0h00m27s
restbase1002:~$ df -h
Filesystem                        Size  Used Avail Use% Mounted on
udev                               10M     0   10M   0% /dev
tmpfs                              13G  521M   13G   5% /run
/dev/md0                           28G  3.1G   23G  12% /
tmpfs                              32G     0   32G   0% /dev/shm
tmpfs                             5.0M     0  5.0M   0% /run/lock
tmpfs                              32G     0   32G   0% /sys/fs/cgroup
/dev/mapper/restbase1002--vg-var  2.7T  2.6T   20G 100% /var
Eevans updated the task description.