Bootstrapping a new node into the cluster causes some portion of the dataset to be relocated to the joining node. Once complete, this data remains in place on the source nodes, though unreachable, until it is gradually discarded by routine compaction or explicitly purged by running a cleanup.
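For context, a quick sketch for checking the settings that drive this on the joining node; the config path is an assumption (the Debian package default), not confirmed here:

# auto_bootstrap defaults to true when absent, so a joining node streams
# data for the token ranges it takes over; num_tokens should show the
# vnode count (256, per the Tokens column in the status output below).
$ grep -E 'auto_bootstrap|num_tokens' /etc/cassandra/cassandra.yaml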
Normally the number of nodes potentially requiring cleanup would be large (possibly all of them), and as a result the portion of data on each would be correspondingly small. However, because we have 3 replicas spread across 3 racks, and restbase1006 was recently bootstrapped into a rack containing only one machine (restbase1005), all of the data now on restbase1006 came exclusively from restbase1005. You can see this by comparing the load values:
Datacenter: eqiad
=================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens  Owns  Host ID                               Rack
UN  10.64.48.99   547.26 GB  256     ?     325e01e8-debe-45f0-a8c2-93b3baa58968  d
UN  10.64.32.159  267.42 GB  256     ?     88d9ef9f-d81b-466e-babf-6a283b13f648  b
UN  10.64.0.221   286.62 GB  256     ?     fc041cc8-cd28-4030-b29a-05b9a632cafc  a
UN  10.64.48.100  281.9 GB   256     ?     2abf437d-a16d-406b-a6de-8d28b7dda808  d
UN  10.64.0.220   273.84 GB  256     ?     c021a198-b7f1-4dc2-94d7-9cb8b8a8df28  a
UN  10.64.32.160  292.82 GB  256     ?     798ff758-8c91-46e0-b85e-dad356c46f20  b
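The listing above is nodetool status output, runnable from any node in the cluster. A small sketch for pulling out just the two rack-d hosts for comparison; mapping 10.64.48.99 (the ~547 GB node) to restbase1005 and 10.64.48.100 (~282 GB) to restbase1006 is my inference from the loads, not something confirmed here:

# Full ring overview, as shown above.
$ nodetool status

# Just the two rack-d nodes; the ~547 GB vs. ~282 GB gap is roughly the
# data restbase1005 streamed to restbase1006 and has not yet reclaimed.
$ nodetool status | grep -E '10\.64\.48\.(99|100)'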
In addition to obscuring the actual disk usage, this unreachable data also results in less optimal read performance (the page cache won't go as far, for example).
TL;DR
We should run nodetool cleanup on restbase1005. I expect this to take some time and generate some additional disk IO. Given the relatively low load on these hosts, I don't believe there is any danger of impacting the services, but out of an abundance of caution we might do this at an off-peak time.
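A minimal sketch of how this might be run on restbase1005; the throughput value is purely illustrative, and the throttle assumes cleanup is governed by the compaction throughput setting:

# Optionally cap compaction throughput to bound the extra disk IO
# (value in MB/s; illustrative, not a recommendation).
$ nodetool setcompactionthroughput 16

# Rewrite this node's SSTables, dropping data for token ranges it no
# longer owns. With no arguments it covers all keyspaces.
$ nodetool cleanup

# Watch progress of the cleanup compactions.
$ nodetool compactionstats

If we do throttle, the original throughput setting should be restored once the cleanup finishes.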