The majority of this is in two directories:
75G sql
268G mw-log
mw-log in particular is a bit of a mess. It has 2247765 files in it that all seem to be some kind of historical activity log. That is 2+ million log files in a single directory on NFS.
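For anyone re-checking that file count, a plain directory listing that sorts or stats every entry can be slow on a directory this large. The following is a minimal sketch, assuming Python 3 is available on the host; `count_files` is a hypothetical helper name, not part of the tool.

```python
import os

def count_files(directory):
    """Count regular files directly inside `directory`, one entry at a time."""
    total = 0
    # os.scandir yields entries lazily, so the full listing is never held in memory.
    with os.scandir(directory) as entries:
        for entry in entries:
            # Count regular files only; symlinks are not followed and subdirectories are skipped.
            if entry.is_file(follow_symlinks=False):
                total += 1
    return total
```

Calling `count_files("mw-log")` from the tool's home directory would give the figure quoted above, if the directory has not changed in the meantime.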
Within sql, nearly all of the content is in dumps, at 75G (this directory alone would put this tool in the top 15 of all storage users):
75G dumps
This consists of 920 directories with .sql files for several wikis going back to 2013:
93M 20131019093307
98M 20131020000022
project/liangent-php/sql# ls -al dumps/20160525000017
total 78024
drwxr-sr-x   2 51117 51117     4096 May 25 00:03 .
drwxr-sr-x 917 51117 51117    36864 May 25 00:00 ..
-rw-r--r--   1 51117 51117  7321424 May 25 00:00 arwiki.sql
-rw-r--r--   1 51117 51117   692300 May 25 00:01 commonswiki.sql
-rw-r--r--   1 51117 51117  1654262 May 25 00:01 enwiki.sql
-rw-r--r--   1 51117 51117   713699 May 25 00:02 mediawikiwiki.sql
-rw-r--r--   1 51117 51117    94141 May 25 00:02 testwikidatawiki.sql
-rw-r--r--   1 51117 51117  1430897 May 25 00:02 wikidatawiki.sql
-rw-r--r--   1 51117 51117    58744 May 25 00:03 zhwikisource.sql
-rw-r--r--   1 51117 51117 67862705 May 25 00:03 zhwiki.sql
There are also 153 error and out files that go back to 2013.
Can we clean out mw-log and what can we do to prevent this kind of growth in the future?
Can we clean out sql/dumps (especially for content that is now years old)?
What is the data retention policy for this tool?
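If a retention window is agreed on, selecting old dump directories could look roughly like the sketch below. It assumes the directory names under sql/dumps are timestamps in YYYYMMDDHHMMSS form, which is what the listing above suggests. The function name `expired_dumps` and the 90-day default are hypothetical placeholders, not an existing policy. The function only reports candidates and deletes nothing.

```python
from datetime import datetime, timedelta

def expired_dumps(names, now, keep_days=90):
    """Return the dump directory names older than `keep_days`, sorted oldest first.

    Assumption: each name is a YYYYMMDDHHMMSS timestamp, e.g. 20131019093307.
    keep_days=90 is a placeholder, not an agreed retention policy.
    """
    cutoff = now - timedelta(days=keep_days)
    expired = []
    for name in names:
        try:
            stamp = datetime.strptime(name, "%Y%m%d%H%M%S")
        except ValueError:
            # Leave anything that does not look like a timestamped dump alone.
            continue
        if stamp < cutoff:
            expired.append(name)
    return sorted(expired)
```

Feeding it the output of `os.listdir("sql/dumps")` together with `datetime.now()` would give a dry-run list that the tool maintainer can review before anything is removed.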