
Reduction of stat1005's disk space usage
Closed, Resolved · Public · 5 Estimated Story Points


Icinga has been alerting on stat1005's disk space usage for the /srv partition. This is the current status:

/dev/mapper/stat1005--vg-data  7.2T  6.4T  385G  95% /srv
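
For reference, a check of this kind can be sketched in a few lines of shell. This is only an illustration, not the actual Icinga check: the function names `usage_percent` and `check_usage` and the default 90% threshold are hypothetical.

```shell
# usage_percent PATH: print the Use% of the filesystem holding PATH, as a bare integer.
usage_percent() {
    # df -P forces the one-line-per-filesystem POSIX layout; column 5 is the capacity.
    df -P -- "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# check_usage PATH [THRESHOLD]: return 1 and print a warning when usage >= THRESHOLD.
# The default threshold of 90 is an assumption made for this sketch.
check_usage() {
    mount_point="$1"
    threshold="${2:-90}"
    pct=$(usage_percent "$mount_point") || return 2
    if [ "$pct" -ge "$threshold" ]; then
        echo "WARNING: $mount_point is ${pct}% full (threshold ${threshold}%)"
        return 1
    fi
    echo "OK: $mount_point is ${pct}% full"
}
```

Running `check_usage /srv 90` against the state shown above would report a warning, since /srv is at 95%.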

elukey@stat1005:/srv$ sudo du -hs *
31G	analytics-wmde
70G	deployment
1.3G	discovery
1.6M	event-schemas
4.1T	home
1.3T	log
16K	lost+found
513M	mediawiki
328M	published-datasets
184M	reportupdater
1.1T	stat1002-a

Top consumers in the /srv/home dir:

31G	ebernhardson
33G	jsamra
34G	nithum
38G	ashwinpp
69G	nuria
73G	dsaez
115G	ellery
147G	mirrys
240G	mkroetzsch
641G	flemmerich
2.5T	ezachte
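
A list like the one above can be produced with `du` piped through `sort`, as done elsewhere in this task. A minimal sketch follows; the helper name `top_consumers` is hypothetical, and it reports sizes in KiB (`-k`) rather than human-readable units so that a plain numeric sort works everywhere.

```shell
# top_consumers DIR [N]: print the N largest entries directly under DIR,
# smallest first, with sizes in KiB.
top_consumers() {
    dir="${1:-.}"
    count="${2:-10}"
    # -s gives one total per entry; unreadable paths are silently skipped,
    # so run it with sudo to get accurate numbers for other users' homes.
    du -sk -- "$dir"/* 2>/dev/null | sort -n | tail -n "$count"
}
```

For example, `sudo bash -c "$(declare -f top_consumers); top_consumers /srv/home 11"` would list the eleven largest home directories.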

Event Timeline

elukey triaged this task as High priority. Feb 8 2018, 8:26 AM
elukey created this task.
Restricted Application added a subscriber: Aklapper. Feb 8 2018, 8:26 AM
elukey updated the task description. Feb 8 2018, 8:27 AM
elukey added a subscriber: Ottomata. Feb 8 2018, 1:05 PM

@Ottomata do you think that we can delete the stat1002-a's dir?

Yeah I think so! Maybe we can just shove it in HDFS for posterity?

elukey added a comment. Feb 8 2018, 3:04 PM

Yeah I think so! Maybe we can just shove it in HDFS for posterity?


elukey added a subscriber: ezachte.

We are trying to back up everything to HDFS before deleting the /srv/stat1002-a dir, but it would be great to delete data if possible to avoid wasting space:

elukey@stat1005:/srv/stat1002-a/user_dirs_from_stat1002$ sudo du -hs * | sort -h
4.0K	akrausetud
4.0K	arnad
4.0K	ebernhardson
4.0K	jdcc
4.0K	smalyshev
16K	milimetric
20G	nuria
40G	leila
73G	psinger
245G	ellery
655G	halfak

@Halfak, @leila: do you have any idea if we can safely delete these user directories on stat1005? They were the old user home directories on stat1002.

elukey added a comment. (Edited) Feb 21 2018, 10:18 AM

To free some space, ellery's and psinger's stat1002 user dirs have been moved off /srv and are now available in /mnt/hdfs/wmf/data/archive/backup/misc/stat1002-a/user_dirs_from_stat1002/

Now we have ~600GB back, 91% partition usage, much better :) If these dirs are not needed anymore it would be great to remove them from hdfs too.
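
The thread does not record the exact commands used for the move. As a rough sketch of the "back up first, delete afterwards" approach, assuming the destination is reachable as a regular directory (as the /mnt/hdfs mount suggests), something like the following could be used. The function name `archive_then_remove` is hypothetical.

```shell
# archive_then_remove SRC DEST_PARENT
# Copy SRC into DEST_PARENT, then delete SRC only if the copy compares equal.
archive_then_remove() {
    src="${1%/}"
    dest_parent="${2%/}"
    name=$(basename "$src")

    mkdir -p "$dest_parent" || return 1
    cp -R "$src" "$dest_parent/" || return 1

    # diff -r exits non-zero if any file differs or is missing on either side.
    if diff -r "$src" "$dest_parent/$name" >/dev/null; then
        rm -rf -- "$src"
    else
        echo "copy of $src differs from the original; keeping the original" >&2
        return 1
    fi
}
```

The comparison step is the point of the sketch: the source is removed only after the copy has been checked, so a failed or partial copy never costs data.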

elukey moved this task from Backlog to In Progress on the User-Elukey board. Feb 21 2018, 1:33 PM

I've dropped my usage to 44GB

Woops. Caught something else I don't need. Down to 12GB

elukey@stat1005:~$ df -h
Filesystem                     Size  Used Avail Use% Mounted on
/dev/mapper/stat1005--vg-data  7.2T  5.6T  1.3T  82% /srv

Way better now, thanks!

elukey lowered the priority of this task from High to Medium. Feb 22 2018, 6:37 AM

@ezachte Hi again :) any news about dropping some backup data?

leila added a comment. Feb 22 2018, 8:32 PM

@elukey I reviewed my share. All of it is related to one project, and I need to keep that data because we still refer to it and need to take subsets out of it from time to time. Is this causing an issue for you, or do you just want to make sure that we don't lock up space we don't need?

@leila, if you have just random data that you don't want deleted, but is large, you could put it in your HDFS home directory and save it there :)

just a quick heads-up: I (finally) managed to regain access to stat1005, and will start cleanup tomorrow

Current status of home dirs:

115G    ellery
241G    mkroetzsch
648G    ezachte
754G    flemmerich
782G    dsaez
950G    mirrys

@leila, @Miriam, @diego: do you guys need all that space or can we do some cleanups? I don't want to force you to delete data that you need, only garbage that can be thrown away :)

@ezachte thanks a lot for the massive space reduction! \o/

@elukey: Done ;)

now 66G /home/dsaez/

@flemmerich Please check this thread.

elukey set the point value for this task to 5. Mar 15 2018, 10:58 AM
elukey@stat1005:~$ df -h
Filesystem                     Size  Used Avail Use% Mounted on
/dev/mapper/stat1005--vg-data  7.2T  4.2T  2.6T  62% /srv

Much better now, thanks to all! Please keep in mind that regular cleanups are really appreciated by Analytics folks, even without being asked :)

elukey moved this task from Next Up to Done on the Analytics-Kanban board. Mar 15 2018, 11:01 AM
elukey moved this task from In Progress to Done on the User-Elukey board. Mar 15 2018, 5:04 PM
Nuria closed this task as Resolved. Mar 26 2018, 9:27 PM