
wikidumpparse is using 1.2TB of 5T available NFS misc storage
Closed, ResolvedPublic

Description

wikidumpparse /home is using a ton of storage at the moment. We are at very high utilization now; please clean up files/data and consider instituting ongoing cleanup jobs so this doesn't happen again. Feel free to talk to any of us if you need help with cleanup strategies. Thanks!

Event Timeline

madhuvishy created this task.

@notconfusing @Dfko @Hargup Hello! Pinging on this task again; could you please clean up the home folder soon? Thank you.

Hi, I am looking around for the offending files to delete them, but it has been a long while since I worked on any of this and I don't recall how to find them.
Can you provide the exact path of the offending files?

Also, as a side note, this service is probably not active, so it is probably OK for you to delete the files. I will have a look myself this time, but I wanted to mention it FYI.

The site still seems to be up at https://wigi.wmflabs.org/

But our /home is 5T shared across all projects

nfs-tools-project.svc.eqiad.wmnet:/project/wikidumpparse/home nfs4 5.0T 3.6T 1.2T 75% /mnt/nfs/labstore-secondary-home

and /home/maximilianklein here is using >1T

root@wigi:/home# du -sh *
1.1G	hargup
776M	manydev
1.1T	maximilianklein
16K	prometheus
1.1G	vivekiitkgp
root@wigi:/home/maximilianklein# du -sh *
237M	alt_metric_wiki
55M	cocytus
563M	miniconda3
3.2M	my-wigi
85M	pwb
4.0K	replica.my.cnf
31G	snapshot_data
417M	snapshot_data_bak
4.0K	tmp
0	user-config.py
558M	WIGI
19M	WIGI-website
1.1T	Wikidata-Toolkit
442M	wiki_econ_capability
22M	wikistream
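
The `du -sh *` listing above can be hard to scan when a directory has many entries. A minimal sketch of how oversized subdirectories could be surfaced automatically; the path and threshold below are illustrative assumptions, not values from this task:

```shell
#!/bin/sh
# Sketch: list subdirectories of a target directory whose size
# exceeds a threshold, largest first. Path and threshold are
# examples only, not taken from the task above.
TARGET_DIR="${1:-/home}"
THRESHOLD_KB="${2:-104857600}"   # default: 100 GB expressed in KB

du -k --max-depth=1 "$TARGET_DIR" 2>/dev/null \
  | awk -v t="$THRESHOLD_KB" '$1 > t {print}' \
  | sort -rn
```

Run against /home with the default threshold, this would print only entries such as the 1.1T maximilianklein directory, making the offender obvious at a glance.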

If you could clean this up, it would help out. At some point we will be forced to remove the files ourselves.

Yes, I will investigate and delete to under 100GB by Monday January 8th

Thanks,
Max Klein

OK, I'm now using <100G. Deleted weekly Wikidata dumps going back to 2015.

@notconfusing Is this service still active? Are there ongoing cleanup jobs in place to delete files as they are generated? I see that the usage has now grown to 160G, and I want to make sure we don't end up with really high utilization again. Thanks!

@madhuvishy Yes indeed, the service is still active and used by many community members.

I added `rm -rf /home/maximilianklein/Wikidata-Toolkit/wdtk-examples/dumpfiles/wikidatawiki/json*` to my script, to delete the dumps after they're used.
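
An age-based deletion on a schedule is one way such a cleanup could run unattended. A hedged sketch, reusing the dump path from the `rm -rf` above; the 14-day cutoff, the cron schedule, and the wrapper script name are assumptions for illustration:

```shell
#!/bin/sh
# Sketch: delete weekly JSON dumps older than 14 days.
# The directory matches the rm -rf path above; the age cutoff
# is an assumed policy, not one stated in the task.
DUMP_DIR="/home/maximilianklein/Wikidata-Toolkit/wdtk-examples/dumpfiles/wikidatawiki"

find "$DUMP_DIR" -maxdepth 1 -name 'json*' -mtime +14 -print -delete

# Example crontab entry (hypothetical wrapper script, Sundays 03:00):
# 0 3 * * 0 /home/maximilianklein/cleanup_dumps.sh
```

Deleting by age rather than immediately after use keeps the most recent dump around for reruns while still bounding disk growth.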

Make a great day,
Max Klein ‽ http://notconfusing.com/