
eatchabot using a lot of NFS storage
Open, Needs Triage, Public

Description

/data/project/eatchabot/lr/downloaded_images is currently at 58G. Since this is on a shared storage system that is getting full, please check whether any data in there can be purged. eatchabot is currently among the top space users.

Thanks!

Event Timeline

Mentioned in SAL (#wikimedia-cloud) [2021-06-15T00:40:38Z] <bstorm> truncated 4GB uwsgi.log to free space T284968

Aklapper added a subscriber: Eatcha.

Both email addresses associated with the Phabricator account @Eatcha bounce, so notifications from this task will not reach @Eatcha.
https://en.wikipedia.org/wiki/User:Eatcha and https://meta.wikimedia.org/wiki/User:Eatcha say "retired".
@Eatcha has not been active since November 2021: https://guc.toolforge.org/?by=date&user=Eatcha
Thus removing task assignee.

taavi subscribed.
root@tools-nfs-2:~# du -sh /srv/tools/project/eatchabot/
67G	/srv/tools/project/eatchabot/

The lr sub-tool that is consuming the disk space has a main process that downloads files listed in https://commons.wikimedia.org/wiki/Category:License_review_needed and never deletes them. Seems like we should just delete them as abandoned tool cleanup.

bd808@tools-nfs-2:/srv/tools/project/eatchabot/lr/downloaded_images$ du -sh .
58G     .
bd808@tools-nfs-2:/srv/tools/project/eatchabot/lr/downloaded_images$ sudo find . -type f -delete
bd808@tools-nfs-2:/srv/tools/project/eatchabot/lr/downloaded_images$ du -sh .
5.2M    .
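
For anyone who wants to rehearse this kind of cleanup before running it for real, here is a rough Python sketch of the same measure, delete, measure sequence with a dry-run switch. The helper names `tree_size` and `delete_files` are made up for illustration and are not part of the tool. Like `find . -type f -delete`, it only removes regular files and leaves the directory tree in place.

```python
import os
from pathlib import Path


def tree_size(root: Path) -> int:
    """Sum the apparent sizes (bytes) of regular files under root.

    Note: this is not the same number `du` prints, since `du` counts
    allocated blocks and also the directories themselves.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            # Skip symlinks so we only count real files, like `find -type f`.
            if not path.is_symlink() and path.is_file():
                total += path.lstat().st_size
    return total


def delete_files(root: Path, dry_run: bool = True) -> int:
    """Delete regular files under root, keeping directories.

    Returns the number of files that were (or, in a dry run, would be)
    deleted. Hypothetical helper, defaulting to dry_run=True for safety.
    """
    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if path.is_symlink() or not path.is_file():
                continue
            if not dry_run:
                path.unlink()
            count += 1
    return count
```

Calling `delete_files(Path("downloaded_images"))` with the default `dry_run=True` only reports how many files would go; pass `dry_run=False` to actually remove them. Because `tree_size` counts file contents only, it returns 0 for a tree with no files left, whereas `du` still counts the blocks used by the remaining directories.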

Most of the remaining file usage for this tool is in $HOME/www/python/src/image_hash_db.sqlite, a 7.3G database stored on NFS. :/
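
A quick way to spot large leftovers like this one is to list the biggest files under a directory. The snippet below is only a sketch; `largest_files` is a hypothetical helper name, and it reports apparent file sizes rather than the allocated blocks that `du` shows.

```python
import heapq
import os
import stat
from pathlib import Path


def largest_files(root, n=10):
    """Return up to n (size_in_bytes, path) tuples, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                info = path.lstat()  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(info.st_mode):
                sizes.append((info.st_size, str(path)))
    return heapq.nlargest(n, sizes)
```

Running `largest_files("/srv/tools/project/eatchabot", 5)` on the NFS server would be one way to get a short list of candidates to look at; the path is the one from the `du` output earlier in this task.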

Just approved the adoption request, @mdaniels5757 is now the maintainer :)