
persondata tool is using a large amount of NFS space -- please try to clean it up
Closed, Resolved (Public)

Description

According to https://grafana.wikimedia.org/d/50z0i4XWz/tools-overall-nfs-storage-utilization?viewPanel=4&orgId=1
persondata is currently using over half a TB of space. Please delete unnecessary data and perhaps compress some things.

Event Timeline

Bstorm created this task.
Bstorm moved this task from Inbox to Watching on the cloud-services-team (Kanban) board.
Bstorm moved this task from Backlog to Shared Storage on the Data-Services board.

The main disk consumer is $HOME/logs and specifically these logs:

$ ls -alhS | head -5
total 475G
-rw-rw----  1 51412 51412  392G Aug 13 15:49 wikidata_sitelinks.out
-rw-rw----  1 51412 51412   75G Aug 20 03:16 person_bkl2.out
-rw-rw----  1 51412 51412  2.4G Aug 20 06:55 beacon_dewiki_commons.out
-rw-rw----  1 51412 51412  1.3G Aug 26 03:45 infobox.out
-rw-rw----  1 51412 51412  1.1G Jul 25 13:27 process_templatedata.out
-rw-rw----  1 51412 51412  815M Aug 23 00:49 bild-tags.out

I have deleted these log files to reclaim the space on the shared NFS system.

@Wurgl I wish I had a better solution to offer you here, but I think you should add a log cleanup job to your tool's long list of cron tasks that does some periodic truncation. You could get a bit fancy: since you are already setting custom log destinations for each job, you could write a new log per day or month and then have a cleanup job that deletes logs based on last-modified date.
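
A minimal sketch of what such a cleanup could look like, assuming GNU find (for -maxdepth and -delete) and that the job logs all end in .out as in the listing above. The function names, the 30-day retention, and the per-day file naming are illustrative assumptions, not something the tool currently does:

```shell
#!/bin/sh
# Hypothetical helper: delete *.out logs in a directory that were last
# modified more than a given number of days ago.
# Usage: cleanup_logs <log_dir> <max_age_days>
cleanup_logs() {
    log_dir=$1
    max_age_days=$2
    # -mtime +N matches files whose age, counted in whole 24-hour periods,
    # is greater than N. -maxdepth 1 keeps find out of subdirectories.
    find "$log_dir" -maxdepth 1 -type f -name '*.out' \
        -mtime +"$max_age_days" -delete
}

# Hypothetical helper: print a per-day log path such as
# <log_dir>/<job_name>-2020-08-26.out, so each day writes to its own file
# and old days can be removed by cleanup_logs.
# Usage: daily_log_path <log_dir> <job_name>
daily_log_path() {
    printf '%s/%s-%s.out\n' "$1" "$2" "$(date +%Y-%m-%d)"
}
```

Each job could then point its log destination at the output of something like daily_log_path "$HOME/logs" infobox, and one extra daily cron task could run cleanup_logs "$HOME/logs" 30. The retention period is a guess and should be set to whatever history is actually useful for debugging.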

This tool is no longer among the top 10 consumers of NFS space.