2021-08-05: Tools NFS share cleanup
Closed, Resolved, Public

Description

The space on tools just crossed the line and tripped the alarm.

Notification Type: PROBLEM

Service: NFS Share Volume Space /srv/tools
Host: labstore1004
Address: 10.64.37.19
State: CRITICAL

Date/Time: Thu Aug 5 17:29:22 UTC 2021

Notes URLs: https://wikitech.wikimedia.org/wiki/Portal:Data_Services/Admin/Shared_storage%23NFS_volume_cleanup https://grafana.wikimedia.org/d/50z0i4XWz/tools-overall-nfs-storage-utilization?orgId=1

Acknowledged by :

Additional Info:

DISK CRITICAL - free space: /srv/tools 1241687 MB (15% inode=78%):

In the tradition of T284964, we'll clean it up.

Actual current usage is /dev/drbd4 8.0T 6.5T 1.2T 85% /srv/tools despite what icinga thinks
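The alert's "15% free" and the df line's "85% used" can be cross-checked with a little arithmetic. The following is a minimal Python sketch, not something that was run as part of this task; the function names are made up for illustration, and the percentage is computed as used / (used + available), which is how df derives its Use% column.

```python
import shutil


def used_percent(used, avail):
    """Used space as a percentage of used + available (same basis as df's Use%)."""
    total = used + avail
    return 100.0 * used / total if total else 0.0


def mount_used_percent(path):
    """Same calculation for a live mount point, read via shutil.disk_usage."""
    usage = shutil.disk_usage(path)
    return used_percent(usage.used, usage.free)


if __name__ == "__main__":
    # Figures from the df line above, in TB: 6.5 used, 1.2 available.
    print(f"{used_percent(6.5, 1.2):.1f}% used")
```

With the df figures this gives a little over 84%, which is consistent with the 85% shown once rounding is taken into account.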

Event Timeline

Bstorm triaged this task as Medium priority. Aug 5 2021, 5:58 PM
Bstorm created this task.

Found a runaway file: 995G /srv/tools/shared/tools/project/iabot/Workers/Worker2.out
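The task does not say how the runaway file was located. As one possible approach, here is a hedged Python sketch that walks a directory tree and reports the largest files; `largest_files` and its parameters are hypothetical names, and it compares apparent file size (`st_size`), which can differ from the on-disk usage that du reports for sparse files.

```python
import heapq
import os


def largest_files(root, count=10):
    """Return up to `count` (size_bytes, path) pairs under root, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat does not follow symlinks, so link targets are not counted.
                st = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            sizes.append((st.st_size, path))
    return heapq.nlargest(count, sizes)
```

On a share this size a full walk generates a lot of metadata I/O, so it is best pointed at a single suspect project directory rather than the whole volume.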

Mentioned in SAL (#wikimedia-cloud) [2021-08-05T21:50:56Z] <bstorm> truncated 995G Workers/Worker2.out T288276
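The SAL entry records a truncation rather than a deletion. The task does not give the reason, but a common one is that a process still holding the file open keeps its blocks allocated after an unlink, whereas truncation releases them immediately. A minimal Python sketch of truncating in place follows; `truncate_log` is a hypothetical helper, not the command that was actually run.

```python
import os


def truncate_log(path):
    """Cut the file at `path` to zero bytes without removing it.

    Returns the number of bytes the file held before truncation.
    """
    size = os.path.getsize(path)
    os.truncate(path, 0)
    return size
```

If the writing process did not open the file in append mode, it will keep writing at its old offset and the file becomes sparse, so the apparent size may jump back up even though little disk space is used.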

Large NFS projects aren't too bad now that I got rid of that one file:

71G     /srv/tools/shared/tools/project/digero
74G     /srv/tools/shared/tools/project/mix-n-match
78G     /srv/tools/shared/tools/project/eatchabot
80G     /srv/tools/shared/tools/project/iabot
104G    /srv/tools/shared/tools/project/templatehoard
107G    /srv/tools/shared/tools/project/phetools
111G    /srv/tools/shared/tools/project/panoviewer
114G    /srv/tools/shared/tools/project/wikidata-analysis
174G    /srv/tools/shared/tools/project/zoomviewer
198G    /srv/tools/shared/tools/project/glamtools

Those are the largest project dirs. Most of iabot is actually that file, which is now stuck at over 70 GB.

It's back up to 82%. Checking on things.

Mentioned in SAL (#wikimedia-cloud) [2021-09-12T17:04:16Z] <bstorm> truncated 58gb error.log file T288276

Mentioned in SAL (#wikimedia-cloud) [2021-09-12T17:06:06Z] <bstorm> truncating 45G reflinks.err file T288276

Mentioned in SAL (#wikimedia-cloud) [2021-09-12T17:07:37Z] <bstorm> truncated 77G Worker2.out T288276

I have deleted my file and created a job to clear it regularly. Sorry for the inconvenience.
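The comment does not describe what the clearing job does. As an illustration only, a size-capped cleaner could look like the Python sketch below; `clear_if_larger_than` and the `max_bytes` threshold are assumptions, not the maintainer's actual job.

```python
import os


def clear_if_larger_than(path, max_bytes):
    """Truncate `path` to zero bytes if it is larger than `max_bytes`.

    Returns True if the file was truncated, False if it was small enough
    or does not exist.
    """
    try:
        size = os.path.getsize(path)
    except FileNotFoundError:
        return False
    if size <= max_bytes:
        return False
    os.truncate(path, 0)
    return True
```

Run on a schedule by whatever job scheduler the tool already uses, a check like this keeps one log from growing without bound between manual cleanups.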

Ok, we are now at /dev/drbd4 8.0T 6.5T 1.2T 86% /srv/tools
:(
I'm going to run a large du across group projects, which is going to cause some iowait, but I don't see another way to get the info needed quickly.
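For reference, the kind of per-project total that du produces can be approximated with the Python sketch below. It is an assumption-laden stand-in rather than the command that was run: `tree_size` and `top_projects` are hypothetical names, sizes are apparent sizes (`st_size`) and not allocated blocks, and hard links are counted more than once, so the numbers will not match du exactly.

```python
import os


def tree_size(root):
    """Sum the apparent size of all files under root, not following symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                continue  # file vanished or is unreadable; skip it
    return total


def top_projects(base, count=20):
    """Return (size_bytes, path) for the `count` largest subdirectories of base.

    The result is sorted ascending, largest last, like a sorted du listing.
    `count` is expected to be a positive integer.
    """
    totals = []
    with os.scandir(base) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                totals.append((tree_size(entry.path), entry.path))
    return sorted(totals)[-count:]
```

Like du, this has to stat every file, so it produces the same kind of I/O load mentioned above.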

Ok, here's what I've got for the top projects:

42G     /srv/tools/shared/tools/project/lbenedix
46G     /srv/tools/shared/tools/project/nyandata
48G     /srv/tools/shared/tools/project/splinetools
49G     /srv/tools/shared/tools/project/render
52G     /srv/tools/shared/tools/project/render-tests
54G     /srv/tools/shared/tools/project/freddy2001
56G     /srv/tools/shared/tools/project/persondata
57G     /srv/tools/shared/tools/project/ifttt
68G     /srv/tools/shared/tools/project/cluebotng
72G     /srv/tools/shared/tools/project/digero
79G     /srv/tools/shared/tools/project/mix-n-match
84G     /srv/tools/shared/tools/project/eatchabot
111G    /srv/tools/shared/tools/project/panoviewer
111G    /srv/tools/shared/tools/project/phetools
112G    /srv/tools/shared/tools/project/templatehoard
114G    /srv/tools/shared/tools/project/wikidata-analysis
180G    /srv/tools/shared/tools/project/zoomviewer
220G    /srv/tools/shared/tools/project/glamtools
524G    /srv/tools/shared/tools/project/iabot

The vast majority of that space in iabot is worker out files. It's just overly verbose logging.

Clearing those alone brought us down to /dev/drbd4 8.0T 6.0T 1.7T 79% /srv/tools

Bstorm claimed this task.

Going to close this just so it isn't open forever. It's already at 80% so a new one will be needed soon.