2021-03-05: tools nfs share cleanup
Closed, Resolved (Public)

Description

In the tradition of T247315, and given that we got paged again, we should clean up to prevent a collapse.

Related wiki runbook: https://wikitech.wikimedia.org/wiki/Portal:Data_Services/Admin/Shared_storage#NFS_volume_cleanup

Event Timeline

dcaro triaged this task as High priority. Mar 5 2021, 9:12 AM
dcaro created this task.

@Zache Hi! As maintainer of the fiwiki-tools project (https://toolsadmin.wikimedia.org/tools/id/fiwiki-tools), can you review and truncate/remove some of the log files in your project?
They are really big and we are running out of space (one in particular is >26 GB).

Thanks!

@Jc86035 Hi! You are listed as maintainer of the archive-things-4 tool (https://toolsadmin.wikimedia.org/tools/id/archive-things-4). We have found that it has a lot of temporary files (~40k) using >100 GB of space. Can you review whether those are needed or can be cleaned up?
We are running out of space.

Thanks!

Mentioned in SAL (#wikimedia-cloud) [2021-03-23T00:09:19Z] <bstorm> truncated the err output file /project/data/suha/stubnat.err because it was exceeding 26GB T276525

Mentioned in SAL (#wikimedia-cloud) [2021-03-26T18:03:40Z] <bstorm> truncated the 27G error.log file T276525

Mentioned in SAL (#wikimedia-cloud) [2021-03-26T18:06:49Z] <bstorm> truncated 20G error log file T276525

NFS space on tools is now at 74% usage. That's a lot healthier. I'll close this for now. Unfortunately, there's still a lot of verbose logging out there.
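
For reference, the kind of cleanup logged above (finding oversized log files and truncating them in place) can be sketched roughly as follows. This is only an illustration and not the procedure from the runbook: the function names `find_large_files` and `truncate_file` and the example path and threshold in the comments are assumptions made up for this sketch.

```python
import os
from pathlib import Path


def find_large_files(root, min_bytes):
    """Return (size, path) pairs for regular files under root that are
    at least min_bytes large, sorted largest first."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                # Skip symlinks and anything that is not a regular file.
                if path.is_symlink() or not path.is_file():
                    continue
                size = path.stat().st_size
            except OSError:
                # The file vanished or is unreadable; ignore it.
                continue
            if size >= min_bytes:
                hits.append((size, path))
    hits.sort(key=lambda item: item[0], reverse=True)
    return hits


def truncate_file(path):
    """Cut a file down to zero bytes in place. The file itself is kept,
    so a tool that still has it open does not lose its log target."""
    os.truncate(path, 0)


# Hypothetical usage (path and 20 GiB threshold are assumptions):
#   for size, path in find_large_files("/some/nfs/share", 20 * 1024**3):
#       print(size, path)
```

Truncating rather than deleting is the approach the log entries above describe. Listing candidates first and truncating as a separate, deliberate step leaves room to check with the tool maintainer before any data is dropped.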