In the tradition of T247315, and given that we got paged again, we should clean up to prevent collapse.
Related wiki runbook: https://wikitech.wikimedia.org/wiki/Portal:Data_Services/Admin/Shared_storage#NFS_volume_cleanup
Status | Subtype | Assigned | Task
---|---|---|---
Resolved | | dcaro | T276525 2021-03-05: tools nfs share cleanup
Resolved | | Cyberpower678 | T278198 Workers/Worker1.out in iabot is huge - please truncate it
Open | | None | T278199 khanamalumat has a job that puts a lot of text in a log file when not doing any changes
Resolved | | Jc86035 | T272434 Toolforge tool 'archive-things-4' using very high disk space
@Zache Hi! As maintainer of the fiwiki-tools project (https://toolsadmin.wikimedia.org/tools/id/fiwiki-tools), can you review and truncate/remove some of the log files in your project?
They are really big and we are running out of space (one in particular is >26 GB).
Thanks!
@Jc86035 Hi! As maintainer of the archive-things-4 tool (https://toolsadmin.wikimedia.org/tools/id/archive-things-4), can you take a look? We found a lot of temporary files (~40k) using >100 GB of space. Can you review whether those are needed or can be cleaned up?
We are running out of space.
Thanks!
Mentioned in SAL (#wikimedia-cloud) [2021-03-23T00:09:19Z] <bstorm> truncated the err output file /project/data/suha/stubnat.err because it was exceeding 26GB T276525
Mentioned in SAL (#wikimedia-cloud) [2021-03-26T18:03:40Z] <bstorm> truncated the 27G error.log file T276525
Mentioned in SAL (#wikimedia-cloud) [2021-03-26T18:06:49Z] <bstorm> truncated 20G error log file T276525
NFS space on tools is now at 74% usage. That's a lot healthier. I'll close this for now. Unfortunately, there's still a lot of verbose logging out there.
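For future rounds, the "find the huge log, then truncate it" step logged above could be sketched roughly as below. This is only an illustration, not the procedure from the runbook: the function names `find_large_files` and `truncate_file` are made up here, and the size threshold and root directory are for the caller to choose.

```python
import os
import stat


def find_large_files(root, min_bytes):
    """Return (size, path) pairs for regular files under root that are
    at least min_bytes in size, largest first.

    Hypothetical helper for illustration; not taken from the runbook.
    """
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat so symlinks are not followed
                st = os.lstat(path)
            except OSError:
                # File vanished or is unreadable; skip it
                continue
            if stat.S_ISREG(st.st_mode) and st.st_size >= min_bytes:
                found.append((st.st_size, path))
    found.sort(reverse=True)
    return found


def truncate_file(path):
    """Cut the file down to zero bytes in place.

    Truncating (rather than deleting) keeps the same inode, so a job that
    still has the log open keeps writing to the same file.
    """
    os.truncate(path, 0)
```

A cautious way to use it would be to print the output of `find_large_files(some_tool_dir, 10 * 1024**3)` first, check with the maintainer, and only then call `truncate_file` on the agreed paths.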