
The /srv volume is full on an-launcher1002
Closed, Resolved (Public)

Assigned To
Authored By
BTullis
Jun 13 2023, 6:53 PM

Description

image.png (365×490 px, 43 KB)

The disk is full on an-launcher1002, in the /srv partition.

Event Timeline

BTullis triaged this task as High priority. Jun 13 2023, 6:54 PM
BTullis moved this task from Incoming to In Progress on the Data-Platform-SRE board.

The vast majority of the space is taken up by /srv/airflow-analytics

image.png (222×579 px, 20 KB)

Within there are 73.9 GB of logs.

image.png (243×583 px, 22 KB)

The recent scheduler logs are each about 1.4 GB in size and don't seem to be rotated very frequently.

image.png (589×491 px, 29 KB)

I'll see what I can delete.

There is a logrotate fragment for airflow_analytics, but I don't think it's catching the files that sit in their own subdirectories, one per name.

# logrotate(8) config for airflow_analytics_clean_logs

/var/log/airflow_analytics_clean_logs/*.log {
    daily
    copytruncate
    missingok
    compress
    delaycompress
    notifempty
    rotate 15
    size 256M
}
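The pattern in that fragment, `/var/log/airflow_analytics_clean_logs/*.log`, is a single-level glob, so it would only match `.log` files directly inside that one directory and nothing in nested subdirectories. A rough way to list the files such a pattern would miss under a given log root is sketched below. The function name `nested_logs` is hypothetical, and whether the Airflow log files actually end in `.log` is an assumption.

```shell
# List *.log files that live at least one directory below the given root,
# i.e. files that a single-level glob like "$root/*.log" cannot match.
nested_logs() {
    find "$1" -mindepth 2 -type f -name '*.log'
}

# Example (path taken from this task):
#   nested_logs /srv/airflow-analytics/logs | head
```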

Mentioned in SAL (#wikimedia-analytics) [2023-06-13T19:03:42Z] <btullis> freeing up space in /srv on an-launcher1002 with btullis@an-launcher1002:/srv/airflow-analytics/logs/scheduler$ find -maxdepth 1 -type d -mtime +15 -print0 | xargs -0 sudo rm -rf for T339002
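Before running a cleanup like the one logged above, it can help to preview what would be removed. This is a sketch of a list-only variant; the function name `list_old_log_dirs` is made up for illustration. It adds `-mindepth 1` so that the starting directory itself can never appear in the results, which the logged command does not guard against.

```shell
# Print (but do not delete) the immediate subdirectories of a directory
# whose modification time is more than N days old (default 15).
list_old_log_dirs() {
    dir="$1"
    days="${2:-15}"
    find "$dir" -mindepth 1 -maxdepth 1 -type d -mtime +"$days" -print
}

# Example (path taken from this task):
#   list_old_log_dirs /srv/airflow-analytics/logs/scheduler 15
```

Once the list looks right, the same `find` expression can be switched to `-print0` and piped to `xargs -0 sudo rm -rf`, as in the logged command.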

We should check log rotation settings on airflow instances, to make sure they are correct and working as we expect.
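One rough check, sketched here on the assumption that the `size 256M` threshold from the fragment above is the intended limit: list any log files that have grown well past that size, since those are presumably not being rotated. The function name `oversized_logs` and the `*.log` filename pattern are illustrative assumptions.

```shell
# List *.log files under a directory that are larger than a threshold in KiB.
# The default of 262144 KiB corresponds to the 256M threshold in the fragment.
oversized_logs() {
    dir="$1"
    kib="${2:-262144}"
    find "$dir" -type f -name '*.log' -size +"$kib"k
}

# Example (path taken from this task):
#   oversized_logs /srv/airflow-analytics/logs
```

An empty result does not prove that rotation is working, but any output points at files worth checking against the logrotate configuration.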