fix up log retention on log collection/storage hosts
Open, High, Public

Description

The following directories on log-collecting hosts need cron jobs to purge old files:

stat1002: /a/log/webrequest/archive (it currently holds files from Feb 1 onward, and I see no purge mechanism in place)
erbium: /a/log/webrequest/archive
oxygen: /a/log/webrequest/archive
fluorine: /a/mw-log/archive
gadolinium: /a/log/webrequest/archive
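
A minimal sketch of what such a purge job could run, assuming age is judged by file modification time. The function name `purge_old_files`, its parameters, and the dry-run default are hypothetical and not taken from any existing changeset; the retention period is a parameter because the description does not fix one.

```python
import time
from pathlib import Path


def purge_old_files(root, max_age_days, dry_run=True, now=None):
    """Return regular files under `root` whose mtime is older than `max_age_days`.

    The files are deleted only when dry_run is False.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    selected = []
    for path in sorted(Path(root).rglob("*")):
        # Only regular files are candidates; symlinks and directories are left alone.
        if path.is_symlink() or not path.is_file():
            continue
        if path.stat().st_mtime < cutoff:
            selected.append(path)
            if not dry_run:
                path.unlink()
    return selected
```

Calling it with `dry_run=True` first gives a list that can be reviewed before anything is removed on a given host.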

On stat1002 the following need to be cleaned up:
/a/squid/stats/dev_engels/csv/*/*/private/*
/a/squid/stats/csv/*/*/private/*
/a/squid/stats_editors/csv/*/*/private/*
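
The wildcard paths above can be expanded for review before cleanup. This is only a sketch: `expand_private_paths` is a hypothetical helper, and it lists files rather than deleting them.

```python
import glob
import os


def expand_private_paths(patterns):
    """Expand shell-style patterns and return the matching regular files, sorted."""
    matches = set()
    for pattern in patterns:
        # glob's "*" does not match names starting with a dot, so hidden
        # files would need a separate pattern.
        for path in glob.glob(pattern):
            if os.path.isfile(path):
                matches.add(path)
    return sorted(matches)
```

Passing the patterns from the list as strings, one per entry, yields the concrete files under each `private` directory.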

On stat1003 the following need to be cleaned up:
/a/zerosms/logs

On both stat1002 and stat1003 we need to ask ezachte about these:
/a/wikistats_git/squids/csv_edits
/a/wikistats_git/squids/csv
/a/wikistats_git/squids/csv_editors

e.g. 2014-05/2014-05-13/private/DebugSquidDataOutDoNotPublish.txt

and also /a/wikistats_git/backup/

Event Timeline

ArielGlenn claimed this task.
ArielGlenn raised the priority of this task to High.
ArielGlenn updated the task description.
ArielGlenn added a project: acl*sre-team.
ArielGlenn added subscribers: ArielGlenn, Ottomata.
Restricted Application added a subscriber: Aklapper. Mar 16 2015, 3:18 PM

I have a changeset for fluorine here:
https://gerrit.wikimedia.org/r/#/c/195917/

ArielGlenn set Security to None. Mar 16 2015, 3:25 PM
ArielGlenn added a subscriber: ezachte.

Erik, I added you so you can comment on the /a/wikistats_git/ files.

Ottomata, what do you think about reducing the retention in these from 180 days to 90?
puppet/templates/udp2log/logrotate_udp2log_analytics.erb
puppet/templates/udp2log/logrotate_udp2log.erb

+1,

but I'm pretty sure the *_analytics.erb one is not used at all.

On oxygen, all logs in /a/log/webrequest/archive/ of the form zero-*.tsv.log-*gz (e.g. zero-digi-malaysia.tsv.log-20130711.gz) will have to be removed manually; logrotate doesn't see those.
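
For the manual removal, the date embedded in the file name can be used instead of mtime. A sketch, assuming every rotated file ends in `-YYYYMMDD.gz` like the example above; `rotated_before` is a hypothetical helper and the cutoff date is up to whoever runs it.

```python
import fnmatch
import re
from datetime import datetime

# Assumed suffix of rotated files, e.g. zero-digi-malaysia.tsv.log-20130711.gz
DATE_SUFFIX = re.compile(r"-(\d{8})\.gz$")


def rotated_before(names, pattern, cutoff):
    """Return names matching `pattern` whose date suffix is earlier than `cutoff`."""
    selected = []
    for name in names:
        if not fnmatch.fnmatchcase(name, pattern):
            continue
        match = DATE_SUFFIX.search(name)
        if match is None:
            continue  # no parseable date suffix: leave for manual review
        try:
            stamp = datetime.strptime(match.group(1), "%Y%m%d").date()
        except ValueError:
            continue  # eight digits that are not a real calendar date
        if stamp < cutoff:
            selected.append(name)
    return selected
```

Called with the pattern `zero-*.tsv.log-*gz` and a `datetime.date` cutoff, it returns only the zero-* archives dated before that day and skips anything it cannot parse.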

On gadolinium, in /a/log/webrequest/archive/, all logs of the form

edits.tsv.log-*
mobile-sampled-100.tsv.log-*
sampled-1000.tsv.log-*
5xx.tsv.log-*

will need to be deleted manually; logrotate won't see those.
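
Before deleting by hand, a per-pattern count and size summary may help with review. This is a sketch only: the patterns are copied from the comment above, while `summarize_matches` and the constant name are hypothetical.

```python
import fnmatch
import os

# Patterns copied from the comment above.
GADOLINIUM_PATTERNS = [
    "edits.tsv.log-*",
    "mobile-sampled-100.tsv.log-*",
    "sampled-1000.tsv.log-*",
    "5xx.tsv.log-*",
]


def summarize_matches(directory, patterns):
    """Map each pattern to (file_count, total_bytes) for regular files directly in `directory`."""
    summary = {pattern: (0, 0) for pattern in patterns}
    with os.scandir(directory) as entries:
        for entry in entries:
            # Skip directories and symlinks; only plain files are counted.
            if not entry.is_file(follow_symlinks=False):
                continue
            for pattern in patterns:
                if fnmatch.fnmatchcase(entry.name, pattern):
                    count, size = summary[pattern]
                    summary[pattern] = (count + 1, size + entry.stat().st_size)
                    break  # count each file under the first pattern it matches
    return summary
```

Running `summarize_matches("/a/log/webrequest/archive", GADOLINIUM_PATTERNS)` on the host would show how much each log stream accounts for without touching any file.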

https://gerrit.wikimedia.org/r/#/c/197081/ covers the logs on stat1002, but maybe it's not needed?

I asked Yurik (adding him) about /a/zerosms/logs. He will be able to clean that up next week; it needs careful review by him.

On oxygen, all logs in /a/log/webrequest/archive/ of the form zero-*.tsv.log-*gz (e.g. zero-digi-malaysia.tsv.log-20130711.gz) will have to be removed manually; logrotate doesn't see those.

I removed files older than 90 days from /a/log/webrequest/archive on gadolinium. Why won't logrotate see these, though?

Dzahn added a parent task: Restricted Task. Jun 21 2016, 5:35 AM
Dzahn added a subscriber: Dzahn. Jun 21 2016, 5:38 AM

See also T87792, T84618, and T114395.

I believe we are in a better place nowadays with respect to log retention across the fleet. OK to resolve, or to lower the priority?

Aklapper removed ArielGlenn as the assignee of this task. Jun 19 2020, 4:19 PM

This task has been assigned to the same task owner for more than two years. Resetting the task assignee due to inactivity, to decrease task cookie-licking and to get a slightly more realistic overview of plans. Please feel free to assign this task to yourself again if you still realistically work or plan to work on this task. It would be welcome!

For tips on how to manage individual work in Phabricator (noisy notifications, lists of tasks, etc.), see https://phabricator.wikimedia.org/T228575#6237124 for available options.
(For the record, two emails were sent to assignee addresses before resetting assignees. See T228575 for more info and for potential feedback. Thanks!)