
arclamp-log.py prunes data too soon (after 30d instead of 90d)
Open, Needs TriagePublic

Description

Our current retention configuration keeps 336 hourly logs (14 days) and 90 daily logs, as declared in /etc/arclamp-log-excimer.yaml, which is provisioned from Puppet Hiera:

profile::webperf::arclamp::compress_logs_days: 3
profile::webperf::arclamp::retain_hourly_logs_hours: 336
profile::webperf::arclamp::retain_daily_logs_days: 90
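For reference, here is a minimal sketch of what these values should mean in terms of cutoff timestamps. This is not the code in arclamp-log.py; the function name `retention_cutoffs` and the constant names are hypothetical, and only the two numbers are taken from the Hiera keys above.

```python
from datetime import datetime, timedelta

# Values copied from the Hiera keys above; the constant names are hypothetical.
RETAIN_HOURLY_LOGS_HOURS = 336
RETAIN_DAILY_LOGS_DAYS = 90


def retention_cutoffs(now):
    """Return the oldest timestamps that should still be kept at time `now`.

    Illustrative helper only, not a function from arclamp-log.py.
    """
    return {
        "hourly": now - timedelta(hours=RETAIN_HOURLY_LOGS_HOURS),
        "daily": now - timedelta(days=RETAIN_DAILY_LOGS_DAYS),
    }
```

With this reading of the config, a daily file should only become eligible for pruning once it is more than 90 days older than the time of the prune run.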

It appears this retention isn't being applied correctly. For example, looking at one tag (load.php) from one channel (excimer-wall), the oldest SVG is 2023-02-15.excimer-wall.load.svgz and the oldest trace log is 2023-02-15.excimer-wall.load.log.gz, as seen at https://performance.wikimedia.org/arclamp/logs/daily/ and https://performance.wikimedia.org/arclamp/svgs/daily/. This is confirmed by our monitoring (Grafana dashboard: Arc Lamp), which indeed says the oldest file pruned is generally around the 3 week mark.

There seems to be a bug somewhere in arclamp-log.py that is pruning files too early, after only one third of the configured retention has elapsed.
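One way to reproduce the observation is to compute the effective retention from the daily file names themselves. The sketch below assumes the file names start with an ISO date (YYYY-MM-DD), as in the examples above; the helper names `daily_file_date` and `retained_days` are hypothetical and do not come from arclamp-log.py.

```python
from datetime import date


def daily_file_date(filename):
    """Parse the leading YYYY-MM-DD from a daily file name.

    Assumes names like '2023-02-15.excimer-wall.load.log.gz'.
    """
    return date.fromisoformat(filename.split(".", 1)[0])


def retained_days(filenames, today):
    """Return the age in days of the oldest daily file as of `today`."""
    oldest = min(daily_file_date(name) for name in filenames)
    return (today - oldest).days
```

Running `retained_days` over a directory listing and comparing the result with retain_daily_logs_days (90) should show how far short the actual retention falls. The `today` argument is whatever date the listing was taken on.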