Compact Wikimetrics' old report files {dove}
Closed, Resolved · Public

Description

Since April 2015, the Wikimetrics backup has been failing and alerting with errors like:

Error: Either failed to get lock on /data/project/wikimetrics/backup/wikimetrics1/hourly, or tar-ing failed.

The reason is that the public reports folder has grown too large: 5k+ folders, 600k+ files, 2.4GB.
Tar-ing the MySQL and Redis databases takes less than one minute, but tar-ing the public reports folder takes more than 1 hour.
As a result, a new cron job starts before the previous one has finished, and it fails to get the file lock.

The idea is to compact the old report files so that the tar-ing gets faster. I suppose we could replace a report tree such as:

303903
├── 2014-12-17
├── 2014-12-18
├── 2014-12-19
...

├── 2015-04-09
├── 2015-04-10
└── full_report.json

with a single compressed file:

303903.tgz

This would reduce both the size of the public reports folder and the number of files in it.
We could implement this inside the daily_script, so that reports older than N days/months would be compressed.
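
A minimal sketch of that compaction step, for illustration only. It is not the patch that was eventually merged; the function name `compact_old_reports`, the `max_age_days` parameter, and the use of file modification times to decide what counts as "old" are all assumptions made for this example.

```python
import shutil
import tarfile
import time
from pathlib import Path


def compact_old_reports(public_dir, max_age_days=30, now=None):
    """Replace each old report folder with <folder name>.tgz.

    Hypothetical helper: a folder counts as old when none of its files
    was modified within the last `max_age_days` days (an assumption,
    the task only says "older than N days/months").
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    compacted = []
    # sorted() materializes the listing before we start modifying the folder
    for report_dir in sorted(Path(public_dir).iterdir()):
        if not report_dir.is_dir():
            continue  # skip existing archives and stray files
        files = [p for p in report_dir.rglob("*") if p.is_file()]
        newest = max((p.stat().st_mtime for p in files),
                     default=report_dir.stat().st_mtime)
        if newest >= cutoff:
            continue  # still being updated, leave it expanded
        archive = report_dir.parent / (report_dir.name + ".tgz")
        tmp = report_dir.parent / (archive.name + ".tmp")
        # Write to a temporary name first so an interrupted run never
        # leaves a truncated .tgz next to a deleted folder.
        with tarfile.open(tmp, "w:gz") as tar:
            tar.add(report_dir, arcname=report_dir.name)
        tmp.rename(archive)
        shutil.rmtree(report_dir)
        compacted.append(archive)
    return compacted
```

Called once a day with the path of the public reports folder, this would turn a tree like `303903/` into `303903.tgz`, so the hourly backup has far fewer files to walk. Anything that serves the reports would also need to know how to read from the archive, which this sketch does not cover.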

Event Timeline

mforns created this task. Apr 10 2015, 8:24 PM
mforns raised the priority of this task to Needs Triage.
mforns updated the task description.
mforns added a subscriber: mforns.
Restricted Application added a subscriber: Aklapper. Apr 10 2015, 8:24 PM
mforns moved this task from Next Up to In Progress on the Analytics-Kanban board. May 1 2015, 3:39 PM
kevinator renamed this task from Compact Wikimetrics' old report files to Compact Wikimetrics' old report files {dove}. May 1 2015, 5:07 PM
kevinator triaged this task as Normal priority.
kevinator set Security to None.

Change 208637 had a related patch set uploaded (by Mforns):
Reduce size of public reports folder

https://gerrit.wikimedia.org/r/208637

Change 208637 merged by Milimetric:
Reduce size of public reports folder

https://gerrit.wikimedia.org/r/208637

mforns closed this task as Resolved. May 15 2015, 3:43 PM