Right now we keep them forever, and we shouldn't.
Description
Details
Subject | Repo | Branch | Lines +/- |
---|---|---|---|
cleanup misc dumps that aren't stored in per-date urls | operations/puppet | production | +12 -4 |
Event Timeline
This is blocked until Legoktm can archive url counts from the older dumps, for some stats he generates.
Everything is done on my side for https://shorturls.toolforge.org/. Still waiting to hear how long https://github.com/Hydriz/Balchivist/issues/8 is expected to take. If it's going to be a while, I'll set up a manual cronjob in the meantime.
The actual system will take a while but let me try to manually get the old files uploaded first, which should take about 2-3 weeks. Approximately how many old copies of this dump will we be keeping, if we are not keeping all of them here?
Unsure, and it may vary over time. I'm going to arbitrarily say 20 for now; that seems like a lot.
Just an update that I have got the files archived to the Internet Archive: https://archive.org/search.php?query=subject%3A%22shorturls%22%20AND%20subject%3A%22wikimedia%22
The next step for me is probably to get a cronjob running, but I will mainly be focusing on getting the new system up, since it is designed for such cases.
@ArielGlenn You can go ahead and keep the latest 20 dumps (or fewer, depending on your requirements).
Funnily enough we are already configured to keep only 7 shorturl dumps, so it is just "lucky" that the script did not work for the specific directory layout. I'll fix that now though :-)
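For readers following along, the kind of cleanup being discussed (keep only the newest N dumps when they all sit in one flat directory instead of per-date subdirectories) might look roughly like the sketch below. This is not the actual script from operations/puppet. The function names, the `dry_run` flag, and the assumption that each filename embeds an 8-digit YYYYMMDD date are all hypothetical, chosen only for illustration.

```python
import re
from pathlib import Path

# Assumption: each dump filename contains an 8-digit date (YYYYMMDD).
DATE_RE = re.compile(r"(\d{8})")


def stale_files(names, keep):
    """Return the names that fall outside the newest `keep` dated files.

    Names without an 8-digit date are ignored, so they are never deleted.
    """
    dated = []
    for name in names:
        match = DATE_RE.search(name)
        if match:
            dated.append((match.group(1), name))
    # YYYYMMDD strings sort chronologically; newest first.
    dated.sort(reverse=True)
    return [name for _, name in dated[keep:]]


def prune(directory, keep, dry_run=True):
    """List (and, unless dry_run, delete) stale dump files in a flat directory."""
    base = Path(directory)
    names = [p.name for p in base.iterdir() if p.is_file()]
    doomed = stale_files(names, keep)
    if not dry_run:
        for name in doomed:
            (base / name).unlink()
    return doomed
```

Calling `prune(some_dir, keep=7)` with the default `dry_run=True` only reports what would be removed, which is a cheap way to check that the directory layout is being matched at all before anything is deleted.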
Change 619571 had a related patch set uploaded (by ArielGlenn; owner: ArielGlenn):
[operations/puppet@production] cleanup misc dumps that aren't stored in per-date urls
Change 619571 merged by ArielGlenn:
[operations/puppet@production] cleanup misc dumps that aren't stored in per-date urls
The above patch is now deployed; I'll check tomorrow to make sure that the older files are actually cleaned up on the labstore hosts before closing the task.
To-morrow, and to-morrow, and to-morrow,
Creeps in this petty pace from day to day,
with nothing to remind me that I should check those files or close this bug.
So, almost a month later... yep the files are cleaned up, closing this task!