
templatetiger is using 1.2TB out of 8T available Tools
Closed, Resolved (Public)

Description

CC @Kolossos @Bgwhite

Please consider cleaning up large files:

$ find . -size +1024M -printf "%k %p\n" | sort -rn
23668160 public_html/dumps/svwiki-2018-06-20.txt
23645912 public_html/dumps/svwiki-2018-07-01.txt
23642716 public_html/dumps/svwiki-2018-07-20.txt
23635556 public_html/dumps/svwiki-2018-08-01.txt
23611836 public_html/dumps/svwiki-2018-08-20.txt
23552132 public_html/dumps/svwiki-2018-09-01.txt
23510144 public_html/dumps/svwiki-2018-09-20.txt
23499916 public_html/dumps/svwiki-2018-10-01.txt
23478044 public_html/dumps/svwiki-2018-10-20.txt
10375756 public_html/dumps/frwiki-2018-10-01.txt
...
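
For readers of the listing above: the first column is find's `%k` value, which GNU find reports as disk usage in 1 KiB blocks. A minimal sketch of a filter that turns those figures into GiB follows; `kib_to_gib` is a hypothetical helper name, not something that exists in the tool's home directory.

```shell
# Hypothetical helper: reads "SIZE_IN_KIB PATH" lines on stdin, as produced by
# find's -printf "%k %p\n", and prints "SIZE GiB PATH" lines instead.
kib_to_gib() {
    awk '{
        kib = $1                    # size in 1 KiB blocks, as printed by %k
        sub(/^[0-9]+ /, "")         # strip the size column, keep the path as-is
        printf "%.1f GiB %s\n", kib / 1048576, $0
    }'
}
```

It can be placed between `find` and `sort` in the command above, or fed a saved listing. By that conversion each of the svwiki dumps is roughly 22 to 23 GiB.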

Issue previously reported in:

Event Timeline

GTirloni triaged this task as High priority. Oct 31 2018, 6:37 PM
GTirloni created this task.

Additionally, templatetiger is using 271GB in its database.

Please consider truncating/trimming that data as well.

GTirloni renamed this task from "templatetiger is using 911GB out of 8T available Tools" to "templatetiger is using 1.2TB out of 8T available Tools". Nov 27 2018, 1:15 PM

@Kolossos It's been a month and we haven't been able to get in touch with you (your email at toolserver.org is bouncing and we haven't been lucky with your talk page -- please keep your contact information up to date, if possible).

If you see this message, please help us find a permanent solution to this problem. Templatetiger is using 1.2 terabytes of data, which is far too much for a single tool, and this has been a recurring problem. Toolforge is a shared platform, and we would like to ensure it's available for all tool authors.

If you aren't able to look into the disk space issue yourself, are you okay with us setting up a cron job to delete files older than 30 days in the ~/public_html/dumps directory?

Finally, if you don't have the time to work on this tool anymore, could you let us know what your future plans for it are?

Mentioned in SAL (#wikimedia-cloud) [2018-12-01T02:08:24Z] <gtirloni> temporarily stopped tool (T208456)

@Kolossos please get back to us as soon as you can. I've kept the tool running, since stopping it was deemed too heavy-handed. I'm sorry.

GTirloni changed the task status from Open to Stalled. Dec 1 2018, 2:23 AM
GTirloni removed a subscriber: GTirloni. Dec 1 2018, 11:00 AM
Kolossos added a subscriber: GTirloni. Edited Dec 1 2018, 8:44 PM

Clean up done.

@Bgwhite : Could you please stop filling this directory until we have found a solution for T184126?

If we, together with WMF, are not able to solve T184126, the tool is in my eyes dead, and I would only keep it running with old data for some time.

@GTirloni : A cron job to delete old files would be fine with me.

@Kolossos I've added a script cleanup.sh to the templatetiger home dir that does the following:

  • Moves files older than 30 days from the public_html/dumps folder to the public_html/dumps_archive folder
  • Deletes files older than 90 days from the public_html/dumps_archive folder

And I've configured it to run on the 1st day of each month at 8AM. Hopefully that should be enough to keep these files under control and the archiving gives you a chance to catch any files that would be deleted but are still needed. If the archiving feature isn't necessary, let me know and I can remove it from the script.
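
For anyone wanting to set up something similar for another tool, the two steps described above could be sketched roughly as follows. This is not the actual cleanup.sh (its contents are not shown in this task); the function name `cleanup_dumps`, its arguments, and the use of file modification time with GNU find are all assumptions.

```shell
#!/bin/sh
# Hypothetical sketch of a dumps cleanup script, NOT the real cleanup.sh.
# Assumes GNU find and that "older than N days" means last modified N+ days ago.
# A schedule of "1st day of each month at 8AM" corresponds to the
# crontab time fields: 0 8 1 * *
#
# Usage: cleanup_dumps [DUMPS_DIR] [ARCHIVE_DIR]
cleanup_dumps() {
    dumps_dir="${1:-$HOME/public_html/dumps}"
    archive_dir="${2:-$HOME/public_html/dumps_archive}"

    # Nothing to do if the dumps directory is missing.
    [ -d "$dumps_dir" ] || return 1
    mkdir -p "$archive_dir"

    # Step 1: delete archived files last modified more than 90 days ago.
    find "$archive_dir" -maxdepth 1 -type f -mtime +90 -exec rm -f {} +

    # Step 2: move dump files last modified more than 30 days ago
    # into the archive directory (mv keeps their modification time).
    find "$dumps_dir" -maxdepth 1 -type f -mtime +30 -exec mv {} "$archive_dir"/ \;
}
```

In this sketch the delete pass runs before the move pass, so a file that has just been archived always survives until at least the next run, even if it is already more than 90 days old. Replacing `-exec ...` with `-print` gives a dry run that only lists what would be affected.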

Thanks for getting back to us and sorry for all the nagging. Sometimes we can be a bit annoying when trying to keep Toolforge tidy :)

Templatetiger is now using 93GB.

GTirloni closed this task as Resolved. Dec 2 2018, 7:53 PM

@GTirloni : Thanks for the cleanup script.