toolserver-home-archive is using 52G on Tools
Closed, Declined (Public)

Description

This is largely due to:

-rw-r--r-- 1 tools.toolserver-home-archive tools.toolserver-home-archive 55744484648 Jun 11 2014 archive-2014-06-05.tar.xz

which iiuc is a dump of the old Toolserver home directories. It doesn't seem to have been touched or recovered since then, and is now nearly 2 years stale.

Can this be removed or otherwise moved to some offline storage?

Event Timeline

@Nemo_bis ping!

Any objections to removing this old archive?

What do you mean by offline storage, why would that be a benefit and what makes you think this is not used?

Storage somewhere that isn't actively using up NFS share space, or really anywhere anyone wants to keep their own copy, I imagine. NFS space is costly: we back it up, we replicate it for HA, etc. I don't know of anyone who has used it, needs to use it, or has expressed interest in using it, and it is nearly 2 years old. I will send a note to labs-l seeking interested parties.

The point of having the archive there is that people usually don't know far in advance when they will need something from the archive. If the file is available, they can just extract what they need whenever they need it; if it's not available, they have to wait a few days until the archive is downloaded from archive.org.

The file can be moved to /data/scratch of course, but I notice that files tend to be deleted there without reason.

This file might actually have Toolserver data that was considered lost. I'm downloading it over the next 6 days (ISP caps), but I'm out of the country for the next two weeks, so I can't look at it until then.

Thanks, Dispenser. I'd welcome a repackaged (thinner) version of the archive; surely a big portion of it is non-code. At the time I only managed to remove some 10 GB worth of templatetiger: https://archive.org/download/wikimedia-toolserver-home-2014-06-05

Examples of big files inside:

tools.toolserver-home-archive@tools-bastion-03:~$ tar tvaf archive-2014-06-05.tar.xz | grep -E " [0-9]{8,} 20"
-rw-r--r-- ant/users       11405246 2012-10-29 18:45 home/ant/public_html/wprn/htdocs/test.html
-rw-r--r-- apper/users     24011147 2009-11-19 23:36 home/apper/public_html/pd/PeEnDe.zip
-rw-r--r-- apper/users     78121742 2013-07-25 12:59 home/apper/public_html/pd/dump/pd.sql
-rw-r--r-- apper/users     74365290 2014-06-04 08:01 home/apper/public_html/pd/misc/pd_dump.txt
-rw-r--r-- apper/users     25033270 2011-09-06 17:49 home/apper/public_html/dpa/wartung/meldungen/201005iptc.tgz
-rwxr-xr-x bryan/users     19306394 2007-02-24 19:45 home/bryan/public_html/poty2006/images/Orion_Nebula_-_Hubble_2006_mosaic_18000.jpg
-rwxr-xr-x bryan/users     11902536 2007-02-24 20:15 home/bryan/public_html/poty2006/images/Venice_Lagoon_December_9_2001.jpg
-rwxr-xr-x bryan/users     20507980 2007-02-24 20:12 home/bryan/public_html/poty2006/images/World_Map_1689.JPG
-rw-r--r-- bryan/users     30666977 2008-03-23 20:00 home/bryan/public_html/stats/search/pagecounts-20080323-200000.gz
-rw-r--r-- bryan/users     21739974 2007-08-11 21:16 home/bryan/public_html/stuff/users_without_page.txt.bz2
-rw-r--r-- dab/users        165241125 2014-02-26 13:10 home/dab/public_html/gnu-pg/_wm-general.gpg.asc
-rw-r--r-- dab/users         74878015 2014-03-26 14:31 home/dab/public_html/gnu-pg/wpde.gpg.asc
-rw-r--r-- dab/users         17696366 2014-02-26 13:10 home/dab/public_html/gnu-pg/wpcommons.gpg.asc
-rw-r--r-- dab/users         72937824 2014-02-26 13:10 home/dab/public_html/gnu-pg/wpen.gpg.asc
-rw-r--r-- daniel/users      83751926 2007-11-10 19:17 home/daniel/public_html/misc/dewiki-page-cat.txt
-rw-r--r-- daniel/users      10190858 2008-03-17 16:13 home/daniel/public_html/misc/nlwiki_pages.txt
-rw-r--r-- daniel/users      21569099 2007-11-09 19:52 home/daniel/public_html/misc/dewiki-pages.txt
-rw-r--r-- daniel/users      31416320 2008-02-12 20:32 home/daniel/public_html/misc/wikidumps.tar
-rw-r--r-- daniel/users      20616578 2008-01-04 11:14 home/daniel/public_html/misc/frwiki-Montagne-dump.xml.gz
[...]
-rw-r--r-- dispenser/users   85265384 2012-05-01 03:16 home/dispenser/public_html/temp/dumps/olddb/jason_backups.sql

I doubt I can easily find something to exclude from the local copy of the archive... I guess I could uncompress the archive and use something like filelight.

Nemo_bis lowered the priority of this task from High to Low. Mar 28 2017, 9:08 PM

"I guess I could uncompress the archive and use something like filelight."

Turns out I don't have enough disk space for this right now. Currently there's nothing to do here AFAICT. If specific measures are suggested, this can be reopened.
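
One possible measure that needs no extra disk space: per-user totals can be estimated from the `tar tv` listing alone, since listing only streams through the archive and writes nothing to disk (it still has to decompress the whole stream, so it will take a while). A minimal sketch, assuming GNU tar's verbose listing format as in the examples above (permissions, owner/group, size, date, time, path) and paths of the form home/<user>/...; the function name `summarize_tar_listing` is made up for illustration:

```shell
# Sum file sizes per home directory from a `tar tv` listing read on stdin.
# Assumption: GNU tar verbose format, i.e. fields are
#   perms owner/group size date time path
summarize_tar_listing() {
    awk '
        # Only regular files (mode string starts with "-").
        $1 ~ /^-/ {
            n = split($6, parts, "/")
            # Only count entries under home/<user>/...
            if (n >= 2 && parts[1] == "home") {
                total[parts[2]] += $3
            }
        }
        END {
            # %.0f avoids integer overflow in awk variants with 32-bit %d.
            for (user in total) {
                printf "%.0f %s\n", total[user], user
            }
        }
    ' | sort -k1,1nr -k2,2
}
```

It could be used like this to show the 20 largest home directories, which might be enough to decide what to exclude from a thinner copy:

```shell
tar tvaf archive-2014-06-05.tar.xz | summarize_tar_listing | head -n 20
```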