bytemark dump mirror index.html file is out of date
Open, Normal, Public

Description

Well this isn't dumps generation, strictly speaking, but anyways...

A user reported on the xmldatadumps-l mailing list that the bytemark mirror is out of date. Upon closer inspection, it turns out that the index.html file has not been updated, though the dumps themselves have been pulled regularly.

The maintainer of the mirror has been pulling from your.org, a fine idea. But the index.html file is perhaps not included in the rsync.

I note that the rsync file lists are synced to our mirrors, so that e.g. https://dumps.wikimedia.your.org/rsync-filelist-last-2-good.txt is available and current. It lists both backup-index.html and backup-index-sorted.html, which should be helpful. Actually, picking up current copies of all *html files in the list would be good, since it's possible that our list of mirrors, our legal notice and so on might be updated from time to time.
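As a rough illustration of picking the *html entries out of such a file list, here is a minimal sketch. It assumes the list is plain text with one path per line. The function name `html_entries` and the sample paths are made up for the example and are not taken from the real list.

```python
from pathlib import PurePosixPath


def html_entries(filelist_text: str) -> list[str]:
    """Return the entries of a one-path-per-line file list that end in .html."""
    entries = []
    for line in filelist_text.splitlines():
        path = line.strip()
        if not path:
            # Skip blank lines.
            continue
        if PurePosixPath(path).suffix == ".html":
            entries.append(path)
    return entries


if __name__ == "__main__":
    # Hypothetical sample; the real list format is assumed, not verified.
    sample = (
        "/backup-index.html\n"
        "/backup-index-sorted.html\n"
        "/somewiki/20190301/somewiki-20190301-pages-articles.xml.bz2\n"
    )
    for entry in html_entries(sample):
        print(entry)
```

The resulting paths could then be handed to whatever the mirror uses to fetch individual files.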

Event Timeline

ArielGlenn triaged this task as Normal priority. Mar 4 2019, 12:35 PM
ArielGlenn created this task.
ArielGlenn added a comment (edited). Mar 4 2019, 12:40 PM

Sample stanza:

[dumpslastthree]
read only = true
# this includes only the last three good dumps.
path = /srv/dumps/xmldatadumps/public
include = /*wik*/
exclude = **tmp/ **temp/ **bad/ **save/ **other/ **archive/ **not/  *.inprog /* /*/ /*/*/
include from = /srv/dumps/xmldatadumps/public/rsync-inc-last-3.txt

That's the relevant part of our rsync config; if you're pulling from somewhere else, odds are they won't have that setup, so you'll have to futz with settings on the client side.
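For the client side, one possible approach is a separate rsync pass that fetches only the top-level HTML files. The sketch below just builds the command line. The helper name `build_html_sync_command` and the source and destination values are hypothetical placeholders, and the real module name on the upstream mirror would need to be checked. It relies on rsync's standard `--include`/`--exclude` filter rules, where the first matching rule wins, so the include has to come before the catch-all exclude.

```python
import shlex


def build_html_sync_command(source: str, dest: str) -> list[str]:
    """Build an rsync command that copies only top-level *.html files."""
    return [
        "rsync",
        "-av",
        # Keep HTML files at the root of the transfer...
        "--include=/*.html",
        # ...and skip everything else, including all subdirectories.
        "--exclude=*",
        source,
        dest,
    ]


if __name__ == "__main__":
    # Placeholder source and destination; substitute the real upstream
    # module and the local mirror path.
    cmd = build_html_sync_command("upstream.example.org::dumps/", "/srv/mirror/")
    print(shlex.join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```

Running this as its own pass leaves the existing dump sync untouched, so the two can be debugged independently.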

ArielGlenn moved this task from Backlog to Active on the Dumps-Generation board. Mar 4 2019, 12:42 PM

Hey @Reedy have you had a chance to look at this yet?

Erm @Reedy? If you can't get to it just now, understandable, maybe you can give an ETA?

Er. @Reedy, got a few minutes to play with this?

Well it's a new month and time for a new ping. @Reedy? :-)

@Reedy I see that not only the index is out of date but that dumps have stopped being mirrored to the site, as of August this year. Might you have a look? E.g. https://wikimedia.bytemark.co.uk/commonswiki/

Thanks!

Reedy added a comment. Mon, Nov 11, 6:48 PM

Server is down due to a RAID controller failure... will see what happens next