
Broken links on dump index page
Closed, Declined (Public)

Description

Author: vasco

Description:
Hi,

I have been trying to download an XML dump of Wikipedia, but the links don't
work. They lead to files that are substantially smaller than expected (in this
case less than 1k). Can someone fix the link, redump the standard XML file, or
point to an older one? (Nothing is available.)

Thanks,

Vasco


Version: unspecified
Severity: normal

Details

Reference
bz9824

Event Timeline

bzimport raised the priority of this task to Medium. Nov 21 2014, 9:42 PM
bzimport set Reference to bz9824.
bzimport added a subscriber: Unknown Object (MLST).

broken.arrow wrote:

Can you provide an example of a bad file? I tried downloading one of the
latest completed dumps from http://download.wikimedia.org/ that is not labeled
as "failed": the size is as expected, the MD5 checksum is correct, and the
file imports fine. This is the file I just tested:

http://download.wikimedia.org/viwiki/20070507/viwiki-20070507-pages-articles.xml.bz2

vasco wrote:

http://download.wikimedia.org/enwiki/20070402/enwiki-20070402-pages-articles.xml.bz2

This is the backup for the English wiki, with current versions of article
content. The file is listed as 2.3GB, but when I download it I get a file with
less than 1k.

Vasco

broken.arrow wrote:

Downloaded the file, verified correct size (2,501,101,055 bytes) and MD5
checksum (2C1BB7C94F4F0695EFE8670940C12D54). Closing for now, feel free to
reopen if you can replicate it from a different environment.
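
For anyone who wants to repeat the size and MD5 check described above on their own copy, here is a minimal sketch in Python. The function names `md5_of_file` and `check_download` and the local file path are assumptions made for illustration; they are not part of any Wikimedia tooling. The expected size and checksum in the usage part are the ones quoted in this thread. The extra test for the `BZh` prefix is an addition that was not mentioned in the thread: every bzip2 file starts with those three bytes, so a failure there suggests the download is not a bzip2 archive at all.

```python
import hashlib
import os

BZ2_MAGIC = b"BZh"  # every bzip2 stream starts with these three bytes


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 of a file, reading in chunks to keep memory use low."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_download(path, expected_size, expected_md5):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    actual_size = os.path.getsize(path)
    if actual_size != expected_size:
        problems.append(
            f"size mismatch: got {actual_size}, expected {expected_size}"
        )
    with open(path, "rb") as handle:
        if handle.read(3) != BZ2_MAGIC:
            problems.append("file does not start with the bzip2 magic bytes")
    # Compare case-insensitively, since checksums are published in either case.
    if md5_of_file(path).lower() != expected_md5.lower():
        problems.append("MD5 mismatch")
    return problems


if __name__ == "__main__":
    # Hypothetical local path; size and checksum are the values quoted above.
    local_path = "enwiki-20070402-pages-articles.xml.bz2"
    if os.path.exists(local_path):
        for problem in check_download(
            local_path, 2501101055, "2C1BB7C94F4F0695EFE8670940C12D54"
        ):
            print(problem)
```

A download of less than 1k would fail the size check immediately, and opening such a small file in a text editor may show what the server actually returned.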