
More optimal download of large files / zsync / rsync
Closed, DeclinedPublic

Description

Download files are large, we all know it. Downloading them repeatedly sucks
[bandwidth].

First I wanted to ask whether you would enable rsync. It is the
most widespread way to save bits on the wire. It could help, especially if you
compress files with 'gzip --rsyncable' (which seems to be supported by newer
gzips, at least on Debian). It saves a lot; the average saving is said to be at
least 10%.

The problem is that rsync requires resources on the server side, mainly for
creating the delta data, and maybe download.wikimedia.org doesn't have
resources to spare. (The larger the file, the more resources it consumes, I
believe.)

Then it struck me that I've seen something with the gains of rsync without its
problems: zsync. Google for zsync and feel lucky :). It is part of Debian, and
maybe other distros.

Features:

  • _no_ server or shell required
  • uses statically generated delta data, which resides in a single, plain file
  • works on any HTTP/1.1 webserver (uses partial transfers)
  • sexy (e.g. saves plenty of bandwidth)
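
As a rough illustration of how statically generated checksum data plus HTTP/1.1 partial transfers can stand in for a server-side delta, here is a minimal Python sketch. It is not zsync's actual algorithm or file format: the block size, the SHA-256 checksum, and all function names are assumptions made for the example, and it only matches blocks at aligned offsets, whereas real tools also find blocks that have shifted position.

```python
import hashlib

BLOCK_SIZE = 4  # tiny on purpose so the example is readable; an assumption, not zsync's value


def block_checksums(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Checksum every fixed-size block of data.

    Computed once for the remote file, this list plays the role of the
    static index that sits next to the download.
    """
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def ranges_to_fetch(local: bytes, remote_index: list[str], remote_size: int,
                    block_size: int = BLOCK_SIZE) -> list[tuple[int, int]]:
    """Return inclusive byte ranges of the remote file missing from the local copy."""
    have = set(block_checksums(local, block_size))
    ranges: list[tuple[int, int]] = []
    for n, checksum in enumerate(remote_index):
        if checksum in have:
            continue  # this block already exists somewhere in the local file
        start = n * block_size
        end = min(start + block_size, remote_size) - 1  # clamp the last, short block
        if ranges and ranges[-1][1] + 1 == start:
            ranges[-1] = (ranges[-1][0], end)  # merge with the adjacent range
        else:
            ranges.append((start, end))
    return ranges


def range_header(ranges: list[tuple[int, int]]) -> str:
    """Format the ranges as the value of an HTTP/1.1 Range request header."""
    return "bytes=" + ",".join(f"{a}-{b}" for a, b in ranges)
```

With an old copy `b"aaaabbbbccccdddd"` and a new file `b"aaaaXXXXccccdddd"`, only the second block differs, so the client would ask the web server for that one range and reuse the rest from disk.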

Basically, if you would be so mighty kind as to support zsync, the only thing
you would have to do is generate a .zsync file for each downloadable file
(zsyncmake -u http://download.wikimedia.org/wikipedia/hu/cur_table.sql.gz
cur_table.sql.gz) and put it next to that file. That's all; the rest is the
clients' business.

.zsync files seem to be about 1/300 of the size of the original, and generating
one took 5 seconds for a 100 MB file on my average machine.

Pretty Please with Sugar on the Top, Cream and Cherries? (<-- quote from The
Monkey Island :))

Thanks.
[[:hu:user:grin]]


Version: unspecified
Severity: enhancement

Details

Reference
bz2923

Event Timeline

bzimport raised the priority of this task to Lowest. Nov 21 2014, 8:42 PM
bzimport set Reference to bz2923.
bzimport added a subscriber: Unknown Object (MLST).

Since the biggest of our dump files are compressed with things much more serious than gzip and aren't friendly to rsync/zsync, gonna WONTFIX this. :(

Changing all WONTFIX high priority bugs to lowest priority (no mail should be generated since I turned it off for this.)