Page MenuHomePhabricator

Framework to transfer files over the LAN
Closed, ResolvedPublic

Description

Jaime already started to work on a framework to transfer files (in the DB context: to clone databases):

https://gerrit.wikimedia.org/r/#/c/326155/

The transfer method must:

* Be as fast as the network bandwidth allows it, with
  configurable throttling
* Be easy to use (take automatic decisions when safe)
* Allow for both single files and entire directories to be
  synced. The directories can be of thousands of small files
* Keep the original permissions and ownership
* Have in-place checks to avoid doing damaging stuff
* Allow encryption
* Allow configurable compression
* Allow configurable resource taking (e.g. number of CPUs)
* Checksum contents before and after the copy to check it has been
  done successfuly
* Allow multicast-like transfers from 1 server to many
* Report the status at any time, and if it fails, why
* Handle the firewall automatically
* Not require a constantly open port or service

The current code is just the barebones, it has to be integrated
with volan's packages for remote code execution.

Because the above, rsync is not enough. We have to give a look to
multi-thread FTP, tar + socat with user encryption, and bittorent.

Event Timeline

@Rduran Do you think you can take care of this? There is a prototype at https://gerrit.wikimedia.org/r/326155 but all the other Remote Calling methods should be dropped and use instead cumin ( https://wikitech.wikimedia.org/wiki/Cumin ). Sadly, Cumin is python2 only for now.

@Vgutierrez suggested using https://github.com/vstakhov/hpenc , which I don't think is a bad idea at all- it would just change some of the executions of openssl and netcat to that tool, but the tools probably needs packaging and setup?

The recommended cipher, which is an easier change, is chacha20 or, alternatively, AES-GCM rather than the randomly selected one on the commit.

hpenc looks interesting, so maybe we can keep it in mind for future improvements.

Yes, I was not suggesting to do it now, just document the suggestion for the future- or maybe they can even set it up for us in parallel. Changing the algorithm, assuming openssl on stretch supports it, though, should be a 10 character patch.

in stretch chacha20 is available as "chacha20" and in jessie as "chacha20-poly1305", BTW for big enough block size (16384 bytes), chacha20 performs better than rc4 on one core :)

Thank you both! I'm using "chacha20" right now and it seems to work just fine (I'm using buster, but stretch is also on 1.1.0). Does jessie need to be supported too?

No, stick to stretch, that is ok- that is the target.

Change 432569 had a related patch set uploaded (by Rduran; owner: Rduran):
[operations/puppet@production] [WIP] Refactor code in transfer.py

https://gerrit.wikimedia.org/r/432569

Change 433558 had a related patch set uploaded (by Rduran; owner: Rduran):
[operations/software/wmfmariadbpy@master] [WIP] Refactor code in transfer.py

https://gerrit.wikimedia.org/r/433558

Change 432569 abandoned by Rduran:
[WIP] Refactor code in transfer.py

Reason:
Moved to operations/software/wmfmariadbpy

https://gerrit.wikimedia.org/r/432569

Change 433558 merged by Jcrespo:
[operations/software/wmfmariadbpy@master] [WIP] Refactor code in transfer.py

https://gerrit.wikimedia.org/r/433558

Change 446871 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/software/wmfmariadbpy@master] transfer.py: Make checksum optional

https://gerrit.wikimedia.org/r/446871

Change 446871 merged by Jcrespo:
[operations/software/wmfmariadbpy@master] transfer.py: Make checksum optional

https://gerrit.wikimedia.org/r/446871

@jcrespo how do you feel about closing this task as resolved?

The original scope isn't met by far:

  • No throttling except it is easy to implement with pv
  • It is not intelligent
  • Compression is only on/off, not configurable
  • CPU resources is not configurable
  • Checksum works but it is very slow (serial execution before and after)
  • No multicast
  • No state reporting

Additionally, I would like to see:

  • More work towards mysql provisioning
  • Configurable compression and decompression on both ends (transmit a packaged file or create one)

We can create a separate ticket for mysql provisioning automation, I think, as part of the binary backups tasks, to do this.

transfer.py was modified to add hot mysql backup taking and compression/decompression handling for provisioning.

It is still a bit of a clunky mess, and it would be nice to be adopted by someone else to maintain the basic transfer functionality better.

Change 608053 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] mariadb-backups: Move transferpy deployment to debian package

https://gerrit.wikimedia.org/r/608053

Change 608053 merged by Jcrespo:
[operations/puppet@production] mariadb-backups: Move transferpy deployment to debian package

https://gerrit.wikimedia.org/r/c/operations/puppet/ /608053

BBlack subscribed.

The swap of Traffic for Traffic-Icebox in this ticket's set of tags was based on a bulk action for all such tickets that haven't been updated in 6 months or more. This does not imply any human judgement about the validity or importance of the task, and is simply the first step in a larger task cleanup effort. Further manual triage and/or requests for updates will happen this month for all such tickets. For more detail, have a look at the extended explanation on the main page of Traffic-Icebox . Thank you!

@jcrespo thoughts on closing this? I don't think there's much point on keeping this open, especially as generic as it is now.

jcrespo claimed this task.

Yeah, there are already tickets open for the pending issues (logs, arguments, etc.).

Adding Arnaud so he can add this ticket to his reading list, for context of previous conversations about transference tooling needs- but requiring nothing else.