Page MenuHomePhabricator

Refactor transfer.py
Closed, ResolvedPublic

Description

The transfer.py file which is used for database backup and recovery has become a bit lengthy to read and understand. This ticket is to discuss the possibility of modularisation. Modularising it to multiple logical fragments could reduce the length of the file.

Event Timeline

I think a couple of ideas to start about this:

  • Move open_firewall and close_firewall to its own Firewall class with open and close methods
  • Move start_slave and stop_slave to its own MariaDB class with start_replication and stop replication methods

I think transfering, handling mariadb and handling the firewall as logically separate units that should be also separate on its own files. That will make the files smaller instead of a monolithic unit. What do you think?

For longer term, we should move transfer.py and all its dependencies to its own separate directory, outside of wmfmariadpy.

@Privacybatm: Where / in which repository is that file located? Context welcome in task descriptions. :)

@Privacybatm: Where / in which repository is that file located? Context welcome in task descriptions. :)

@Aklapper Thank you for pointing that out. I will keep that in mind in the future bug reports :)

I think transfering, handling mariadb and handling the firewall as logically separate units that should be also separate on its own files. That will make the files smaller instead of a monolithic unit. What do you think?

It makes perfect sense to me.

Should we consider further modularisation by moving the commands to another module (like diskaRelatedCommands, compressionRelatedCommands ..etc?), will that be overdoing?

Anyway, let me start with the idea of 3 modules transfer, firewallHandler and mariadbHandler.

Change 595909 had a related patch set uploaded (by Privacybatm; owner: Privacybatm):
[operations/software/wmfmariadbpy@master] transfer.py: Move MariaDB and Firewall logic to its new handler modules

https://gerrit.wikimedia.org/r/595909

Change 595909 merged by Jcrespo:
[operations/software/wmfmariadbpy@master] transfer.py: Move Transferer, MariaDB logic and Firewall logic to its new module files

https://gerrit.wikimedia.org/r/595909

jcrespo triaged this task as Medium priority.May 14 2020, 9:50 AM

Great job here! I looked at every line of the change, and tested it on several runs and it worked nicely. This change, I think, will make further development much easier. You did a lot of work on refactoring- I liked the way you resolved the dependency inversion on the subclasses. I merged as is.

There are some things that I would like you to address before fully closing this ticket:

  • There were non-explicit dependencies still on wmfmariadbpy that should be moved under the RemoteExecution directory:
ParamikoExecution.py
SSHExecution.py
SaltExecution.py
LocalExecution.py

This are all unsupported methods (alternative implementations of the parent "RemoteExecution"), but that in theory work, and could be useful for local testing beyond the one we are interested, which is Cumin. Even if we don't work on them or some don't work well, we should keep them as part of the transfer.py framework. :-D

  • The other thing is that I think the Firewall class should store as an instance property, at the very least the target_host on the constructor. While there could be an argument to keep the port as a parameters so we can open and close several of them with a single instance, we should not be able to close ports that we hadn't open. I think you can keep the port as a parameters, but store the target host on creation- a new firewall instance should be needed for each host managed, so some kind of "state" is kept. We could think of changing the name if you don't like it, or setup a destructor for it, but I don't think that is necessary at the time.

Good job, we are almost ready to close your first ticket :-D.

Let me know if you agree with these suggestions, or have alternative suggestions- I am very open to other options, nothing is set in stone :-D.

There are some things that I would like you to address before fully closing this ticket:

  • There were non-explicit dependencies still on wmfmariadbpy that should be moved under the RemoteExecution directory:
ParamikoExecution.py
SSHExecution.py
SaltExecution.py
LocalExecution.py

I will copy this files to the RemoteExecution directory.

  • The other thing is that I think the Firewall class should store as an instance property, at the very least the target_host on the constructor. While there could be an argument to keep the port as a parameters so we can open and close several of them with a single instance, we should not be able to close ports that we hadn't open. I think you can keep the port as a parameters, but store the target host on creation- a new firewall instance should be needed for each host managed, so some kind of "state" is kept. We could think of changing the name if you don't like it, or setup a destructor for it, but I don't think that is necessary at the time.

I don't know I understood this correctly.
I understand it like store a target_host and a list of ports open (as and when called with ports) in the firewall class and close all the ports when the close() function get called. Also, delete the list at destructor. Am I right?

Good job, we are almost ready to close your first ticket :-D.

Thank you :-)

I don't know I understood this correctly.

Basically, I suggest to modify:

def __init__(self, remote_execution)                   ~~~~~>   def __init__(self, target_host, remote_execution): + self.target_host = target_host

def run_command(self, host, command)                   ~~~~~>   def run_command(self, command)
def open(self, source_host, target_host, target_port): ~~~~~>   def open(self, source_host, target_port):

etc, storing target_host in creation, and not being able to select it afterwards.

Create as many Firewall instances as hosts needed.

Change 596464 had a related patch set uploaded (by Privacybatm; owner: Privacybatm):
[operations/software/wmfmariadbpy@master] Firewall.py: Store target_host as an instance property of Firewall object

https://gerrit.wikimedia.org/r/596464

Not directly related to refactoring, but I though this was very interesting for you in general:

I talked to Riccardo (cumin maintainer), and he mention he has plans to fix the configurability of the output at T212783 (The feature requests already exists).

He mentioned that, for now, this is how they suppress output on applications that use cumin, in case that helps for further cases:
https://gerrit.wikimedia.org/r/plugins/gitiles/operations/software/spicerack/+/master/spicerack/remote.py#632

We can use that with a TODO pointing to the task that we will change it once T212783 is resolved.

I would certainly support a ticket regarding improving output messages :-D.

Glad to hear about the cumin ticket :D
I have made a new ticket regarding output message improvement here: T252802.
Thank you for supporting this issue :-)

Change 596464 merged by Jcrespo:
[operations/software/wmfmariadbpy@master] Firewall.py: Store target_host as an instance property of Firewall object

https://gerrit.wikimedia.org/r/596464

Thank you for merging! :-)