Page MenuHomePhabricator

transferpy should not log cumin subcomands as ERRORs on a normal, succesful run
Open, MediumPublic

Description

A number of transferpy's cleanup actions (at least) run commands where the return code isn't important. The impact of this is that even on a successful transfer the log file contains a lot of ERROR and CRITICAL entries which make debugging anything else a problem because of noise - see P44914 for an example.

transferpy should use cumin's ok_codes for commands like these instead.

Thanks :)

Event Timeline

jcrespo changed the task status from Open to In Progress.Mar 21 2023, 8:27 AM
jcrespo claimed this task.

@MatthewVernon Sadly I am unable to help you- transfer.py currently uses the wmfmariadbpy.RemoteExecution library, and that functionality is not currently available there. While I wrote wmfmariadbpy.RemoteExecution abstraction this was taken away from transfer.py/backups maintanance/responsability and integrated into wmfmariadbpy, which I don't maintain anymore (@Kormat does or did). While I think RemoteExecution should be part of transfer.py this was not my decision to make.

While I have the technical ability to do the changes and they are easy, I cannot touch the wmfbackups library. You should talk to the DBAs or our manager to see how to proceed next.

jcrespo changed the task status from In Progress to Open.Mar 21 2023, 9:03 AM
jcrespo removed jcrespo as the assignee of this task.
jcrespo subscribed.

Hi, I was asked to review the coupling between transfer.py and wmfmariadbpy and my decision is: Copy the RemoteExecution pieces that transfer.py needs and simply drop dependency on wmfmariadbpy. I hope that unblocks this work. Thanks and let me know if you need anything more.

Change 972475 had a related patch set uploaded (by Jcrespo; author: Jcrespo):

[operations/software/transferpy@master] RemoteExecution: Restore RemoteExecution class back into transfer.py

https://gerrit.wikimedia.org/r/972475

Change 972729 had a related patch set uploaded (by Jcrespo; author: Jcrespo):

[operations/software/transferpy@master] RemoteExecution: Remove errors from cumin logging from basic tests

https://gerrit.wikimedia.org/r/972729

If you're not interested in the report of the results of the commands you can set the worker.reporter property to the NullReporter (
from cumin.transports.clustershell import NullReporter).
If you're not interested in the progress bars being printed for some commands you can set worker.progress_bars = False.
Here a full example:

>>> import logging
>>>
>>> import cumin
>>>
>>> from cumin import query, transport, transports
>>> from cumin.transports.clustershell import NullReporter
>>>
>>> logging.basicConfig(level=logging.INFO)
>>> config = cumin.Config()
>>>
>>> hosts = query.Query(config).execute('A:cumin')
>>> target = transports.Target(hosts)
>>> worker = transport.Transport.new(config, target)
>>> worker.commands = [transports.Command('test -d /nonexistent')]
>>> worker.handler = 'async'
>>> worker.reporter = NullReporter
>>> worker.progress_bars = False
>>> exit_code = worker.execute()
INFO:cumin.transports.clustershell.ClusterShellWorker:Executing commands [cumin.transports.Command('test -d /nonexistent')] on '2' hosts: cumin2002.codfw.wmnet,cumin1001.eqiad.wmnet
>>> exit_code
2
>>> worker.commands = [transports.Command('test -d /tmp')]
>>> exit_code = worker.execute()
INFO:cumin.transports.clustershell.ClusterShellWorker:Executing commands [cumin.transports.Command('test -d /tmp')] on '2' hosts: cumin2002.codfw.wmnet,cumin1001.eqiad.wmnet
>>> exit_code
0

See also https://doc.wikimedia.org/cumin/master/introduction.html#examples (the last example).

jcrespo renamed this task from transferpy should take advantage of cumin's ok_codes to avoid spurious ERRORs to transferpy should not log cumin subcomands as ERRORs on a normal, succesful run.Nov 8 2023, 3:29 PM

Thank you a lot, worker.reporter was exactly what I think @MatthewVernon wanted (ignoring logging by the executioner and roll our own logging of what was an error), not ok_codes. I've updated the ticket to reflect more accurately the request.

Change 974159 had a related patch set uploaded (by Jcrespo; author: Jcrespo):

[operations/software/transferpy@master] RemoteExecution: Remove cumin logged errors from low level execution

https://gerrit.wikimedia.org/r/974159

jcrespo triaged this task as Medium priority.