Page MenuHomePhabricator

cumin should allow operators to show backtraces on errors
Closed, ResolvedPublicBUG REPORT

Description

Steps to replicate the issue (include links if applicable):

What happens?:

cumin fails with the error Caught AttributeError exception: type object 'GreenSocket' has no attribute 'sendmsg' without any explanation of *where* the error is coming from.

What should have happened instead?:

cumin should provide us a way to show a backtrace so we can figure out what the actual error is.

Software version (on Special:Version page; skip for WMF-hosted wikis like Wikipedia):

found in 4.2.0 and 5.0.0

Other information (browser name/version, screenshots, etc.):

this happens on debian trixie.

i would suggest this patch in gerrit:

@@ -481,9 +481,16 @@ def main(argv=None):
         stderr('Execution interrupted by Ctrl+c/SIGINT/Aborted')
         exit_code = 98
     except Exception as e:  # pylint: disable=broad-except
+        exit_code = 99
         stderr('Caught {name} exception: {msg}'.format(name=e.__class__.__name__, msg=e))
+        if args.trace:
+            logger.warning('starting debugger, type "c" and enter to continue')
+            import pdb
+            import traceback
+
+            traceback.print_exc()
+            pdb.post_mortem()
         logger.exception('Failed to execute')
-        exit_code = 99
 
     return exit_code

but (a) the cumin contribution docs say we should open an issue here and (b) i don't remember how do actually send patches in gerrit (but i'll learn if i get a go here).

Details

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript

Note that handling the exception itself is another question: the backend importer already has code to ignore import errors, but in this case the exception raised is not an ImportError, it's an AttributeError. maaaybe the code in grammar.py (_import_backend) could be changed to handle generic Exceptions instead... but i'm not sure that should be filed as a bug...

@TheAnarcat I agree the way the error was hidden is not ideal, but the stacktrace should have been logged into the cumin's logfile.
Was this not the case?

As for the error itself I'm still unable to reproduce it. The error itself seems to come from python3-eventlet that defines GreenSocket and declares it 100% compatible with socket.socket. And it's triggered when performing import trio of python3-trio.

A clean installation of cumin even with the suggested packages for Openstack support doesn't seem to include any of those dependencies, so I'm wondering how you're ending up with them.

I even tried to repro the error manually installing python3-eventlet and python3-trio with no luck on a docker instance and a Debian bookworm VM.
Could you share a bit more of the setup that allows you to reproduce the bug? In particular what installs trio and eventlet.
Could you share a bit more

@TheAnarcat  I agree the way the error was hidden is not ideal, but the stacktrace should have been logged into the cumin's logfile.
Was this not the case?

TIL: there's a logfile?

As for the error itself I'm still unable to reproduce it. The error itself seems to come from `python3-eventlet` that defines `GreenSocket` and declares it 100% compatible with `socket.socket`. And it's triggered when performing `import trio` of `python3-trio`.

A clean installation of cumin even with the suggested packages for Openstack support doesn't seem to include any of those dependencies, so I'm wondering how you're ending up with them.

I installed them for another purpose: openstack-clients, which depends
on python3-keystoneauth1.

I even tried to repro the error manually installing `python3-eventlet` and `python3-trio` with no luck on a docker instance and a Debian bookworm VM. 
Could you share a bit more of the setup that allows you to reproduce the bug? In particular what installs trio and eventlet.

I think I answered that above, but try:

apt install python3-keystoneauth1
python3 -c 'from keystoneauth1 import session'

It should also trigger from cumin at this point.

Could you share a bit more

This was cut off...

@TheAnarcat  I agree the way the error was hidden is not ideal, but the stacktrace should have been logged into the cumin's logfile.
Was this not the case?

TIL: there's a logfile?

Yeah, ofc, see the related documentation, it's also mentioned in the help options:

-d, --debug           Set log level to DEBUG. See also log_file in the configuration. [default: False]
--trace               Set log level to TRACE, a custom logging level intended for development debugging. See also log_file in the configuration. [default:
                      False]

Is also a required config, cumin bails out if not set, so I'm pretty sure you have it :)

I installed them for another purpose: openstack-clients, which depends
on python3-keystoneauth1.

I think I answered that above, but try:

apt install python3-keystoneauth1
python3 -c 'from keystoneauth1 import session'

So it's not that trivial, to repro I've tried on both sid and trixie with:

docker run -it --platform linux/amd64 amd64/debian:sid "/bin/bash"
docker run -it --platform linux/amd64 amd64/debian:trixie "/bin/bash"

The run on both:

apt update
apt dist-upgrade
apt install python3-keystoneauth1

No error so far with:

python3 -c 'from keystoneauth1 import session'

But neither python3-eventlet nor python3-trio were installed at all at this time. Hence I tried to install both of them manually:

apt install python3-eventlet python3-trio

Still no luck. I then installed openstack-clients:

apt install openstack-clients

That brings dependencies for ~1GB of disk space. And now I can repro it on both sid and trixie.

So I think that the main issue is that python3-eventlet declares GreenSocket to be 100% compatible with socket.socket but apparently it isn't.
And it gets triggered only when some additional dependency brought in by openstack-clients makes trio and eventlet hit that code path.

>>   @TheAnarcat  I agree the way the error was hidden is not ideal, but the stacktrace should have been logged into the cumin's logfile.
>>   Was this not the case?
>
> TIL: there's a logfile?

Yeah, ofc, see the related documentation <https://doc.wikimedia.org/cumin/master/configuration.html>, it's also mentioned in the help options:

  -d, --debug           Set log level to DEBUG. See also log_file in the configuration. [default: False]
  --trace               Set log level to TRACE, a custom logging level intended for development debugging. See also log_file in the configuration. [default:
                        False]

Is also a required config, cumin bails out if not set, so I'm pretty sure you have it :)

Ha.

It's been a while since I set that up:

anarcat@angela:~> git blame ~/.config/cumin/config.yaml | grep log_file
10a90b1bb (Antoine Beaupré 2019-05-03 21:19:53 -0400 7) log_file: /dev/null

... oops! :)

Anyways: seems to me it would be nice to also show the dump on the
terminal, as it's not obvious to me the backtrace would logged at all
(nor why it would *only* be logged...)

[...]

So I think that the main issue is that `python3-eventlet` declares `GreenSocket` to be 100% compatible with `socket.socket` but apparently it isn't.
And it gets triggered only when some additional dependency brought in by openstack-clients makes trio and eventlet hit that code path.

Yeap, I think you're right that I misplaced that (debian) bug report,
i'll see if i can move that around, but it's not clear to me which
package to assign this on since, as you say, just eventlet is not
sufficient to trigger it.

I was kind of hoping the openstack maintainers (e.g. zigo) would carry
this forward, as all of those packages are under that same umbrella
anyways...

But that's kind of off-topic here...

Volans triaged this task as Medium priority.Jan 27 2025, 4:41 PM

Now that the original bug is clearly unrelated to cumin, let's go back to the original topic of this task.

I agree for the final exception handling we should show the traceback to the user but I would just print it to stderr. Dropping into a debug console seems a bit too much for a CLI tool that can be used both interactively and not interactively.

If you want to send a patch along those lines it would be appreciated. You can follow the related docs here: https://wikitech.wikimedia.org/wiki/Cumin#Contributing_to_the_code

i tried to push a branch to gerrit but failed:

anarcat@angela:~/d/cumin[128]> git push --set-upstream origin backtrace-T384539
Enumerating objects: 7, done.
Counting objects: 100% (7/7), done.
Delta compression using up to 16 threads
Compressing objects: 100% (4/4), done.
Writing objects: 100% (4/4), 625 bytes | 625.00 KiB/s, done.
Total 4 (delta 3), reused 0 (delta 0), pack-reused 0 (from 0)
remote: Resolving deltas: 100% (3/3)
remote: error: branch refs/heads/backtrace-T384539:
remote: You need 'Create' rights to create new references.
remote: User: anarcat
remote: Contact an administrator to fix the permissions
remote: Processing changes: refs: 1, done    
To ssh://gerrit.wikimedia.org:29418/operations/software/cumin.git
 ! [remote rejected] backtrace-T384539 -> backtrace-T384539 (prohibited by Gerrit: not permitted: create)
error: failed to push some refs to 'ssh://gerrit.wikimedia.org:29418/operations/software/cumin.git'

i tried to do a specific push:

anarcat@angela:~/d/cumin> git push origin HEAD:refs/for/backtrace-T384539
Enumerating objects: 7, done.
Counting objects: 100% (7/7), done.
Delta compression using up to 16 threads
Compressing objects: 100% (4/4), done.
Writing objects: 100% (4/4), 625 bytes | 625.00 KiB/s, done.
Total 4 (delta 3), reused 0 (delta 0), pack-reused 0 (from 0)
remote: Resolving deltas: 100% (3/3)
remote: Processing changes: refs: 1, done    
To ssh://gerrit.wikimedia.org:29418/operations/software/cumin.git
 ! [remote rejected] HEAD -> refs/for/backtrace-T384539 (branch backtrace-T384539 not found)
error: failed to push some refs to 'ssh://gerrit.wikimedia.org:29418/operations/software/cumin.git'

maybe i'm using gerrit wrong.

this is my current patch, FWIW

diff
From bdca62207407e0253ca5adb3cea7dd295b9ed70b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Antoine=20Beaupr=C3=A9?= <anarcat@debian.org>
Date: Mon, 27 Jan 2025 13:58:55 -0500
Subject: [PATCH] dump backtrace on exception, on --trace

Bug: T384539
---
 cumin/cli.py | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/cumin/cli.py b/cumin/cli.py
index d43eade..1750faa 100644
--- a/cumin/cli.py
+++ b/cumin/cli.py
@@ -482,6 +482,10 @@ def main(argv=None):
         exit_code = 98
     except Exception as e:  # pylint: disable=broad-except
         stderr('Caught {name} exception: {msg}'.format(name=e.__class__.__name__, msg=e))
+        if args.trace:
+            import traceback
+
+            traceback.print_exc()
         logger.exception('Failed to execute')
         exit_code = 99
 
-- 
2.45.2

You should not push a branch to Gerrit, I usually use git review but you can also push directly to HEAD:refs/for/backtrace-T384539 (see https://gerrit.wikimedia.org/r/Documentation/user-upload.html#_git_push )

As for the change, you can also get the same result without the need to import traceback, just adding a handler to the logger, given that is the last executed line and cumin exists two lines after. Like:

if args.trace:
    logger.addHandler(logging.lastResort)  # Add a stream handler to log to stderr the exception

Change #1114456 had a related patch set uploaded (by TheAnarcat; author: TheAnarcat):

[operations/software/cumin@master] dump backtrace on exception, on --trace

https://gerrit.wikimedia.org/r/1114456

You should not push a branch to Gerrit, I usually use git review but you can also push directly to HEAD:refs/for/backtrace-T384539 (see https://gerrit.wikimedia.org/r/Documentation/user-upload.html#_git_push )

But I *did* try that, and i read exactly that document. That's the second error in my comment above:

! [remote rejected] HEAD -> refs/for/backtrace-T384539 (branch backtrace-T384539 not found)

Anyways. I had forgotten about git review, it looks like that worked and made https://gerrit.wikimedia.org/r/c/operations/software/cumin/+/1114456

As for the change, you can also get the same result without the need to import traceback, just adding a handler to the logger, given that is the last executed line and cumin exists two lines after. Like:

if args.trace:
    logger.addHandler(logging.lastResort)  # Add a stream handler to log to stderr the exception

I'm not sure I quite get that: isn't that the same as the already existing logging handlers, and why do we go through the logging module here if we are dumping stuff with stderr() above?

Then again, I never quite got the lastResort stuff in the first place, so I feel a little out of my depth there.

All those considered, maybe it's better if someone else authors that rather trivial change so that it's just so the way you want it. There are many different ways to implement this and I don't want to waste anyone's time discussing such specifics. :)

Change #1114456 merged by jenkins-bot:

[operations/software/cumin@master] cli: log an eventual exception to stderr

https://gerrit.wikimedia.org/r/1114456

Volans claimed this task.

This has been released with cumin v5.1.0