Import issue (bug?) on Python 3.4/3.5 + multiprocessing affecting Cumin
Closed, Declined (Public)

Description

When running two instances of cumin in the same process, execute() fails with:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/cumin/grammar.py", line 132, in _import_backend
    keyword = backend.GRAMMAR_PREFIX
AttributeError: module 'cumin.backends.openstack' has no attribute 'GRAMMAR_PREFIX'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./daily_snapshot.py", line 359, in <module>
    main()
  File "./daily_snapshot.py", line 355, in main
    sys.exit(result[max(result, key=lambda key: result[key].get())].get())
  File "./daily_snapshot.py", line 355, in <lambda>
    sys.exit(result[max(result, key=lambda key: result[key].get())].get())
  File "/usr/lib/python3.5/multiprocessing/pool.py", line 608, in get
    raise self._value
  File "/usr/lib/python3.5/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "./daily_snapshot.py", line 334, in run
    result = run_transfer(section, config, port)
  File "./daily_snapshot.py", line 299, in run_transfer
    (returncode, out, err) = execute_remotely(config['destination'], cmd)
  File "./daily_snapshot.py", line 285, in execute_remotely
    result = remote_executor.run(host, local_command)
  File "./daily_snapshot.py", line 110, in run
    hosts = query.Query(self.config).execute(host)
  File "/usr/lib/python3/dist-packages/cumin/query.py", line 35, in __init__
    self.registered_backends = grammar.get_registered_backends(external=external)
  File "/usr/lib/python3/dist-packages/cumin/grammar.py", line 43, in get_registered_backends
    keyword, backend = _import_backend(name, available_backends)
  File "/usr/lib/python3/dist-packages/cumin/grammar.py", line 134, in _import_backend
    raise CuminError('{message}: GRAMMAR_PREFIX module attribute not found'.format(message=message))
cumin.CuminError: Unable to register backend 'openstack' in module 'cumin.backends.openstack': GRAMMAR_PREFIX module attribute not found

You might ask, "why run 2 instances of cumin? Cumin will handle the concurrency for you!", but it is not that simple: I have ongoing tasks that each, at some point, run a single cumin job against a single host, and I cannot synchronize when that will happen (that is why I have a pool of workers, so the jobs get processed as needed).

Maybe this is due to how I use (or misuse) cumin, since it works when I run with parallelism = 1. Still, it surfaces as an exception raised while handling an AttributeError, and my guess is that cumin is trying to manually import a module twice.

Code is at https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/494899/18/modules/profile/files/mariadb/daily_snapshot.py

Should I exec the cumin command line instead to make it work?

Event Timeline

This is quite weird and unfortunately requires some more in-depth analysis.

The openstack backend is an optional backend that works only if all the required dependencies are installed. On prod cumin hosts those dependencies are not installed, so the import fails here [1], the backend is not registered, and cumin continues importing the other backends.
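
The optional-backend behaviour described above can be sketched roughly as follows. This is a minimal illustration and not Cumin's actual code: the function name `register_optional_backends` and the use of `RuntimeError` are assumptions made for the example; only the `GRAMMAR_PREFIX` attribute name comes from the traceback.

```python
import importlib


def register_optional_backends(module_names, attr="GRAMMAR_PREFIX"):
    """Return {prefix: module} for every backend that imports cleanly.

    Hypothetical helper for illustration only; not part of Cumin.
    """
    registered = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Optional backend whose dependencies are missing: skip it.
            continue
        try:
            prefix = getattr(module, attr)
        except AttributeError:
            # The import returned a module, but the module is incomplete.
            raise RuntimeError(
                "Unable to register backend {!r}: {} not found".format(name, attr)
            )
        registered[prefix] = module
    return registered
```

In the failure reported here, the import unexpectedly returns a half-loaded module instead of raising ImportError, so a flow like this one reaches the attribute lookup and fails there.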

I was able to repro the issue with this:

import importlib
from multiprocessing.pool import ThreadPool
from cumin import query  # Without this the issue doesn't repro
# The line above does:
# from cumin.backends import BaseQuery, BaseQueryAggregator, InvalidQueryError
# that are imported from cumin/backends/__init__.py

def f():
    try:
        return importlib.import_module('cumin.backends.openstack')
    except ImportError:
        return False

r = []
a = ThreadPool(3)
for i in range(5):
    r.append(a.apply_async(f))

a.close()
a.join()

for i in r:
    print(i.get())

That prints:

False
<module 'cumin.backends.openstack' from '/usr/lib/python3/dist-packages/cumin/backends/openstack.py'>
<module 'cumin.backends.openstack' from '/usr/lib/python3/dist-packages/cumin/backends/openstack.py'>
False
False

I've also quickly inspected the loaded module and, of course, it is not a fully loaded module, as it cannot import keystoneauth1 (see [2]) because the dependencies are not installed. Apparently, though, the import doesn't raise an exception as it should; if it did, the exception would be caught.
The loaded module has the pp attribute that is imported on line 2, but lacks the rest of the module, including GRAMMAR_PREFIX on line 241, which is where your code fails.
I've also quickly tried adding a call to importlib.invalidate_caches(), but nothing changed.

[1] https://gerrit.wikimedia.org/r/plugins/gitiles/operations/software/cumin/+/refs/heads/master/cumin/grammar.py#121
[2] https://gerrit.wikimedia.org/r/plugins/gitiles/operations/software/cumin/+/refs/heads/master/cumin/backends/openstack.py

OK, I was able to repro this without any cumin involvement. I've created the following structure:

$ tree repro/
repro/
├── fail.py
└── __init__.py

With the following content:

  • __init__.py:
def init():
    pass
  • fail.py:
raise ImportError

And modified the script above to:

import importlib
from multiprocessing.pool import ThreadPool
from repro import init  # without this it doesn't repro

def f():
    try:
        return importlib.import_module('repro.fail')
    except ImportError:
        return False

r = []
a = ThreadPool(3)
for i in range(5):
    r.append(a.apply_async(f))

a.close()
a.join()

for i in r:
    print(i.get())

And I get different results running it multiple times:

$ python3 test.py
False
<module 'repro.fail' from '/home/volans/repro/fail.py'>
<module 'repro.fail' from '/home/volans/repro/fail.py'>
False
False
$ python3 test.py
False
<module 'repro.fail' from '/home/volans/repro/fail.py'>
False
False
False

So, I was able to repro this on Python 3.4 and 3.5, but not on 3.6 and 3.7, where it works like a charm.

@Vgutierrez suggested trying to add a call to importlib.reload() when it half-loads the module.
Magically, that makes it fail, although not with what you would expect (a plain ImportError) but with:

ImportError: module repro.fail not in sys.modules

I can make a patch to Cumin to add the reload as a workaround for Python 3.4/3.5.
Thanks a lot @Vgutierrez for the suggestion!
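
A workaround along these lines could look like the sketch below. It is not the actual Cumin patch: `import_or_none` and `required_attr` are hypothetical names, and the idea of detecting a half-loaded module by checking for an expected attribute (such as GRAMMAR_PREFIX) is an assumption based on the symptoms described above.

```python
import importlib


def import_or_none(name, required_attr):
    """Import `name`, returning None if it fails or is only half-loaded.

    Hypothetical helper for illustration only.
    """
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    if not hasattr(module, required_attr):
        # Possibly a half-loaded module: force a reload so that the
        # underlying import failure surfaces as an ImportError.
        try:
            module = importlib.reload(module)
        except ImportError:
            return None
    # Only hand back modules that really expose the expected attribute.
    return module if hasattr(module, required_attr) else None
```

For example, `import_or_none('cumin.backends.openstack', 'GRAMMAR_PREFIX')` would be expected to return None on a host without the openstack dependencies, whether the first import raises or returns a partial module.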

jcrespo renamed this task from "cumin not thread/multiprocess-safe ?" to "Import issue (bug?) on Python 3.4/3.5 + multiprocessing affecting Cumin". Mar 13 2019, 12:29 PM

So, I tried reporting it upstream by opening https://bugs.python.org/issue36284, but it got closed because 3.4 and 3.5 are security-fix-only at this point.
I'll look into adding a workaround to Cumin itself, but it is not trivial because the reload() interferes a bit with existing things.
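
Another possible direction, sketched here purely as an assumption and not verified on Python 3.4/3.5 in this task, is to serialize the import behind a lock so that no thread can observe a module that another thread is still in the middle of failing to import. The names `_import_lock`, `locked_import` and `run_in_threads` are hypothetical.

```python
import importlib
import threading
from multiprocessing.pool import ThreadPool

_import_lock = threading.Lock()  # hypothetical module-level lock


def locked_import(name):
    """Import `name` while holding the lock; return False on ImportError."""
    with _import_lock:
        try:
            return importlib.import_module(name)
        except ImportError:
            return False


def run_in_threads(name, workers=3, calls=5):
    """Call locked_import(name) `calls` times from a pool of threads."""
    pool = ThreadPool(workers)
    try:
        pending = [pool.apply_async(locked_import, (name,)) for _ in range(calls)]
        return [p.get() for p in pending]
    finally:
        pool.close()
        pool.join()
```

This only helps if every import of the affected module goes through `locked_import`; imports triggered elsewhere (for example by a plain import statement in another thread) would bypass the lock.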

@jcrespo is the use of threads (vs. async or multiprocess) a requirement?

I am open to alternative suggestions that do not crash; please suggest a different model that allows multiple threads or processes to run at the same time while running cumin and does not busy-loop. :-) Processes or threads don't matter in this case, as the high-level script uses no memory and doesn't have to be responsive.

I get lost among the many deprecated parallel processing and concurrency libraries, and the newer async model didn't work for me because it needed 3.6+ (or I may be too stupid to implement it on 3.4/3.5).

I think it may be me after all; look at the documentation at https://docs.python.org/3.4/library/multiprocessing.html

Functionality within this package requires that the __main__ module be importable by the children.
This means that some examples, such as the multiprocessing.pool.Pool examples will not work in the interactive interpreter. For example:

AttributeError: 'module' object has no attribute 'f'

I migrated to processes instead of threads to work around the issue. I don't think it is worth keeping this open since, by the time someone else hits the same issue, we will have a more modern Python stack.