Stretch grid problem: Please install packages libmariadbclient-dev-compat and libssl-dev
Closed, Resolved (Public)

Description

While migrating giftbot to the Stretch job grid I ran into this problem:

/data/project/shared/tcl/bin/tclsh8.7:
%package require mysqltcl
couldn't load file "/data/project/shared/tcl/lib/mysqltcl-3.052/libmysqltcl3.052.so": libmysqlclient.so.18: cannot open shared object file: No such file or directory
%package require tls
couldn't load file "/mnt/nfs/labstore-secondary-tools-project/.shared/tcl/lib/tls1.6.4/libtls1.6.4.so": libssl.so.1.0.0: cannot open shared object file: No such file or directory

Those aren't the exact library versions that the requested packages provide, but I will recompile my libraries against them so that everything matches (I already tested this on a Stretch VPS instance).
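One way to check on a given host whether the dynamic loader can resolve the libraries named in the errors above is to try opening them by soname. This is a minimal Python sketch, not something from the original report; the helper name `unloadable_libs` is hypothetical, and the two sonames are simply the ones quoted in the error messages.

```python
import ctypes


def unloadable_libs(sonames):
    """Return the sonames that the dynamic loader cannot open on this host."""
    missing = []
    for name in sonames:
        try:
            # CDLL asks the system loader to open the library by name.
            ctypes.CDLL(name)
        except OSError:
            # Raised when the shared object file cannot be found or loaded.
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Sonames taken from the error messages in the task description.
    for name in unloadable_libs(["libmysqlclient.so.18", "libssl.so.1.0.0"]):
        print("cannot load:", name)
```

Running the same script on a bastion and as a grid job would show whether the two kinds of host differ in which runtime libraries they have.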

Event Timeline

Change 484151 had a related patch set uploaded (by BryanDavis; owner: Bryan Davis):
[operations/puppet@production] toolforge: Add dev packages for giftbot

https://gerrit.wikimedia.org/r/484151

Change 484151 abandoned by BryanDavis:
toolforge: Add dev packages for giftbot

https://gerrit.wikimedia.org/r/484151

@Giftpflanze both of the packages you are looking for seem to be installed on login-stretch.tools.wmflabs.org. We actually have libmariadb-dev-compat installed rather than libmariadbclient-dev-compat, but the installed package seems to be a superset of the other. I'm wondering if the main problem is that we do not have these dev packages installed on the grid so that a compile job can be submitted?

I'm hitting this as well:

Traceback (most recent call last):
  File "/data/project/yifeibot/o/toolserver/bryan/flickr/bots/flickreviewr.py", line 35, in <module>
    from botbase import FlickrBotBase
  File "/mnt/nfs/labstore-secondary-tools-project/yifeibot/o/toolserver/bryan/flickr/bots/botbase.py", line 37, in <module>
    import database
  File "/data/project/yifeibot/o/toolserver/bryan/flickr/shared/database.py", line 1, in <module>
    import MySQLdb
  File "/data/project/yifeibot/.local/local/lib/python2.7/site-packages/MySQLdb/__init__.py", line 19, in <module>
    import _mysql
ImportError: libmariadb.so.2: cannot open shared object file: No such file or directory

This libmariadb.so.2 is provided by libmariadb2, which is pulled in through the dependency chain default-libmysqlclient-dev -> libmariadbclient-dev-compat -> libmariadb-dev-compat -> libmariadb-dev -> libmariadb2. Yes, this error only happens on grid exec nodes; the bastion has the package installed for idek why.
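The dependency chain can be expressed as a small graph to check which top-level packages would bring in libmariadb2. This is only an illustrative Python sketch: the edges are copied from the comment above rather than queried from apt, and the names `DEPENDS` and `pulls_in` are hypothetical.

```python
# Dependency edges as described in the comment above (not queried from apt).
DEPENDS = {
    "default-libmysqlclient-dev": ["libmariadbclient-dev-compat"],
    "libmariadbclient-dev-compat": ["libmariadb-dev-compat"],
    "libmariadb-dev-compat": ["libmariadb-dev"],
    "libmariadb-dev": ["libmariadb2"],
    "libmariadb2": [],
}


def pulls_in(package, target, depends=DEPENDS):
    """Return True if installing `package` transitively installs `target`."""
    seen = set()
    stack = [package]
    while stack:
        current = stack.pop()
        if current == target:
            return True
        if current in seen:
            continue
        seen.add(current)
        # Follow the direct dependencies of the current package.
        stack.extend(depends.get(current, []))
    return False


if __name__ == "__main__":
    for pkg in DEPENDS:
        print(pkg, "->", pulls_in(pkg, "libmariadb2"))
```

Under these assumed edges, any of the "-dev" packages in the chain would provide the runtime library as a side effect, which would explain why it is present only where the dev packages are installed.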

I'm hitting this as well

Did you rebuild your /data/project/yifeibot/.local/local/lib/python2.7 packages on a Stretch host yet? I'm wondering if that error is just because the library versions have changed or if it is not possible to install the MySQLdb into a virtualenv right now on a Stretch grid host.

the bastion has the package installed for idek why.

In the Puppet configuration for both the old and the new grid deployment, the bastions have a manifest applied that provides "dev" packages. It looks like this may have started as a way to install editors and other interactive CLI packages. However, it also includes a lot of "-dev" packages for libraries, which I think we should at minimum be exposing to the 'task' grid job queue so that programs can be compiled via grid job submission rather than only directly on the bastions. It looks like these '-dev' backwards-compatibility packages for mariadb may need to be on all nodes.

Did you rebuild your /data/project/yifeibot/.local/local/lib/python2.7 packages on a Stretch host yet?

Yes. They are built on the new sgebastion-06 with pip & wheel AFAICT.

dpkg -S points to libmariadb2, and according to apt-cache policy the package is only installed on the bastion.
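The apt-cache policy check could be scripted so that it can be run as a grid job and compared across hosts. This is a hedged Python sketch assuming a Debian-based host and the usual `Installed:` line in `apt-cache policy` output; the function names `installed_version` and `is_installed` are hypothetical.

```python
import subprocess


def installed_version(policy_output):
    """Extract the 'Installed:' field from `apt-cache policy <pkg>` output.

    Returns None when the package is not installed ("(none)") or when
    the field is absent.
    """
    for line in policy_output.splitlines():
        line = line.strip()
        if line.startswith("Installed:"):
            # Split only once so versions with an epoch ("1:2.3-4") survive.
            value = line.split(":", 1)[1].strip()
            return None if value == "(none)" else value
    return None


def is_installed(package):
    """Run `apt-cache policy` for one package and report whether it is installed."""
    result = subprocess.run(
        ["apt-cache", "policy", package],
        capture_output=True, text=True, check=True,
    )
    return installed_version(result.stdout) is not None


if __name__ == "__main__":
    print("libmariadb2 installed:", is_installed("libmariadb2"))
```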

Yes. They are built on the new sgebastion-06 with pip & wheel AFAICT.

Thanks. That helps me reason about this better. I think that we probably need to be installing libmariadb-dev-compat on all grid hosts (bastions, task/continuous queue workers, and webservice queue workers). The "-dev" in the name makes it seem like a compile-time dependency, but it's looking more like a runtime dependency. I need to make time to dig into the libssl one a bit more to figure out if there is a runtime package that is more appropriate for some nodes.

I thought I needed the packages at runtime on the exec nodes. I don't know if anything has changed on them, but I no longer seem to need this, as it just works fine now.

Change 486350 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] toolforge: change dev_environ into shell_environ

https://gerrit.wikimedia.org/r/486350

Change 486350 merged by Bstorm:
[operations/puppet@production] toolforge: change dev_environ into shell_environ

https://gerrit.wikimedia.org/r/486350

@zhuyifei1999 can you re-check your missing libmariadb runtime and let us know if the shuffling of things in our Puppet manifests has fixed the problem?

I think so. My next bigger bot task that uses MySQLdb is at 3:42 UTC so we'll see.

Ok, so now my kubernetes webservice of type tcl doesn't work anymore, it is missing libmariadbclient.so.18.

Ok, so now my kubernetes webservice of type tcl doesn't work anymore, it is missing libmariadbclient.so.18.

That would be a completely separate issue. Nothing that has been done tied to this ticket would have any effect at all on the Kubernetes Docker containers. There was a rebuild of the Kubernetes containers at 2019-01-22T20:21 UTC which may have inadvertently changed this. Could you file a new issue to track this problem?

I think so. My next bigger bot task that uses MySQLdb is at 3:42 UTC so we'll see.

Yep, it works.