
Conda's CPPFLAGS may not be correct when pip installing a package that needs C/C++ compilation
Open, High, Public

Description

Hello everybody,

While working on the DSE hackathon Neural Mashup track (T292306), I hit a problem with conda stacked envs and the Python package magenta. If you create a regular conda stacked env (following https://wikitech.wikimedia.org/wiki/Analytics/Systems/Anaconda) and try to pip install magenta, this is the error that you get:

[..]
    src/rtmidi/RtMidi.cpp:1540:10: fatal error: alsa/asoundlib.h: No such file or directory
     #include <alsa/asoundlib.h>
              ^~~~~~~~~~~~~~~~~~

The magenta upstream docs suggest running apt-get install build-essential libasound2-dev libjack-dev, but that doesn't work with our current conda setup. I also tried the following, with the same error:

conda install -c conda-forge jack alsa-lib

The CPPFLAGS set in my environment are:

CPPFLAGS=-DNDEBUG -D_FORTIFY_SOURCE=2 -O2 -isystem /usr/lib/anaconda-wmf/include
DEBUG_CPPFLAGS=-D_DEBUG -D_FORTIFY_SOURCE=2 -Og -isystem /usr/lib/anaconda-wmf/include

If I understand correctly, the -isystem flag makes the C++ compiler look for header files in /usr/lib/anaconda-wmf/include, so neither system headers (installed via apt) nor conda-installed ones are picked up (the latter get deployed, afaics, into /home/$(whoami)/.conda/envs/YOUR-STACKED-ENV-NAME/include, which is not on the search path). The following hack works, though:

export CPPFLAGS="-DNDEBUG -D_FORTIFY_SOURCE=2 -O2 -isystem /usr/lib/anaconda-wmf/include -isystem /home/$(whoami)/.conda/envs/YOUR-STACKED-ENV-NAME/include" (replace YOUR-STACKED-ENV-NAME)

This is surely a corner case, since magenta requires python-rtmidi, which in turn needs asoundlib.h to compile some C++ files, but I am wondering if we could do something about it in anaconda-wmf. If there is a simpler way, apologies for this long task :)

Event Timeline

Yes, exactly: afaics the CPPFLAGS are set when I activate my stacked conda env. We could try adding:

export CPPFLAGS="${CPPFLAGS} -isystem ${CONDA_PREFIX}/include"

Not sure whether this is too hacky, but in my local tests it works fine.

I think the active env path will be available as the CONDA_PREFIX env var.
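Conda sources every *.sh script under $CONDA_PREFIX/etc/conda/activate.d when an env is activated, so the suggested export naturally fits in such a hook. A minimal sketch (the env path below is a placeholder used only as a fallback; conda sets CONDA_PREFIX itself on activation):

```shell
# Sketch of an activate.d hook. CONDA_PREFIX points at the active env;
# the default here is a hypothetical path, only for running this standalone.
CONDA_PREFIX="${CONDA_PREFIX:-$HOME/.conda/envs/my-stacked-env}"

# Append the env's own include dir so compilers invoked by pip can find
# conda-installed headers (e.g. alsa/asoundlib.h from alsa-lib):
export CPPFLAGS="${CPPFLAGS:-} -isystem ${CONDA_PREFIX}/include"
echo "$CPPFLAGS"
```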

Yep way better!

Change 727352 had a related patch set uploaded (by Elukey; author: Elukey):

[operations/debs/anaconda-wmf@debian] Add extra include search path to {CPP,C,CXX,FORTRAN}FLAGS

https://gerrit.wikimedia.org/r/727352

Change 728557 had a related patch set uploaded (by Elukey; author: Elukey):

[operations/debs/anaconda-wmf@debian] Release 2020.02~wmf6

https://gerrit.wikimedia.org/r/728557

Change 727352 merged by Elukey:

[operations/debs/anaconda-wmf@debian] Add extra include search path to {CPP,C,CXX,FORTRAN}FLAGS

https://gerrit.wikimedia.org/r/727352

Change 728557 merged by Elukey:

[operations/debs/anaconda-wmf@debian] Release 2020.02~wmf6

https://gerrit.wikimedia.org/r/728557

Merged the patches; the next step is to build the new Debian package and install it across our nodes. I can take care of it or leave it to Data Engineering, let me know what you prefer!

We're doing 'offsite' this week, so I don't think we'll get to it soon. Please proceed if you need it!

No real need; I think it is fine to wait if anybody wants to get experience with Debian packaging, etc.

I was wondering if we could also follow up with upstream about this issue; getting it fixed there would be nice, but it doesn't seem easy to figure out where the fix should go :D

odimitrijevic moved this task from Incoming to Data Exploration Tools on the Analytics board.

Once this is installed on the servers, will it automatically take effect within users' environments? Or will users have to do something like restart their Jupyter servers or create new Conda environments?

I believe restarting the Jupyter servers will be necessary, but that should be all.