
`pip install fasttext` fails inside `webservice shell` for lack of RAM
Closed, Resolved, Public

Description

When I follow these instructions (for either Python 3.7 or Python 3.5), I get the error below when I try to install the fastText package. The package requires the "right" C++ compiler: https://github.com/facebookresearch/fastText/blob/master/README.md#requirements. Based on the error output below and some searching, my guess is that the C++ compilation step is what fails (but obviously I could be wrong).

This is for the wiki-topic tool: https://tools.wmflabs.org/admin/tool/wiki-topic

Notably, I did have fastText installed and working in my virtualenv that I set up back in the Fall. I can't be certain anymore what that environment was, but it likely would have been whatever was default back then.

Error output:

(venv) tools.wiki-topic@interactive:~$ pip install fasttext
Collecting fasttext
  Using cached fasttext-0.9.2.tar.gz (68 kB)
Collecting numpy
  Downloading numpy-1.18.4-cp35-cp35m-manylinux1_x86_64.whl (20.0 MB)
     |████████████████████████████████| 20.0 MB 25 kB/s
Requirement already satisfied: pybind11>=2.2 in ./www/python/venv/lib/python3.5/site-packages (from fasttext) (2.5.0)
Requirement already satisfied: setuptools>=0.7.0 in ./www/python/venv/lib/python3.5/site-packages (from fasttext) (33.1.1)
Building wheels for collected packages: fasttext
  Building wheel for fasttext (setup.py) ... error
  ERROR: Command errored out with exit status 1:
   command: /data/project/wiki-topic/www/python/venv/bin/python3 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-rg55f9c8/fasttext/setup.py'"'"'; __file__='"'"'/tmp/pip-install-rg55f9c8/fasttext/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-imv3npxt
       cwd: /tmp/pip-install-rg55f9c8/fasttext/
  Complete output (43 lines):
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build/lib.linux-x86_64-3.5
  creating build/lib.linux-x86_64-3.5/fasttext
  copying python/fasttext_module/fasttext/__init__.py -> build/lib.linux-x86_64-3.5/fasttext
  copying python/fasttext_module/fasttext/FastText.py -> build/lib.linux-x86_64-3.5/fasttext
  creating build/lib.linux-x86_64-3.5/fasttext/util
  copying python/fasttext_module/fasttext/util/util.py -> build/lib.linux-x86_64-3.5/fasttext/util
  copying python/fasttext_module/fasttext/util/__init__.py -> build/lib.linux-x86_64-3.5/fasttext/util
  creating build/lib.linux-x86_64-3.5/fasttext/tests
  copying python/fasttext_module/fasttext/tests/test_script.py -> build/lib.linux-x86_64-3.5/fasttext/tests
  copying python/fasttext_module/fasttext/tests/test_configurations.py -> build/lib.linux-x86_64-3.5/fasttext/tests
  copying python/fasttext_module/fasttext/tests/__init__.py -> build/lib.linux-x86_64-3.5/fasttext/tests
  running build_ext
  creating tmp
  x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fdebug-prefix-map=/build/python3.5-3.5.3=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/data/project/wiki-topic/www/python/venv/include -I/usr/include/python3.5m -c /tmp/tmptzto_aap.cpp -o tmp/tmptzto_aap.o -std=c++14
  cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
  x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fdebug-prefix-map=/build/python3.5-3.5.3=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/data/project/wiki-topic/www/python/venv/include -I/usr/include/python3.5m -c /tmp/tmp1qswjr6z.cpp -o tmp/tmp1qswjr6z.o -fvisibility=hidden
  cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
  building 'fasttext_pybind' extension
  creating build/temp.linux-x86_64-3.5
  creating build/temp.linux-x86_64-3.5/python
  creating build/temp.linux-x86_64-3.5/python/fasttext_module
  creating build/temp.linux-x86_64-3.5/python/fasttext_module/fasttext
  creating build/temp.linux-x86_64-3.5/python/fasttext_module/fasttext/pybind
  creating build/temp.linux-x86_64-3.5/src
  x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fdebug-prefix-map=/build/python3.5-3.5.3=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/data/project/wiki-topic/www/python/venv/lib/python3.5/site-packages/pybind11/include -I/data/project/wiki-topic/www/python/venv/lib/python3.5/site-packages/pybind11/include -Isrc -I/data/project/wiki-topic/www/python/venv/include -I/usr/include/python3.5m -c python/fasttext_module/fasttext/pybind/fasttext_pybind.cc -o build/temp.linux-x86_64-3.5/python/fasttext_module/fasttext/pybind/fasttext_pybind.o -DVERSION_INFO="0.9.2" -std=c++14 -fvisibility=hidden
  cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
  python/fasttext_module/fasttext/pybind/fasttext_pybind.cc: In lambda function:
  python/fasttext_module/fasttext/pybind/fasttext_pybind.cc:345:35: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
               for (int32_t i = 0; i < vocab_freq.size(); i++) {
                                   ~~^~~~~~~~~~~~~~~~~~~
  python/fasttext_module/fasttext/pybind/fasttext_pybind.cc: In lambda function:
  python/fasttext_module/fasttext/pybind/fasttext_pybind.cc:359:35: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
               for (int32_t i = 0; i < labels_freq.size(); i++) {
                                   ~~^~~~~~~~~~~~~~~~~~~~
  x86_64-linux-gnu-gcc: internal compiler error: Killed (program cc1plus)
  Please submit a full bug report,
  with preprocessed source if appropriate.
  See <file:///usr/share/doc/gcc-6/README.Bugs> for instructions.
  error: command 'x86_64-linux-gnu-gcc' failed with exit status 4
  ----------------------------------------
  ERROR: Failed building wheel for fasttext

Event Timeline

Update: I verified that this still works using the old grid-engine process -- specifically, if I don't go into the Kubernetes shell but instead execute the following commands, fasttext properly installs:

isaacj@tools-sgebastion-07:~$ become wiki-topic
tools.wiki-topic@tools-sgebastion-07:~$ cd www/python/
tools.wiki-topic@tools-sgebastion-07:~/www/python$ virtualenv --python=python3 ./venv
Already using interpreter /usr/bin/python3
Using base prefix '/usr'
New python executable in /mnt/nfs/labstore-secondary-tools-project/wiki-topic/www/python/venv/bin/python3
Also creating executable in /mnt/nfs/labstore-secondary-tools-project/wiki-topic/www/python/venv/bin/python
Installing setuptools, pkg_resources, pip, wheel...done.
tools.wiki-topic@tools-sgebastion-07:~/www/python$ source venv/bin/activate
(venv) tools.wiki-topic@tools-sgebastion-07:~/www/python$ pip install fasttext
Collecting fasttext
  Using cached fasttext-0.9.2.tar.gz (68 kB)
Requirement already satisfied: pybind11>=2.2 in ./venv/lib/python3.5/site-packages (from fasttext) (2.5.0)
Requirement already satisfied: setuptools>=0.7.0 in ./venv/lib/python3.5/site-packages (from fasttext) (46.3.0)
Collecting numpy
  Using cached numpy-1.18.4-cp35-cp35m-manylinux1_x86_64.whl (20.0 MB)
Building wheels for collected packages: fasttext
  Building wheel for fasttext (setup.py) ... done
  Created wheel for fasttext: filename=fasttext-0.9.2-cp35-cp35m-linux_x86_64.whl size=3025637 sha256=1963623d7e4f2bb92e3a3b0bbe356ecfd73af9abfe04b513dfffcd24e451bfc2
  Stored in directory: /mnt/nfs/labstore-secondary-tools-project/wiki-topic/.cache/pip/wheels/06/27/f4/1d715a6c4f03222b1f301b54f23465e204f1e3d098864af7d7
Successfully built fasttext
Installing collected packages: numpy, fasttext
Successfully installed fasttext-0.9.2 numpy-1.18.4
(venv) tools.wiki-topic@tools-sgebastion-07:~/www/python$

In an ideal world the upstream maintainers for fasttext would actually be publishing a wheel to avoid everyone having to recompile this, but that's not a problem we can fix in Toolforge. :)

The giant pile of warnings from the compilation makes it difficult to see the real problem, which is:
x86_64-linux-gnu-gcc: internal compiler error: Killed (program cc1plus)

The webservice shell command launches a pod with our default resource limits of 500m CPU (0.5 cores) and 512Mi RAM. Compiling this C++ library fails because the compiler process hits the memory limit and is killed.
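To confirm which memory limit a shell pod actually received, the cgroup limit can be read from inside the container. The following is a minimal sketch rather than anything Toolforge ships: the two file paths are the usual cgroup v1 and cgroup v2 locations and are an assumption about how the container exposes them, and `read_memory_limit_bytes` and `format_mib` are hypothetical helper names.

```python
from pathlib import Path
from typing import Iterable, Optional

# Assumed locations of the memory limit file (cgroup v1, then cgroup v2).
DEFAULT_LIMIT_FILES = (
    "/sys/fs/cgroup/memory/memory.limit_in_bytes",
    "/sys/fs/cgroup/memory.max",
)


def read_memory_limit_bytes(
    candidates: Iterable[str] = DEFAULT_LIMIT_FILES,
) -> Optional[int]:
    """Return the memory limit in bytes from the first readable candidate file.

    Returns None if no file is readable or if the file does not hold a plain
    number (cgroup v2 writes the word "max" when there is no limit).
    """
    for candidate in candidates:
        try:
            text = Path(candidate).read_text().strip()
        except OSError:
            continue  # file missing or unreadable: try the next location
        return int(text) if text.isdigit() else None
    return None


def format_mib(num_bytes: int) -> str:
    """Render a byte count in mebibytes, the unit behind Kubernetes 'Mi'."""
    return f"{num_bytes / (1024 * 1024):.0f}Mi"


if __name__ == "__main__":
    limit = read_memory_limit_bytes()
    print("no limit found" if limit is None else format_mib(limit))
```

With the default limit described above, the file would be expected to contain 536870912 (512 × 1024 × 1024), which this sketch prints as 512Mi. On cgroup v1 an unlimited container reports a very large number instead of "max".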

Sadly, the --mem and --cpu arguments to webservice are not currently being applied to webservice shell commands, but I was able to verify that giving more RAM is a fix with manual use of kubectl:

$ ssh dev.toolforge.org
$ become bd808-test
$ kubectl run interactive --image=docker-registry.tools.wmflabs.org/toolforge-python37-sssd-base:latest --restart=Never --command=true --env=HOME=$HOME --labels='toolforge=tool' --rm=true --stdin=true --tty=true --requests='cpu=1,memory=1Gi' --limits='cpu=1,memory=4Gi' -- /bin/bash -il
$ cd
$ python3 -mvenv T252700
$ T252700/bin/pip3 install -U pip wheel
Cache entry deserialization failed, entry ignored
Collecting pip
  Downloading https://files.pythonhosted.org/packages/54/2e/df11ea7e23e7e761d484ed3740285a34e38548cf2bad2bed3dd5768ec8b9/pip-20.1-py2.py3-none-any.whl (1.5MB)
    100% |████████████████████████████████| 1.5MB 83kB/s
Collecting wheel
  Downloading https://files.pythonhosted.org/packages/8c/23/848298cccf8e40f5bbb59009b32848a4c38f4e7f3364297ab3c3e2e2cd14/wheel-0.34.2-py2.py3-none-any.whl
Installing collected packages: pip, wheel
  Found existing installation: pip 18.1
    Uninstalling pip-18.1:
      Successfully uninstalled pip-18.1
Successfully installed pip-20.1 wheel-0.34.2
$ T252700/bin/pip3 install fasttext
Collecting fasttext
  Downloading fasttext-0.9.2.tar.gz (68 kB)
     |████████████████████████████████| 68 kB 1.7 MB/s
Collecting numpy
  Downloading numpy-1.18.4-cp37-cp37m-manylinux1_x86_64.whl (20.2 MB)
     |████████████████████████████████| 20.2 MB 5.8 kB/s
Requirement already satisfied: pybind11>=2.2 in ./T252700/lib/python3.7/site-packages (from fasttext) (2.5.0)
Requirement already satisfied: setuptools>=0.7.0 in ./T252700/lib/python3.7/site-packages (from fasttext) (40.8.0)
Building wheels for collected packages: fasttext
  Building wheel for fasttext (setup.py) ... done
  Created wheel for fasttext: filename=fasttext-0.9.2-cp37-cp37m-linux_x86_64.whl size=4156029 sha256=303fbcd8a618fd12c924e4be23009cf5b99a72e824a57ccaf9fe9af1d52a49d1
  Stored in directory: /data/project/bd808-test/.cache/pip/wheels/4e/ca/bf/b020d2be95f7641801a6597a29c8f4f19e38f9c02a345bab9b
Successfully built fasttext
Installing collected packages: numpy, fasttext
Successfully installed fasttext-0.9.2 numpy-1.18.4
$ T252700/bin/python3
Python 3.7.3 (default, Dec 20 2019, 18:57:59)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import fasttext
>>> 
$ exit

@Isaac, you can unblock yourself by borrowing that very long kubectl run ... command and using it to launch a python3.7 container to build out your virtual env. The command is roughly equivalent to running webservice --backend=kubernetes --mem 4Gi --cpu 1 python3.7 shell. The difference is really that it actually increases the hard memory limit which doesn't happen at the moment with the webservice command. I will put in a patch to fix that to make this easier for others in the future.

bd808 renamed this task from Error when pip install fasttext library on Toolforge (Kubernetes; Python 3.7) to `pip install fasttext` fails inside `webservice shell` for lack of RAM. May 13 2020, 10:02 PM
bd808 triaged this task as Medium priority.
bd808 updated the task description.

Change 596306 had a related patch set uploaded (by BryanDavis; owner: Bryan Davis):
[operations/software/tools-webservice@master] Apply --mem and --cpu to kubernetes shell pods

https://gerrit.wikimedia.org/r/596306

Change 596306 merged by jenkins-bot:
[operations/software/tools-webservice@master] Apply --mem and --cpu to kubernetes shell pods

https://gerrit.wikimedia.org/r/596306

Thanks! I tried the approach mentioned in T252700#6135264 and it worked like a charm!

@bd808 we can fast-track deploy that tomorrow. I could totally do it, but if you would like to try the new process, I can hold back.

This caused a deployment failure on InteractionTimeline

tools.interaction-timeline@interactive:~/tool/client$ npm run build

> @ build /data/project/interaction-timeline/tool/client
> NODE_ENV=production webpack

clean-webpack-plugin: /data/project/interaction-timeline/tool/html has been removed.
clean-webpack-plugin: 1 file(s) excluded - api
Killed
npm ERR! code ELIFECYCLE
npm ERR! errno 137
npm ERR! @ build: `NODE_ENV=production webpack`
npm ERR! Exit status 137
npm ERR!
npm ERR! Failed at the @ build script.
npm ERR! This is probably not a problem with npm. There is likely additional logging output above.

npm ERR! A complete log of this run can be found in:
npm ERR! /data/project/interaction-timeline/.npm/_logs/2020-05-21T21_35_51_289Z-debug.log
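The "Killed" line and exit status 137 in the log above fit the same pattern as the cc1plus failure: shells conventionally report 128 plus the signal number when a process dies from a signal, and 137 - 128 = 9 is SIGKILL, the signal sent when a process is killed for exceeding a memory limit. A small sketch of that arithmetic follows; `describe_exit_status` is a hypothetical helper, and the log alone does not prove that memory was the cause.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Interpret a shell-style exit status.

    Statuses above 128 conventionally mean "terminated by signal
    (status - 128)"; anything else is an ordinary exit code.
    """
    if status == 0:
        return "success"
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = f"signal {signum}"  # number not known on this platform
        return f"killed by {name}"
    return f"exited with error code {status}"


if __name__ == "__main__":
    for status in (0, 4, 137):
        print(status, "->", describe_exit_status(status))
```

Note that the gcc failure earlier in this task reports exit status 4 from the compiler driver, so there the "Killed (program cc1plus)" message is the clue rather than the exit status.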

Mentioned in SAL (#wikimedia-cloud) [2020-05-21T22:36:21Z] <bd808> Updated tools-webservice to 0.70 across instances (T252700)

Mentioned in SAL (#wikimedia-cloud) [2020-05-21T22:40:32Z] <bd808> Rebuilding all Docker containers for tools-webservice 0.70 (T252700)

@bd808 should I try again?

Sure! I had started to write something here about the current state and then apparently tabbed out and never came back. Webservice 0.70 is on the Toolforge bastions now. It now supports the --mem X argument when running the shell action. The default memory assignment is still low, so you will need to bump it up. Maybe try --mem 2Gi first to see if that is enough for your compile needs.
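One way to choose a value instead of guessing is to run the build once somewhere it succeeds and look at the peak memory of its largest process. The sketch below is an illustration under stated assumptions: `run_and_report_peak_rss` is a hypothetical helper, it relies on Linux reporting `ru_maxrss` in kilobytes, and the figure covers the largest single child that was waited for, not the whole process tree, so it is a lower bound on the limit needed.

```python
import resource
import subprocess
import sys


def run_and_report_peak_rss(argv):
    """Run argv and return (exit status, peak RSS in KiB).

    The peak is taken from RUSAGE_CHILDREN, i.e. the largest waited-for
    child of this process so far (kilobytes on Linux).
    """
    completed = subprocess.run(argv, check=False)
    peak_kib = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return completed.returncode, peak_kib


if __name__ == "__main__":
    # Example: python3 peak_rss.py pip install fasttext
    # With no arguments, a trivial command is run instead.
    command = sys.argv[1:] or [sys.executable, "-c", "pass"]
    status, peak_kib = run_and_report_peak_rss(command)
    print(f"exit status {status}, largest child peak RSS ~{peak_kib / 1024:.0f} MiB")
```

If the reported peak is close to or above the value passed to --mem, the build is likely to be killed again and a larger value is worth trying.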