tools.spiarticleanalyzer: requesting installation of icu on bastion and kubernetes
Closed, Resolved · Public

Description

From T157744#3025326 (how meta!):

PyICU requires the underlying C++ library to work, and this is in the libicu52 apt package on Debian Jessie. When this is installed, PyICU can then be installed in a virtualenv. Alternatively, the python-pyicu apt package can be installed, which installs PyICU and the underlying C++ library systemwide. One of the above packages would need to be installed in the python 2 Docker image to be used in Kubernetes containers.

Original request: icu is used for unicode-string list handling/sorting (http://www.linuxfromscratch.org/blfs/view/cvs/general/icu.html).
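
As a rough illustration of the sorting use case, the sketch below sorts a list of Unicode strings with PyICU's Collator when the icu module can be imported, and falls back to a plain case-insensitive sort otherwise. The function name sort_names, the sample data, and the en_US locale are assumptions for illustration; this is not code from the SPIArticleAnalyzer repository.

```python
def sort_names(names, locale_name="en_US"):
    """Return names sorted with ICU collation if PyICU is available."""
    try:
        import icu  # provided by PyICU, which needs the ICU C++ library
    except ImportError:
        # Fallback (assumption): case-insensitive code point order.
        return sorted(names, key=lambda s: s.casefold())
    collator = icu.Collator.createInstance(icu.Locale(locale_name))
    # getSortKey returns a byte string that compares in collation order.
    return sorted(names, key=collator.getSortKey)


if __name__ == "__main__":
    print(sort_names(["banana", "Apple", "cherry"]))
```

The fallback only approximates ICU ordering; accented or non-Latin names will generally sort differently without the library.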

The discussion has shifted to why the following traceback shows up (despite debugging steps referenced throughout the task):

Traceback (most recent call last):
  File "/data/project/spiarticleanalyzer/www/python/src/app.py", line 25, in <module>
    from getAllUsers import *
  File "./getAllUsers.py", line 8, in <module>
    from getAllUsersHelper import *
  File "./getAllUsersHelper.py", line 3, in <module>
    import icu  # pip install PyICU
ImportError: No module named icu

Event Timeline

  • Are these C libraries, Python packages, or something else? Providing links to the upstream projects would help.
  • What specific runtime environment do you need them in (job grid, kubernetes)?
  • Is this for a project that already works I can view the source of or demo somewhere or a speculative installation to support future work?

@bd808 Python packages (links now provided). I've been using Kubernetes (so that should work), and you can find the relevant repo at https://github.com/JustBerry/SPIArticleAnalyzer, specifically https://github.com/JustBerry/SPIArticleAnalyzer/blob/master/SPIArticleAnalyzer/getAllUsers.py.

@bd808 Also, as far as a demo goes, I have performed a demo locally, which works (and successfully generates and saves an image with basemap).

The fundamental design of the getAllUsers.py workflow is going to be difficult to scale for multiple concurrent users on tools.wmflabs.org. I would recommend looking into using a JavaScript library and the map tile servers that Wikimedia maintains. See https://www.mediawiki.org/wiki/Maps for more information.

In Toolforge as a matter of policy we only install Python packages that are shipped as part of Ubuntu (Precise/)Trusty; in this case, they are already installed due to T63445 and T102165:

scfc@tools-bastion-03:~$ dpkg-query -l python\*-\*icu\*
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                                 Version                 Architecture            Description
+++-====================================-=======================-=======================-=============================================================================
ii  python-pyicu                         1.5-2ubuntu4            amd64                   Python extension wrapping the ICU C++ API
un  python2.7-pyicu                      <none>                  <none>                  (no description available)
ii  python3-icu                          1.5-2ubuntu4            amd64                   Python 3 extension wrapping the ICU C++ API
ii  python3-pyicu                        1.5-2ubuntu4            amd64                   dummy transitional package for PyICU Python 3 extension
un  python3.4-icu                        <none>                  <none>                  (no description available)
scfc@tools-bastion-03:~$ dpkg-query -l \*matplot\*
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                                 Version                 Architecture            Description
+++-====================================-=======================-=======================-=============================================================================
ii  python-matplotlib                    1.3.1-1ubuntu5          amd64                   Python based plotting system in a style similar to Matlab
ii  python-matplotlib-data               1.3.1-1ubuntu5          all                     Python based plotting system (data package)
ii  python-matplotlib-doc                1.3.1-1ubuntu5          all                     Python based plotting system (documentation package)
ii  python3-matplotlib                   1.3.1-1ubuntu5          amd64                   Python based plotting system in a style similar to Matlab (Python 3)
scfc@tools-bastion-03:~$

So this should already work for you. If you need a different version, you'll have to use virtual environments.

For Kubernetes I believe no packages are installed in the container (?), so you'll have to use virtual environments.

@scfc Hm...

Traceback (most recent call last):
  File "getAllUsers.py", line 20, in <module>
    from mpl_toolkits.basemap import Basemap
ImportError: No module named 'mpl_toolkits.basemap'

@JustBerry, sorry, I had only glanced at the links you gave, not looked at your source code. I have installed the package python-mpltoolkits.basemap on tools-bastion-03 which provides:

scfc@tools-bastion-03:~$ dpkg-query -L python-mpltoolkits.basemap
/.
/usr
/usr/lib
/usr/lib/pyshared
/usr/lib/pyshared/python2.7
/usr/lib/pyshared/python2.7/mpl_toolkits
/usr/lib/pyshared/python2.7/mpl_toolkits/basemap
/usr/lib/pyshared/python2.7/mpl_toolkits/basemap/_proj.so
/usr/lib/pyshared/python2.7/mpl_toolkits/basemap/_proj_d.so
/usr/lib/pyshared/python2.7/_geoslib_d.so
/usr/lib/pyshared/python2.7/_geoslib.so
/usr/share
/usr/share/python-support
/usr/share/python-support/python-mpltoolkits.basemap.public
/usr/share/pyshared
/usr/share/pyshared/mpl_toolkits
/usr/share/pyshared/mpl_toolkits/basemap
/usr/share/pyshared/mpl_toolkits/basemap/test.py
/usr/share/pyshared/mpl_toolkits/basemap/__init__.py
/usr/share/pyshared/mpl_toolkits/basemap/shapefile.py
/usr/share/pyshared/mpl_toolkits/basemap/proj.py
/usr/share/pyshared/mpl_toolkits/basemap/cm.py
/usr/share/pyshared/mpl_toolkits/basemap/solar.py
/usr/share/pyshared/mpl_toolkits/basemap/pyproj.py
/usr/share/pyshared/basemap-1.0.7.egg-info
/usr/share/doc
/usr/share/doc/python-mpltoolkits.basemap
/usr/share/doc/python-mpltoolkits.basemap/changelog.Debian.gz
/usr/share/doc/python-mpltoolkits.basemap/copyright
/usr/share/doc/python-mpltoolkits.basemap/FAQ
/usr/share/doc/python-mpltoolkits.basemap/README.gz
scfc@tools-bastion-03:~$

Could you please test if that works?

@scfc Thanks, I'll test it out. Also, are you on IRC by any chance? I've been told that your IRC username is scfc, which appears to be offline (not sure if you're under another username atm perhaps).

@scfc icu import issue:

Traceback (most recent call last):
File "getAllUsers.py", line 9, in <module>
  from getAllUsersHelper import *
File "/mnt/nfs/labstore-secondary-tools-project/spiarticleanalyzer/repo/SPIArticleAnalyzer/getAllUsersHelper.py", line 3, in <module>
  import icu  # pip install PyICU
ImportError: No module named icu

@scfc The previous issue seems to be resolved with a wrapper installation (local virtual environment installation via pip install pyicu). However, the following traceback comes up for python getAllUsers.py.

Traceback (most recent call last):
File "getAllUsers.py", line 11, in <module>
  import matplotlib
ImportError: No module named matplotlib

Previously, I had run pip install matplotlib to install that.

@scfc Ran pip install matplotlib after source $HOME/www/python/virtualenv/bin/activate (in virtualenv). Then, python getAllUsers.py yields

Traceback (most recent call last):
File "getAllUsers.py", line 20, in <module>
  from mpl_toolkits.basemap import Basemap
ImportError: No module named basemap

Just to note, python getAllUsers.py is being run from ~/repo/SPIArticleAnalyzer.

(virtualenv)tools.spiarticleanalyzer@tools-bastion-03:~/repo/SPIArticleAnalyzer$ python getAllUsers.py

repo is https://github.com/JustBerry/SPIArticleAnalyzer.

@scfc Have sudo apt-get install python-matplotlib and sudo apt-get install python-mpltoolkits.basemap both been run? I'm not sure if the latter has been run. I get the following when searching for what has already been installed globally:

(virtualenv)tools.spiarticleanalyzer@tools-bastion-03:~/repo/SPIArticleAnalyzer$ dpkg-query -l python\*-\*matplotlib\*
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                  Version         Architecture    Description
+++-=====================-===============-===============-===============================================
ii  python-matplotlib     1.3.1-1ubuntu5  amd64           Python based plotting system in a style similar
ii  python-matplotlib-dat 1.3.1-1ubuntu5  all             Python based plotting system (data package)
ii  python-matplotlib-doc 1.3.1-1ubuntu5  all             Python based plotting system (documentation pac
ii  python3-matplotlib    1.3.1-1ubuntu5  amd64           Python based plotting system in a style similar

You shouldn't be running any scripts on the bastion, please follow the documentation on how to use the grid.

@Legoktm ssh justberry@tools-login.wmflabs.org still brings me back to justberry@tools-bastion-03:~$. become spiarticleanalyzer yields tools.spiarticleanalyzer@tools-bastion-03:~$. How do I change over to tools-login?

@JustBerry you are already on it. tools-login is like an alias for tools-bastion-03. That's all the same IP address.

tools-login.wmflabs.org has address 208.80.155.163

host 208.80.155.163
163.155.80.208.in-addr.arpa is an alias for 163.128-25.155.80.208.in-addr.arpa.
163.128-25.155.80.208.in-addr.arpa domain name pointer instance-tools-bastion-03.tools.wmflabs.org.
163.128-25.155.80.208.in-addr.arpa domain name pointer tools-login.wmflabs.org.
163.128-25.155.80.208.in-addr.arpa domain name pointer login.tools.wmflabs.org.

@Legoktm Am I not already on tools-login? @Dzahn I think we seem to be on the same page perhaps.

The documentation at https://wikitech.wikimedia.org/wiki/Help:Tool_Labs/Grid does seem to be using @tools-login, which may be the same thing as the bastion.

@Legoktm Perhaps you were referring to using jsub instead of python directly. When I checked python.out and python.err, I didn't see any output.

@Legoktm: In this case it was justified to use the bastion as I had only installed the modules there for testing; you're right of course that any resource-intensive or non-interactive use must only happen on the grid.

@JustBerry: No, I'm usually not on IRC; if I'm in a channel I feel compelled to look for and correct any wrong information I see which leaves me no time to do something productive or entertaining :-).

  1. I installed the package python-mpltoolkits.basemap on tools-bastion-03, so you'd need to look for that package.
  2. That package is not installed on grid execution nodes, so if you submit your script as a job to the grid that script cannot use that package and will fail.
  3. Installing the packages on the system is only useful if you use them and not your virtual environment. If you are already using a virtual environment, you should be able to install whatever module you like into that. Looking at https://pypi.python.org/pypi/basemap/, the command would probably be something like pip install basemap.
  4. If I try "check your installation" at http://matplotlib.org/basemap/users/installing.html:
scfc@tools-bastion-03:~$ python
Python 2.7.6 (default, Oct 26 2016, 20:30:19) 
[GCC 4.8.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from mpl_toolkits.basemap import Basemap
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/pymodules/python2.7/mpl_toolkits/basemap/__init__.py", line 31, in <module>
    from mpl_toolkits.axes_grid1 import make_axes_locatable
  File "/usr/lib/pymodules/python2.7/mpl_toolkits/axes_grid1/__init__.py", line 4, in <module>
    from axes_grid import Grid, ImageGrid, AxesGrid
  File "/usr/lib/pymodules/python2.7/mpl_toolkits/axes_grid1/axes_grid.py", line 3, in <module>
    import matplotlib.pyplot as plt
  File "/usr/lib/pymodules/python2.7/matplotlib/pyplot.py", line 98, in <module>
    _backend_mod, new_figure_manager, draw_if_interactive, _show = pylab_setup()
  File "/usr/lib/pymodules/python2.7/matplotlib/backends/__init__.py", line 28, in pylab_setup
    globals(),locals(),[backend_name],0)
  File "/usr/lib/pymodules/python2.7/matplotlib/backends/backend_tkagg.py", line 8, in <module>
    import Tkinter as Tk, FileDialog
  File "/usr/lib/python2.7/lib-tk/Tkinter.py", line 42, in <module>
    raise ImportError, str(msg) + ', please install the python-tk package'
ImportError: No module named _tkinter, please install the python-tk package
>>> 
scfc@tools-bastion-03:~$

that "worked" in that it can load the module. After I installed the python-tk package, it works:

scfc@tools-bastion-03:~$ python
Python 2.7.6 (default, Oct 26 2016, 20:30:19) 
[GCC 4.8.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from mpl_toolkits.basemap import Basemap
>>> 
scfc@tools-bastion-03:~$

but not for Python 3:

scfc@tools-bastion-03:~$ python3
Python 3.4.3 (default, Nov 17 2016, 01:08:31) 
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from mpl_toolkits.basemap import Basemap
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named 'mpl_toolkits.basemap'
>>> 
scfc@tools-bastion-03:~$

mpl_toolkits.basemap does not seem to be packaged for Ubuntu Trusty and Python 3, so if you need Python 3, you will need to use a virtual environment.
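
The Tkinter traceback above shows matplotlib loading its TkAgg backend. On a host without a display, a common alternative to installing python-tk is to select the non-interactive Agg backend before pyplot (or basemap) is imported. A minimal sketch, assuming matplotlib is installed; the helper name use_headless_backend is hypothetical:

```python
def use_headless_backend():
    """Select matplotlib's file-only Agg backend; return its name or None."""
    try:
        import matplotlib
    except ImportError:
        return None  # matplotlib is not installed in this environment
    # This must run before matplotlib.pyplot (or basemap) is imported.
    matplotlib.use("Agg")
    return matplotlib.get_backend()


if __name__ == "__main__":
    print(use_headless_backend())
```

With Agg selected, figures can still be written with savefig, which matches the "generates and saves an image" workflow described earlier in the thread.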

@scfc Regarding #3 (installing basemap with pip install basemap), see below:

(virtualenv)tools.spiarticleanalyzer@tools-bastion-03:~$ pip install basemap
Downloading/unpacking basemap
  Could not find any downloads that satisfy the requirement basemap
Cleaning up...
No distributions at all found for basemap
Storing debug log for failure in /data/project/spiarticleanalyzer/.pip/pip.log

@scfc Regarding #4 (python version), the python version being used with python is 2.7.6.

(virtualenv)tools.spiarticleanalyzer@tools-bastion-03:~$ python --version
Python 2.7.6

@scfc Regarding #1 (looking for python-mpltoolkits.basemap), see below:

(virtualenv)tools.spiarticleanalyzer@tools-bastion-03:~/repo/SPIArticleAnalyzer$ python getAllUsers.py
Traceback (most recent call last):
  File "getAllUsers.py", line 20, in <module>
    from mpl_toolkits.basemap import Basemap
ImportError: No module named basemap
(virtualenv)tools.spiarticleanalyzer@tools-bastion-03:~/repo/SPIArticleAnalyzer$ python --version
Python 2.7.6

It doesn't appear to be accessible from the global installation (via Puppet, presumably).
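
One likely explanation is that a virtualenv created without --system-site-packages does not see packages installed system-wide through apt. The sketch below reports which interpreter is running, whether it is inside a virtualenv, and whether a given module can be located. The function name describe_environment is hypothetical, and importlib.util.find_spec needs Python 3.4 or later, so this is not a drop-in check for the 2.7.6 interpreter shown above.

```python
import importlib.util
import sys


def describe_environment(module_name):
    """Report which interpreter is running and whether a module is visible."""
    # virtualenv on Python 2 sets sys.real_prefix; venv sets sys.base_prefix.
    base = getattr(sys, "real_prefix", None) or getattr(sys, "base_prefix", sys.prefix)
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        spec = None  # a parent package is missing or the name is unusable
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "in_virtualenv": base != sys.prefix,
        "module_found": spec is not None,
        "module_origin": getattr(spec, "origin", None),
    }


if __name__ == "__main__":
    for key, value in describe_environment("mpl_toolkits.basemap").items():
        print(key, "=", value)
```

Running it once with the virtualenv activated and once without should show whether the module is only visible to the system interpreter.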

Regarding #2 (executing from grid), I was told that basemap was installed on the grid (and hence that I would need to rebuild the virtualenv on the grid engine and submit jobs via jsub).

(virtualenv)tools.spiarticleanalyzer@tools-bastion-03:~$ pip install basemap

Could not find any downloads that satisfy the requirement basemap

https://github.com/matplotlib/basemap/issues/251
https://github.com/matplotlib/basemap/issues/198

"Basemap is to large to be hosted on Pypi so the package is hosted externally and new versions of pip no longer allows that pr default."

@Dzahn Yes, I am aware of this, but thank you for pointing that out. I ran the command anyway on the advice of @scfc. I also already tried building this locally (http://matplotlib.org/basemap/users/installing.html), which didn't seem to resolve the issue (same traceback as the ones presented above).

The issue of not being able to access basemap while running on the gridengine was brought up. Hence, I've run the following two tests as well:

webservice uwsgi-python start

uwsgi.log in ~ reads

*** Starting uWSGI 1.9.17.1-debian (64bit) on [Fri Feb 10 13:29:16 2017] ***
compiled with version: 4.8.2 on 23 March 2014 17:15:32
os: Linux-3.13.0-100-generic #147-Ubuntu SMP Tue Oct 18 16:48:51 UTC 2016
nodename: tools-webgrid-generic-1402
machine: x86_64
clock source: unix
pcre jit disabled
detected number of CPU cores: 4
current working directory: /mnt/nfs/labstore-secondary-tools-project/spiarticleanalyzer
detected binary path: /usr/bin/uwsgi-core
your processes number limit is 63705
your process address space limit is 4294967296 bytes (4096 MB)
your memory page size is 4096 bytes
detected max file descriptor number: 1024
lock engine: pthread robust mutexes
thunder lock: disabled (you can enable it with --thunder-lock)
uwsgi socket 0 bound to TCP address :53255 fd 3
Python version: 2.7.6 (default, Oct 26 2016, 20:33:43)  [GCC 4.8.4]
Set PythonHome to /data/project/spiarticleanalyzer/www/python/venv
*** Python threads support is disabled. You can enable it with --enable-threads ***
Python main interpreter initialized at 0x2353360
your server socket listen backlog is limited to 100 connections
your mercy for graceful operations on workers is 60 seconds
mapped 363960 bytes (355 KB) for 4 cores
*** Operational MODE: preforking ***
Traceback (most recent call last):
  File "app.py", line 25, in <module>
    from getAllUsers import *
  File "./getAllUsers.py", line 20, in <module>
    from mpl_toolkits.basemap import Basemap
ImportError: No module named basemap
unable to load app 0 (mountpoint='') (callable not found or import error)
Python version: 2.7.6 (default, Oct 26 2016, 20:33:43)  [GCC 4.8.4]
Set PythonHome to /data/project/spiarticleanalyzer/www/python/venv
*** Python threads support is disabled. You can enable it with --enable-threads ***
Python main interpreter initialized at 0x2353360
your server socket listen backlog is limited to 100 connections
your mercy for graceful operations on workers is 60 seconds
mapped 363960 bytes (355 KB) for 4 cores
*** Operational MODE: preforking ***
Traceback (most recent call last):
  File "app.py", line 25, in <module>
    from getAllUsers import *
  File "./getAllUsers.py", line 20, in <module>
    from mpl_toolkits.basemap import Basemap
ImportError: No module named basemap
unable to load app 0 (mountpoint='') (callable not found or import error)
mounting /data/project/spiarticleanalyzer/www/python/src/app.py on /spiarticleanalyzer
Traceback (most recent call last):
  File "/data/project/spiarticleanalyzer/www/python/src/app.py", line 25, in <module>
    from getAllUsers import *
  File "./getAllUsers.py", line 20, in <module>
    from mpl_toolkits.basemap import Basemap
ImportError: No module named basemap

webservice --backend=kubernetes python start

uwsgi.log in ~ reads

*** Starting uWSGI 2.0.7-debian (64bit) on [Fri Feb 10 13:37:19 2017] ***
compiled with version: 4.9.1 on 25 October 2014 19:17:54
os: Linux-4.4.0-2-amd64 #1 SMP Debian 4.4.2-3+wmf6 (2016-10-18)
nodename: spiarticleanalyzer-4157928641-9cm45
machine: x86_64
clock source: unix
pcre jit disabled
detected number of CPU cores: 4
current working directory: /data/project/spiarticleanalyzer
detected binary path: /usr/bin/uwsgi-core
your memory page size is 4096 bytes
detected max file descriptor number: 65536
lock engine: pthread robust mutexes
thunder lock: disabled (you can enable it with --thunder-lock)
uwsgi socket 0 bound to TCP address :8000 fd 3
Python version: 3.4.2 (default, Oct  8 2014, 10:47:48)  [GCC 4.9.1]
Set PythonHome to /data/project/spiarticleanalyzer/www/python/venv
Fatal Python error: Py_Initialize: Unable to get the locale encoding
ImportError: No module named 'encodings'
!!! uWSGI process 1 got Segmentation Fault !!!
*** backtrace of 1 ***
/usr/bin/uwsgi(uwsgi_backtrace+0x30) [0x4635f0]
/usr/bin/uwsgi(uwsgi_segfault+0x21) [0x4639b1]
/lib/x86_64-linux-gnu/libc.so.6(+0x350e0) [0x7f399692a0e0]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x232) [0x7f399692b532]
/usr/lib/x86_64-linux-gnu/libpython3.4m.so.1.0(Py_FatalError+0x50) [0x7f399515bbe0]
/usr/lib/x86_64-linux-gnu/libpython3.4m.so.1.0(+0x14462b) [0x7f399515c62b]
/usr/lib/x86_64-linux-gnu/libpython3.4m.so.1.0(_Py_InitializeEx_Private+0x23a) [0x7f399515cd9a]
/usr/lib/uwsgi/plugins/python3_plugin.so(uwsgi_python_init+0x101) [0x7f399586a2a1]
/usr/bin/uwsgi(uwsgi_start+0x51e) [0x464bee]
/usr/bin/uwsgi(uwsgi_setup+0x1073) [0x466c33]
/usr/bin/uwsgi(main+0x9) [0x416989]
Fatal Python error: Py_Initialize: Unable to get the locale encoding
ImportError: No module named 'encodings'
!!! uWSGI process 1 got Segmentation Fault !!!
*** backtrace of 1 ***
/usr/bin/uwsgi(uwsgi_backtrace+0x30) [0x4635f0]
/usr/bin/uwsgi(uwsgi_segfault+0x21) [0x4639b1]
/lib/x86_64-linux-gnu/libc.so.6(+0x350e0) [0x7f399692a0e0]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x232) [0x7f399692b532]
/usr/lib/x86_64-linux-gnu/libpython3.4m.so.1.0(Py_FatalError+0x50) [0x7f399515bbe0]
/usr/lib/x86_64-linux-gnu/libpython3.4m.so.1.0(+0x14462b) [0x7f399515c62b]
/usr/lib/x86_64-linux-gnu/libpython3.4m.so.1.0(_Py_InitializeEx_Private+0x23a) [0x7f399515cd9a]
/usr/lib/uwsgi/plugins/python3_plugin.so(uwsgi_python_init+0x101) [0x7f399586a2a1]
/usr/bin/uwsgi(uwsgi_start+0x51e) [0x464bee]
/usr/bin/uwsgi(uwsgi_setup+0x1073) [0x466c33]
/usr/bin/uwsgi(main+0x9) [0x416989]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f3996916b45]
/usr/bin/uwsgi() [0x4169b7]
*** end of backtrace ***
*** Starting uWSGI 2.0.7-debian (64bit) on [Fri Feb 10 13:37:21 2017] ***
compiled with version: 4.9.1 on 25 October 2014 19:17:54
os: Linux-4.4.0-2-amd64 #1 SMP Debian 4.4.2-3+wmf6 (2016-10-18)
nodename: spiarticleanalyzer-4157928641-9cm45
machine: x86_64
clock source: unix
pcre jit disabled
detected number of CPU cores: 4
current working directory: /data/project/spiarticleanalyzer
detected binary path: /usr/bin/uwsgi-core
your memory page size is 4096 bytes
detected max file descriptor number: 65536
lock engine: pthread robust mutexes
thunder lock: disabled (you can enable it with --thunder-lock)
*** end of backtrace ***
*** Starting uWSGI 2.0.7-debian (64bit) on [Fri Feb 10 13:37:21 2017] ***
compiled with version: 4.9.1 on 25 October 2014 19:17:54
os: Linux-4.4.0-2-amd64 #1 SMP Debian 4.4.2-3+wmf6 (2016-10-18)
nodename: spiarticleanalyzer-4157928641-9cm45
machine: x86_64
clock source: unix
pcre jit disabled
detected number of CPU cores: 4
current working directory: /data/project/spiarticleanalyzer
detected binary path: /usr/bin/uwsgi-core
your memory page size is 4096 bytes
detected max file descriptor number: 65536
lock engine: pthread robust mutexes
thunder lock: disabled (you can enable it with --thunder-lock)
uwsgi socket 0 bound to TCP address :8000 fd 3
Python version: 3.4.2 (default, Oct  8 2014, 10:47:48)  [GCC 4.9.1]
Set PythonHome to /data/project/spiarticleanalyzer/www/python/venv
Fatal Python error: Py_Initialize: Unable to get the locale encoding
ImportError: No module named 'encodings'
!!! uWSGI process 1 got Segmentation Fault !!!
*** backtrace of 1 ***
/usr/bin/uwsgi(uwsgi_backtrace+0x30) [0x4635f0]
/usr/bin/uwsgi(uwsgi_segfault+0x21) [0x4639b1]
/lib/x86_64-linux-gnu/libc.so.6(+0x350e0) [0x7fccb42ac0e0]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x232) [0x7fccb42ad532]
/usr/lib/x86_64-linux-gnu/libpython3.4m.so.1.0(Py_FatalError+0x50) [0x7fccb2addbe0]
/usr/lib/x86_64-linux-gnu/libpython3.4m.so.1.0(+0x14462b) [0x7fccb2ade62b]
/usr/lib/x86_64-linux-gnu/libpython3.4m.so.1.0(_Py_InitializeEx_Private+0x23a) [0x7fccb2aded9a]
/usr/lib/uwsgi/plugins/python3_plugin.so(uwsgi_python_init+0x101) [0x7fccb31ec2a1]
/usr/bin/uwsgi(uwsgi_start+0x51e) [0x464bee]
/usr/bin/uwsgi(uwsgi_setup+0x1073) [0x466c33]
Set PythonHome to /data/project/spiarticleanalyzer/www/python/venv
Fatal Python error: Py_Initialize: Unable to get the locale encoding
ImportError: No module named 'encodings'
!!! uWSGI process 1 got Segmentation Fault !!!
*** backtrace of 1 ***
/usr/bin/uwsgi(uwsgi_backtrace+0x30) [0x4635f0]
/usr/bin/uwsgi(uwsgi_segfault+0x21) [0x4639b1]
/lib/x86_64-linux-gnu/libc.so.6(+0x350e0) [0x7fccb42ac0e0]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x232) [0x7fccb42ad532]
/usr/lib/x86_64-linux-gnu/libpython3.4m.so.1.0(Py_FatalError+0x50) [0x7fccb2addbe0]
/usr/lib/x86_64-linux-gnu/libpython3.4m.so.1.0(+0x14462b) [0x7fccb2ade62b]
/usr/lib/x86_64-linux-gnu/libpython3.4m.so.1.0(_Py_InitializeEx_Private+0x23a) [0x7fccb2aded9a]
/usr/lib/uwsgi/plugins/python3_plugin.so(uwsgi_python_init+0x101) [0x7fccb31ec2a1]
/usr/bin/uwsgi(uwsgi_start+0x51e) [0x464bee]
/usr/bin/uwsgi(uwsgi_setup+0x1073) [0x466c33]
/usr/bin/uwsgi(main+0x9) [0x416989]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7fccb4298b45]
/usr/bin/uwsgi() [0x4169b7]
*** end of backtrace ***
*** Starting uWSGI 2.0.7-debian (64bit) on [Fri Feb 10 13:37:37 2017] ***
compiled with version: 4.9.1 on 25 October 2014 19:17:54
os: Linux-4.4.0-2-amd64 #1 SMP Debian 4.4.2-3+wmf6 (2016-10-18)
nodename: spiarticleanalyzer-4157928641-9cm45
machine: x86_64
clock source: unix
pcre jit disabled
detected number of CPU cores: 4
current working directory: /data/project/spiarticleanalyzer
detected binary path: /usr/bin/uwsgi-core
your memory page size is 4096 bytes
detected max file descriptor number: 65536
lock engine: pthread robust mutexes
/usr/bin/uwsgi() [0x4169b7]
*** end of backtrace ***
*** Starting uWSGI 2.0.7-debian (64bit) on [Fri Feb 10 13:37:37 2017] ***
compiled with version: 4.9.1 on 25 October 2014 19:17:54
os: Linux-4.4.0-2-amd64 #1 SMP Debian 4.4.2-3+wmf6 (2016-10-18)
nodename: spiarticleanalyzer-4157928641-9cm45
machine: x86_64
clock source: unix
pcre jit disabled
detected number of CPU cores: 4
current working directory: /data/project/spiarticleanalyzer
detected binary path: /usr/bin/uwsgi-core
your memory page size is 4096 bytes
detected max file descriptor number: 65536
lock engine: pthread robust mutexes
thunder lock: disabled (you can enable it with --thunder-lock)
uwsgi socket 0 bound to TCP address :8000 fd 3
Python version: 3.4.2 (default, Oct  8 2014, 10:47:48)  [GCC 4.9.1]
Set PythonHome to /data/project/spiarticleanalyzer/www/python/venv
Fatal Python error: Py_Initialize: Unable to get the locale encoding
ImportError: No module named 'encodings'
!!! uWSGI process 1 got Segmentation Fault !!!
*** backtrace of 1 ***
/usr/bin/uwsgi(uwsgi_backtrace+0x30) [0x4635f0]
/usr/bin/uwsgi(uwsgi_segfault+0x21) [0x4639b1]
/lib/x86_64-linux-gnu/libc.so.6(+0x350e0) [0x7fe1fae910e0]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x232) [0x7fe1fae92532]
/usr/lib/x86_64-linux-gnu/libpython3.4m.so.1.0(Py_FatalError+0x50) [0x7fe1f96c2be0]
/usr/lib/x86_64-linux-gnu/libpython3.4m.so.1.0(+0x14462b) [0x7fe1f96c362b]
/usr/lib/x86_64-linux-gnu/libpython3.4m.so.1.0(_Py_InitializeEx_Private+0x23a) [0x7fe1f96c3d9a]
/usr/lib/uwsgi/plugins/python3_plugin.so(uwsgi_python_init+0x101) [0x7fe1f9dd12a1]
/usr/bin/uwsgi(uwsgi_start+0x51e) [0x464bee]
Python version: 3.4.2 (default, Oct  8 2014, 10:47:48)  [GCC 4.9.1]
Set PythonHome to /data/project/spiarticleanalyzer/www/python/venv
Fatal Python error: Py_Initialize: Unable to get the locale encoding
ImportError: No module named 'encodings'
!!! uWSGI process 1 got Segmentation Fault !!!
*** backtrace of 1 ***
/usr/bin/uwsgi(uwsgi_backtrace+0x30) [0x4635f0]
/usr/bin/uwsgi(uwsgi_segfault+0x21) [0x4639b1]
/lib/x86_64-linux-gnu/libc.so.6(+0x350e0) [0x7fe1fae910e0]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x232) [0x7fe1fae92532]
/usr/lib/x86_64-linux-gnu/libpython3.4m.so.1.0(Py_FatalError+0x50) [0x7fe1f96c2be0]
/usr/lib/x86_64-linux-gnu/libpython3.4m.so.1.0(+0x14462b) [0x7fe1f96c362b]
/usr/lib/x86_64-linux-gnu/libpython3.4m.so.1.0(_Py_InitializeEx_Private+0x23a) [0x7fe1f96c3d9a]
/usr/lib/uwsgi/plugins/python3_plugin.so(uwsgi_python_init+0x101) [0x7fe1f9dd12a1]
/usr/bin/uwsgi(uwsgi_start+0x51e) [0x464bee]
/usr/bin/uwsgi(uwsgi_setup+0x1073) [0x466c33]
/usr/bin/uwsgi(main+0x9) [0x416989]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7fe1fae7db45]
/usr/bin/uwsgi() [0x4169b7]
*** end of backtrace ***
*** Starting uWSGI 2.0.7-debian (64bit) on [Fri Feb 10 13:38:04 2017] ***
compiled with version: 4.9.1 on 25 October 2014 19:17:54
os: Linux-4.4.0-2-amd64 #1 SMP Debian 4.4.2-3+wmf6 (2016-10-18)
nodename: spiarticleanalyzer-4157928641-9cm45
machine: x86_64
clock source: unix
pcre jit disabled
detected number of CPU cores: 4
current working directory: /data/project/spiarticleanalyzer
detected binary path: /usr/bin/uwsgi-core
your memory page size is 4096 bytes
detected max file descriptor number: 65536
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7fe1fae7db45]
/usr/bin/uwsgi() [0x4169b7]
*** end of backtrace ***
*** Starting uWSGI 2.0.7-debian (64bit) on [Fri Feb 10 13:38:04 2017] ***
compiled with version: 4.9.1 on 25 October 2014 19:17:54
os: Linux-4.4.0-2-amd64 #1 SMP Debian 4.4.2-3+wmf6 (2016-10-18)
nodename: spiarticleanalyzer-4157928641-9cm45
machine: x86_64
clock source: unix
pcre jit disabled
detected number of CPU cores: 4
current working directory: /data/project/spiarticleanalyzer
detected binary path: /usr/bin/uwsgi-core
your memory page size is 4096 bytes
detected max file descriptor number: 65536
lock engine: pthread robust mutexes
thunder lock: disabled (you can enable it with --thunder-lock)
uwsgi socket 0 bound to TCP address :8000 fd 3
Python version: 3.4.2 (default, Oct  8 2014, 10:47:48)  [GCC 4.9.1]
Set PythonHome to /data/project/spiarticleanalyzer/www/python/venv
Fatal Python error: Py_Initialize: Unable to get the locale encoding
ImportError: No module named 'encodings'
!!! uWSGI process 1 got Segmentation Fault !!!
*** backtrace of 1 ***
/usr/bin/uwsgi(uwsgi_backtrace+0x30) [0x4635f0]
/usr/bin/uwsgi(uwsgi_segfault+0x21) [0x4639b1]
/lib/x86_64-linux-gnu/libc.so.6(+0x350e0) [0x7fb51f7220e0]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x232) [0x7fb51f723532]
/usr/lib/x86_64-linux-gnu/libpython3.4m.so.1.0(Py_FatalError+0x50) [0x7fb51df53be0]
/usr/lib/x86_64-linux-gnu/libpython3.4m.so.1.0(+0x14462b) [0x7fb51df5462b]
/usr/lib/x86_64-linux-gnu/libpython3.4m.so.1.0(_Py_InitializeEx_Private+0x23a) [0x7fb51df54d9a]
/usr/lib/uwsgi/plugins/python3_plugin.so(uwsgi_python_init+0x101) [0x7fb51e6622a1]
uwsgi socket 0 bound to TCP address :8000 fd 3
Python version: 3.4.2 (default, Oct  8 2014, 10:47:48)  [GCC 4.9.1]
Set PythonHome to /data/project/spiarticleanalyzer/www/python/venv
Fatal Python error: Py_Initialize: Unable to get the locale encoding
ImportError: No module named 'encodings'
!!! uWSGI process 1 got Segmentation Fault !!!
*** backtrace of 1 ***
/usr/bin/uwsgi(uwsgi_backtrace+0x30) [0x4635f0]
/usr/bin/uwsgi(uwsgi_segfault+0x21) [0x4639b1]
/lib/x86_64-linux-gnu/libc.so.6(+0x350e0) [0x7fb51f7220e0]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x232) [0x7fb51f723532]
/usr/lib/x86_64-linux-gnu/libpython3.4m.so.1.0(Py_FatalError+0x50) [0x7fb51df53be0]
/usr/lib/x86_64-linux-gnu/libpython3.4m.so.1.0(+0x14462b) [0x7fb51df5462b]
/usr/lib/x86_64-linux-gnu/libpython3.4m.so.1.0(_Py_InitializeEx_Private+0x23a) [0x7fb51df54d9a]
/usr/lib/uwsgi/plugins/python3_plugin.so(uwsgi_python_init+0x101) [0x7fb51e6622a1]
/usr/bin/uwsgi(uwsgi_start+0x51e) [0x464bee]
/usr/bin/uwsgi(uwsgi_setup+0x1073) [0x466c33]
/usr/bin/uwsgi(main+0x9) [0x416989]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7fb51f70eb45]
/usr/bin/uwsgi() [0x4169b7]
*** end of backtrace ***

If you build virtualenv on bastion, it's not gonna work on k8s.
If you build virtualenv on k8s interactive, it's not gonna work on grid.

Different os versions are not compatible.
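
One way to see the incompatibility for yourself is to print the same few facts on each host and compare them. The logs in this task show Python 2.7.6 on the bastion, and 2.7.9 and 3.4.2 in Kubernetes containers. The sketch below is only an illustration; the function name `runtime_fingerprint` is made up for this example and is not part of any Toolforge tooling.

```python
import platform
import sys


def runtime_fingerprint():
    """Collect the facts that decide whether a virtualenv can be reused here."""
    return {
        # Interpreter version, e.g. "2.7.9" or "3.4.2".
        "python": "%d.%d.%d" % sys.version_info[:3],
        "implementation": platform.python_implementation(),
        # Kernel and distribution string reported by the host or container.
        "platform": platform.platform(),
        "machine": platform.machine(),
        # Points into the virtualenv when one is active.
        "prefix": sys.prefix,
    }


if __name__ == "__main__":
    for key, value in sorted(runtime_fingerprint().items()):
        print("%s: %s" % (key, value))
```

If the output differs between the host where the virtualenv was built and the host where it runs, rebuilding the virtualenv on the target is the safer assumption.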

@zhuyifei1999 Thanks for the clarification.

In Toolforge as a matter of policy we only install Python packages that are shipped as part of Ubuntu (Precise/)Trusty; in this case, they are already installed due to T63445 and T102165:

So this should already work for you. If you need a different version, you'll have to use virtual environments.

For Kubernetes I believe no packages are installed in the container (?), so you'll have to use virtual environments.

@scfc It looks like the packages are currently installed for bastion and not for kubernetes or the grid engine. Is this correct? Also, it appears that the packages will not be installed on kubernetes (non-ubuntu) per policy? Is this also correct?

If both of those are the case, can I request installing the packages on the grid?

There is currently no policy documenting this. The point is, k8s is container-based, and we are trying to make it as lightweight as possible.

For reference:

  • list of packages installed on bastion and grid
  • list of packages installed on bastion only
  • a few lists of packages installed on k8s.

... and honestly, I can't see why you still need basemap. It has already been suggested that you use the Wikimedia Maps service as an easy replacement.

@zhuyifei1999 Okay, so the discussion right now seems to be a) use Wikimedia Maps or b) explain why Wikimedia Maps is not sufficient and request installation of basemap on the grid.

Regarding Wikimedia Maps, there is little reference made to an API at https://www.mediawiki.org/wiki/Maps besides "GeoData extension allows articles to specify geographical coordinates, and expose them via search API." The GeoData extension API seems to return articles that are relevant to a specific geographical coordinate, rather than return a world map with dots marking a list of coordinates. Are you referring to https://github.com/kartotherian/kartotherian then (and how it uses .js to make any API calls/map generation on the client side)?

Although Wikimedia Maps is not a direct replacement, I'll look into leafletjs for now.

Regarding icu, is it installed on kubernetes and/or grid? Seems like a fairly standard unicode library to install on grid at least. I am using it to sort a list of unicode strings.
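
For context, this is roughly what sorting with icu looks like. It is a minimal sketch written for Python 3, not the code from the SPIArticleAnalyzer repository: the helper name `make_sort_key`, the `en_US` locale, and the accent-stripping fallback used when PyICU is missing are all assumptions made for illustration. The fallback is a much cruder ordering than real ICU collation.

```python
import unicodedata

try:
    import icu  # PyICU; importing it requires the ICU C++ libraries
except ImportError:
    icu = None


def make_sort_key(locale_name="en_US"):
    """Return a key function suitable for sorted() on unicode strings."""
    if icu is not None:
        # Locale-aware collation provided by ICU.
        collator = icu.Collator.createInstance(icu.Locale(locale_name))
        return collator.getSortKey

    def fallback_key(text):
        # Crude approximation: drop combining accents and ignore case.
        decomposed = unicodedata.normalize("NFKD", text)
        stripped = "".join(
            ch for ch in decomposed if not unicodedata.combining(ch)
        )
        return (stripped.lower(), text)

    return fallback_key


if __name__ == "__main__":
    usernames = ["Zoë", "Émile", "adam", "Bob"]
    print(sorted(usernames, key=make_sort_key()))
```

A plain `sorted(usernames)` would order by code point, which puts every capitalised or accented name in the wrong place; that is the problem the collator solves.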

@JustBerry: Sorry, I thought that when I asked you to test on tools-bastion-03 you understood that I meant write a small Python script that does whatever needs to be tested and run that on tools-bastion-03. If instead you run jsub on tools-bastion-03, the script will be run on the grid. I installed the package python-mpltoolkits.basemap only on tools-bastion-03, not on the grid or in Kubernetes.

I just cloned your repository https://github.com/JustBerry/SPIArticleAnalyzer.git. It does not seem to reference Matplotlib at all, and when I run python getAllUsers.py (without any virtual environment), it fails with:

tools.scfc-test-can-be-deleted-anytime@tools-bastion-03:~/SPIArticleAnalyzer/SPIArticleAnalyzer$ python getAllUsers.py 
Traceback (most recent call last):
  File "getAllUsers.py", line 5, in <module>
    import geoip2.database
ImportError: No module named geoip2.database
tools.scfc-test-can-be-deleted-anytime@tools-bastion-03:~/SPIArticleAnalyzer/SPIArticleAnalyzer$

When I replace the references to geoip2 with dummies, it fails with:

tools.scfc-test-can-be-deleted-anytime@tools-bastion-03:~/SPIArticleAnalyzer/SPIArticleAnalyzer$ python getAllUsers.py 
Traceback (most recent call last):
  File "getAllUsers.py", line 8, in <module>
    from getAllUsersHelper import *
  File "/mnt/nfs/labstore-secondary-tools-project/scfc-test-can-be-deleted-anytime/SPIArticleAnalyzer/SPIArticleAnalyzer/getAllUsersHelper.py", line 2, in <module>
    import ipaddress
ImportError: No module named ipaddress
tools.scfc-test-can-be-deleted-anytime@tools-bastion-03:~/SPIArticleAnalyzer/SPIArticleAnalyzer$

When I replace ipaddress in getAllUsersHelper.py with ipaddr, it succeeds without any output.
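
Rather than discovering missing modules one traceback at a time, the imports can be checked in a single pass. This is a minimal sketch; `missing_modules` is a hypothetical helper written for this comment, not something shipped on Toolforge.

```python
import importlib


def missing_modules(names):
    """Return the entries of `names` that cannot be imported here."""
    missing = []
    for name in names:
        try:
            # Dotted names such as "geoip2.database" are handled as well.
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing
```

Calling `missing_modules(["flask", "mwoauth", "geoip2.database", "ipaddress", "icu", "wtforms"])` on the bastion, on the grid, and inside the Kubernetes shell would show which environment lacks which module.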

(You had a continuous job webservice running which I stopped. webservice is always executed on a bastion host and never with jstart webservice or similar ways.)

And just to avoid any misunderstanding: It's not that I know how to make your application work and am coy about presenting the solution; I simply can't just look at someone else's application and say pointing at some line: "There's your problem!" If you say you need the package python-mpltoolkits.basemap or another one installed for your application to work, then that's easily done. But finding out what package you need or how to fiddle with your virtual environment is not something I can do.

@scfc Ah, okay. Let me clarify a few things.

That being said, @zhuyifei1999 seems to have been successful at building basemap locally in his venv. I'll have to look into what path variables they might have used during installation.

Should https://packages.debian.org/jessie/libs/libicu52 (or another icu lib) be installed on kubernetes (python 2)?

Steps:

  1. webservice --backend=kubernetes python2 shell
  2. cd www/python
  3. rm -rf venv (removing old venv built on virtualenv venv)
  4. virtualenv venv --system-site-packages (yes, I've already built from the bottom up via virtualenv venv too)
  5. source ~/www/python/venv/bin/activate
  6. cd ~/repo/SPIArticleAnalyzer
  7. pip install --upgrade pip
  8. pip install flask (because module not found on python app.py)
  9. pip install mwoauth (because module not found on python app.py)
  10. pip install geoip2 (because module not found on python app.py)
  11. python app.py (again) yields
Traceback (most recent call last):
  File "app.py", line 25, in <module>
    from getAllUsers import *
  File "/data/project/spiarticleanalyzer/repo/SPIArticleAnalyzer/getAllUsers.py", line 8, in <module>
    from getAllUsersHelper import *
  File "/data/project/spiarticleanalyzer/repo/SPIArticleAnalyzer/getAllUsersHelper.py", line 3, in <module>
    import icu  # pip install PyICU
ImportError: No module named icu
  12. exit (now back to bastion)
  13. pip install ipaddress (because module not found on python app.py)
  14. pip install wtforms (because module not found on python app.py)
  15. webservice --backend=kubernetes python2 start
  16. ~/uwsgi.log reads as follows:
*** Starting uWSGI 2.0.7-debian (64bit) on [Sun Feb 12 20:04:52 2017] ***
compiled with version: 4.9.1 on 25 October 2014 19:17:54
os: Linux-4.4.0-2-amd64 #1 SMP Debian 4.4.2-3+wmf6 (2016-10-18)
nodename: spiarticleanalyzer-1723593971-ycmjg
machine: x86_64
clock source: unix
pcre jit disabled
detected number of CPU cores: 4
current working directory: /data/project/spiarticleanalyzer
detected binary path: /usr/bin/uwsgi-core
your memory page size is 4096 bytes
detected max file descriptor number: 65536
lock engine: pthread robust mutexes
thunder lock: disabled (you can enable it with --thunder-lock)
uwsgi socket 0 bound to TCP address :8000 fd 3
Python version: 2.7.9 (default, Jun 29 2016, 13:11:10)  [GCC 4.9.2]
Set PythonHome to /data/project/spiarticleanalyzer/www/python/venv
*** Python threads support is disabled. You can enable it with --enable-threads ***
Python main interpreter initialized at 0x1c3f6a0
your server socket listen backlog is limited to 100 connections
your mercy for graceful operations on workers is 60 seconds
mapped 363840 bytes (355 KB) for 4 cores
*** Operational MODE: preforking ***
Traceback (most recent call last):
  File "app.py", line 25, in <module>
    from getAllUsers import *
  File "./getAllUsers.py", line 8, in <module>
    from getAllUsersHelper import *
  File "./getAllUsersHelper.py", line 3, in <module>
    import icu  # pip install PyICU
ImportError: No module named icu
unable to load app 0 (mountpoint='') (callable not found or import error)
mounting /data/project/spiarticleanalyzer/www/python/src/app.py on /spiarticleanalyzer
Traceback (most recent call last):
  File "/data/project/spiarticleanalyzer/www/python/src/app.py", line 25, in <module>
    from getAllUsers import *
  File "./getAllUsers.py", line 8, in <module>
    from getAllUsersHelper import *
  File "./getAllUsersHelper.py", line 3, in <module>
    import icu  # pip install PyICU
ImportError: No module named icu
*** Starting uWSGI 2.0.7-debian (64bit) on [Sun Feb 12 20:04:59 2017] ***
compiled with version: 4.9.1 on 25 October 2014 19:17:54
os: Linux-4.4.0-2-amd64 #1 SMP Debian 4.4.2-3+wmf6 (2016-10-18)
nodename: spiarticleanalyzer-1723593971-ycmjg
machine: x86_64
clock source: unix
pcre jit disabled
detected number of CPU cores: 4
current working directory: /data/project/spiarticleanalyzer
detected binary path: /usr/bin/uwsgi-core
your memory page size is 4096 bytes
detected max file descriptor number: 65536
lock engine: pthread robust mutexes
thunder lock: disabled (you can enable it with --thunder-lock)
uwsgi socket 0 bound to TCP address :8000 fd 3
Python version: 2.7.9 (default, Jun 29 2016, 13:11:10)  [GCC 4.9.2]
Set PythonHome to /data/project/spiarticleanalyzer/www/python/venv
*** Python threads support is disabled. You can enable it with --enable-threads ***
Python main interpreter initialized at 0xeb56a0
your server socket listen backlog is limited to 100 connections
your mercy for graceful operations on workers is 60 seconds
mapped 363840 bytes (355 KB) for 4 cores
*** Operational MODE: preforking ***
Traceback (most recent call last):
  File "app.py", line 25, in <module>
    from getAllUsers import *
  File "./getAllUsers.py", line 8, in <module>
    from getAllUsersHelper import *
  File "./getAllUsersHelper.py", line 3, in <module>
    import icu  # pip install PyICU
ImportError: No module named icu
unable to load app 0 (mountpoint='') (callable not found or import error)
mounting /data/project/spiarticleanalyzer/www/python/src/app.py on /spiarticleanalyzer
Traceback (most recent call last):
  File "/data/project/spiarticleanalyzer/www/python/src/app.py", line 25, in <module>
    from getAllUsers import *
  File "./getAllUsers.py", line 8, in <module>
    from getAllUsersHelper import *
  File "./getAllUsersHelper.py", line 3, in <module>
    import icu  # pip install PyICU
ImportError: No module named icu
*** Starting uWSGI 2.0.7-debian (64bit) on [Sun Feb 12 20:05:16 2017] ***
compiled with version: 4.9.1 on 25 October 2014 19:17:54
os: Linux-4.4.0-2-amd64 #1 SMP Debian 4.4.2-3+wmf6 (2016-10-18)
nodename: spiarticleanalyzer-1723593971-ycmjg
machine: x86_64
clock source: unix
pcre jit disabled
detected number of CPU cores: 4
current working directory: /data/project/spiarticleanalyzer
detected binary path: /usr/bin/uwsgi-core
your memory page size is 4096 bytes
detected max file descriptor number: 65536
lock engine: pthread robust mutexes
thunder lock: disabled (you can enable it with --thunder-lock)
uwsgi socket 0 bound to TCP address :8000 fd 3
Python version: 2.7.9 (default, Jun 29 2016, 13:11:10)  [GCC 4.9.2]
Set PythonHome to /data/project/spiarticleanalyzer/www/python/venv
*** Python threads support is disabled. You can enable it with --enable-threads ***
Python main interpreter initialized at 0x1fd26a0
your server socket listen backlog is limited to 100 connections
your mercy for graceful operations on workers is 60 seconds
mapped 363840 bytes (355 KB) for 4 cores
*** Operational MODE: preforking ***
Traceback (most recent call last):
  File "app.py", line 25, in <module>
    from getAllUsers import *
  File "./getAllUsers.py", line 8, in <module>
    from getAllUsersHelper import *
  File "./getAllUsersHelper.py", line 3, in <module>
    import icu  # pip install PyICU
ImportError: No module named icu
unable to load app 0 (mountpoint='') (callable not found or import error)
mounting /data/project/spiarticleanalyzer/www/python/src/app.py on /spiarticleanalyzer
Traceback (most recent call last):
  File "/data/project/spiarticleanalyzer/www/python/src/app.py", line 25, in <module>
    from getAllUsers import *
  File "./getAllUsers.py", line 8, in <module>
    from getAllUsersHelper import *
  File "./getAllUsersHelper.py", line 3, in <module>
    import icu  # pip install PyICU
ImportError: No module named icu
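
Because uWSGI restarts repeatedly, the same traceback fills the log many times over. A small script can reduce it to the distinct modules that failed to import. This is a sketch only: the function name `missing_modules_in_log` is invented here, and the pattern only covers the two ImportError spellings that appear in this task (with and without quotes).

```python
import re

# Matches "ImportError: No module named icu" (Python 2 wording)
# and "ImportError: No module named 'encodings'" (Python 3 wording).
MISSING_MODULE = re.compile(r"ImportError: No module named '?([\w.]+)'?")


def missing_modules_in_log(log_text):
    """Return each module named in an ImportError line, in first-seen order."""
    seen = []
    for match in MISSING_MODULE.finditer(log_text):
        name = match.group(1)
        if name not in seen:
            seen.append(name)
    return seen
```

Passing the contents of ~/uwsgi.log to this function would, for the log above, report only icu.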

To clarify: before step 1 above, I performed the following steps to install icu in my venv. I did this because ~/uwsgi.log reported that icu could not be found after running webservice --backend=kubernetes python2 start without icu installed in the venv. I had also already created the venv with virtualenv venv in the kubernetes shell, inside ~/www/python.

  1. wget http://download.icu-project.org/files/icu4c/57.1/icu4c-57_1-src.tgz
  2. gunzip -d < icu4c-57_1-src.tgz | tar xvf -
  3. cd icu/source
  4. chmod +x runConfigureICU configure install-sh
  5. ./runConfigureICU Linux --prefix=/data/project/spiarticleanalyzer/www/python/venv
  6. make
  7. make install
  8. webservice --backend=kubernetes python2 start
  9. Bottom of ~/uwsgi.log reads:
Traceback (most recent call last):
  File "./program.py", line 3, in <module>
    import icu
  File "/data/project/project/www/python/venv/local/lib/python2.7/site-packages/icu.py", line 37, in <module>
    from docs import *
  File "/data/project/project/www/python/venv/local/lib/python2.7/site-packages/docs.py", line 23, in <module>
    from _icu import *
ImportError: libicui18n.so.48: cannot open shared object file: No such file or directory

Any library dependency issues (like the libicui18n one) could be rectified with a global installation, so to speak, of icu on both kubernetes and bastion.
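
Note that the traceback asks for libicui18n.so.48, i.e. ICU major version 48, while the steps above build ICU 57.1, so the compiled extension and the library that was built do not match. To check which ICU shared libraries the system can actually find, something like the following may help. It is a sketch under assumptions: `locate_icu_libraries` is a made-up name, and `ctypes.util.find_library` searches the standard system locations, so a library installed only under a venv prefix may still show up as None.

```python
import ctypes.util

# The ICU libraries that PyICU's compiled extension links against.
ICU_LIBRARIES = ("icui18n", "icuuc", "icudata")


def locate_icu_libraries():
    """Map each ICU library name to the file name found for it, or None."""
    return {name: ctypes.util.find_library(name) for name in ICU_LIBRARIES}


if __name__ == "__main__":
    for name, found in sorted(locate_icu_libraries().items()):
        print("%s: %s" % (name, found or "not found"))
```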

JustBerry renamed this task from Requesting installation of mpl_toolkits.basemap, icu for tool.spiarticleanalyzer to Requesting installation of icu for tool.spiarticleanalyzer.Feb 12 2017, 8:41 PM
JustBerry updated the task description. (Show Details)
JustBerry renamed this task from Requesting installation of icu for tool.spiarticleanalyzer to tools.spiarticleanalyzer: requesting installation of icu on bastion and kubernetes .Feb 12 2017, 8:44 PM

I'm a bit lost after all this exchange. I only want to say that PyICU is an important package to offer for us, especially since it's one of the ways to ship and use the CLDR data which Wikimedia users contribute. Having the packages python-pyicu and python3-icu installed is good.

@Nemo_bis To clarify...

In Toolforge as a matter of policy we only install Python packages that are shipped as part of Ubuntu (Precise/)Trusty; in this case, they are already installed due to T63445 and T102165:

scfc@tools-bastion-03:~$ dpkg-query -l python\*-\*icu\*
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                                 Version                 Architecture            Description
+++-====================================-=======================-=======================-=============================================================================
ii  python-pyicu                         1.5-2ubuntu4            amd64                   Python extension wrapping the ICU C++ API
un  python2.7-pyicu                      <none>                  <none>                  (no description available)
ii  python3-icu                          1.5-2ubuntu4            amd64                   Python 3 extension wrapping the ICU C++ API
ii  python3-pyicu                        1.5-2ubuntu4            amd64                   dummy transitional package for PyICU Python 3 extension
un  python3.4-icu                        <none>                  <none>                  (no description available)

So this should already work for you. If you need a different version, you'll have to use virtual environments.

For Kubernetes I believe no packages are installed in the container (?), so you'll have to use virtual environments.
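
For anyone reading the dpkg-query listing above: rows starting with ii are installed, while un means the package is known but not installed. A short sketch that extracts only the installed rows follows; the function name `installed_packages` is hypothetical, and it assumes the default column layout shown above.

```python
def installed_packages(dpkg_output):
    """Return {package: version} for rows whose status column is 'ii'."""
    installed = {}
    for line in dpkg_output.splitlines():
        fields = line.split()
        # Columns are: status, name, version, architecture, description...
        if len(fields) >= 3 and fields[0] == "ii":
            installed[fields[1]] = fields[2]
    return installed
```

Applied to the listing above, this yields python-pyicu, python3-icu and python3-pyicu, all at version 1.5-2ubuntu4.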

There seem to be two concerns here:

  • No apparent libicu (icu) installed for python 2 (even for bastion)
  • Modules are not (frequently) installed on kubernetes, to keep the containers lightweight. Users are asked to build locally in a virtual environment instead. However, building icu locally gives errors. Others I have spoken with suggest that installing the debian jessie package of icu would be less error-prone than building binaries locally. Alternatively, if someone manages to install icu locally in their venv, feel free to post your steps here.

@JustBerry: Why do you think that icu is not installed for Python 2 on bastions? If I do:

scfc@tools-bastion-03:~$ python2
Python 2.7.6 (default, Oct 26 2016, 20:30:19) 
[GCC 4.8.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import icu
>>> 
scfc@tools-bastion-03:~$

that works fine.

@scfc webservice --backend=kubernetes python2 start (kubernetes), though, yields the output posted earlier in the ticket (ImportError: No module named icu). Seeing if there may be a workaround for installing icu on kubernetes, as locally installing in the venv seems to be breaking. A few other people have mentioned that they have also tried installing icu in the venv but were soon confronted with lib issues similar to the ones mentioned earlier.

PyICU requires the underlying C++ library to work, and this is in the libicu52 apt package on Debian Jessie. When this is installed, PyICU can then be installed in a virtualenv. Alternatively, the python-pyicu apt package can be installed, which installs PyICU and the underlying C++ library systemwide. One of the above packages would need to be installed in the python 2 Docker image to be used in Kubernetes containers.

Change 337603 had a related patch set uploaded (by Zhuyifei1999):
Install libicu52 on python & python2 base images

https://gerrit.wikimedia.org/r/337603

PyICU requires the underlying C++ library to work, and this is in the libicu52 apt package on Debian Jessie.

Yes, libicu52 seems to be the latest for debian jessie (as linked in the quote below). gerrit patch seems fine.

Should https://packages.debian.org/jessie/libs/libicu52 (or another icu lib) be installed on kubernetes (python 2)?

When this is installed, PyICU can then be installed in a virtualenv. Alternatively, the python-pyicu apt package can be installed, which installs PyICU and the underlying C++ library systemwide.

To note, the current version of PyICU installed for python 2 is 1.5 (1.9.5 is the most current).

@yuvipanda Since the gerrit patch above probably won't go live as soon as it is merged, how are we planning to rebuild the containers that need icu?

Change 337634 had a related patch set uploaded (by Yuvipanda):
python: Install icu dev files

https://gerrit.wikimedia.org/r/337634

Change 337634 merged by jenkins-bot:
python: Install icu dev files

https://gerrit.wikimedia.org/r/337634

@bd808 Thanks for merging. Do(es) particular container(s) need to be rebuilt now? (discussion of next steps)

Mentioned in SAL (#wikimedia-labs) [2017-02-15T00:01:24Z] <bd808> Rebuilt python and python2 Docker images (T157744)

@bd808 Thanks for merging. Do(es) particular container(s) need to be rebuilt now? (discussion of next steps)

Yes

Mentioned in SAL (#wikimedia-labs) [2017-02-15T00:01:24Z] <bd808> Rebuilt python and python2 Docker images (T157744)

{{done}}

Running containers will need to be restarted to pick up the new images.

bd808 claimed this task.
bd808 removed a project: Patch-For-Review.

Please open a new ticket to track additional issues beyond the libicu install.

@bd808 @yuvipanda

webservice --backend=kubernetes python2 shell
python
import icu + press Enter yields:

(venv)tools.spiarticleanalyzer@interactive:~$ python
Python 2.7.9 (default, Jun 29 2016, 13:08:31) 
[GCC 4.9.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import icu
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named icu

It looks like @yuvipanda committed the dev headers (merged) after @zhuyifei1999 committed the actual libicu52 (not merged). Installing https://packages.debian.org/jessie/python-pyicu should take care of PyICU and libicu52 issues on k8s (debian jessie) globally. Requesting installation of this package via apt-get. Marking task as unresolved.

python-pyicu won't be installed - you should install pyicu inside your virtualenv. We don't want to provide python libraries outside of virtualenvs anymore - that doesn't really scale very well, and ties us down to what's in Debian Jessie. So we'll only install devel files and what not, and you should install the libraries themselves directly from pip.

Kubernetes & Python 2

tools.bd808-test@tools-bastion-02:~$ webservice --backend=kubernetes python2 shell
If you don't see a command prompt, try pressing enter.
tools.bd808-test@interactive:~$
tools.bd808-test@interactive:~$ virtualenv venv-python2-icu
Running virtualenv with interpreter /usr/bin/python2
New python executable in venv-python2-icu/bin/python2
Also creating executable in venv-python2-icu/bin/python
Installing setuptools, pip...done.
tools.bd808-test@interactive:~$ venv-python2-icu/bin/pip install --upgrade pip
Downloading/unpacking pip from https://pypi.python.org/packages/b6/ac/7015eb97dc749283ffdec1c3a88ddb8ae03b8fad0f0e611408f196358da3/pip-9.0.1-py2.py3-none-any.whl#md5=297dbd16ef53bcef0447d245815f5144
  Downloading pip-9.0.1-py2.py3-none-any.whl (1.3MB): 1.3MB downloaded
Installing collected packages: pip
  Found existing installation: pip 1.5.6
    Uninstalling pip:
      Successfully uninstalled pip
Successfully installed pip
Cleaning up...
tools.bd808-test@interactive:~$ venv-python2-icu/bin/pip install pyicu
Collecting pyicu
  Downloading PyICU-1.9.5.tar.gz (181kB)
    100% |████████████████████████████████| 184kB 1.1MB/s
Installing collected packages: pyicu
  Running setup.py install for pyicu ... done
Successfully installed pyicu-1.9.5
tools.bd808-test@interactive:~$ venv-python2-icu/bin/py
python@    python2*   python2.7@
tools.bd808-test@interactive:~$ venv-python2-icu/bin/python2
Python 2.7.9 (default, Jun 29 2016, 13:08:31)
[GCC 4.9.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import icu
>>> dir(icu)
['BreakIterator', 'Calendar', 'CanonicalIterator', 'Char', 'CharacterIterator', 'CharsetDetector', 'CharsetMatch', 'ChoiceFormat', 'CollationElementIterator', 'CollationKey', 'Collator', 'CompactDecimalFormat', 'CurrencyAmount', 'CurrencyPluralInfo', 'CurrencyUnit', 'DateFormat', 'DateFormatSymbols', 'DateInterval', 'DateIntervalFormat', 'DateIntervalInfo', 'DateTimePatternGenerator', 'DecimalFormat', 'DecimalFormatSymbols', 'DictionaryBasedBreakIterator', 'FLOATING_TZNAME', 'FieldPosition', 'FilteredNormalizer2', 'FloatingTZ', 'Format', 'Formattable', 'ForwardCharacterIterator', 'GregorianCalendar', 'ICUError', 'ICU_VERSION', 'ICUtzinfo', 'InvalidArgsError', 'LEFontInstance', 'LanguageCode', 'LayoutEngine', 'ListFormatter', 'Locale', 'LocaleData', 'Measure', 'MeasureFormat', 'MeasureUnit', 'MessageFormat', 'Normalizer', 'Normalizer2', 'NumberFormat', 'ParsePosition', 'PluralFormat', 'PluralRules', 'RegexMatcher', 'RegexPattern', 'Replaceable', 'ResourceBundle', 'RuleBasedBreakIterator', 'RuleBasedCollator', 'RuleBasedNumberFormat', 'Script', 'ScriptCode', 'SearchIterator', 'SelectFormat', 'Shape', 'SimpleDateFormat', 'SimpleTimeZone', 'SpoofChecker', 'StringCharacterIterator', 'StringEnumeration', 'StringSearch', 'TimeUnitFormat', 'TimeZone', 'Transliterator', 'UBlockCode', 'UCalendarAMPMs', 'UCalendarDateFields', 'UCalendarDaysOfWeek', 'UCalendarMonths', 'UCharCategory', 'UCharCharacterIterator', 'UCharDirection', 'UCharNameChoice', 'UCollAttribute', 'UCollAttributeValue', 'UCollationResult', 'UCurrencySpacing', 'UDateTimePatternConflict', 'UDateTimePatternField', 'UDateTimePatternMatchOptions', 'UDisplayContext', 'UDisplayContextType', 'UIDNA_ALLOW_UNASSIGNED', 'UIDNA_DEFAULT', 'UIDNA_USE_STD3_RULES', 'ULocDataLocaleType', 'ULocaleDataDelimiterType', 'ULocaleDataExemplarSetType', 'UMatchDegree', 'UMeasurementSystem', 'UNICODE_VERSION', 'UNormalizationCheckResult', 'UNormalizationMode', 'UNormalizationMode2', 'UNumberCompactStyle', 'UObject', 'UProperty', 
'UPropertyNameChoice', 'URBNFRuleSetTag', 'URegexpFlag', 'UResType', 'URestrictionLevel', 'USET_ADD_CASE_MAPPINGS', 'USET_CASE_INSENSITIVE', 'USET_IGNORE_SPACE', 'UScriptCode', 'UScriptUsage', 'USearchAttribute', 'USearchAttributeValue', 'USetSpanCondition', 'USpoofChecks', 'UTimeUnitFormatStyle', 'UTransDirection', 'UTransPosition', 'U_COMPARE_CODE_POINT_ORDER', 'U_FOLD_CASE_DEFAULT', 'U_FOLD_CASE_EXCLUDE_SPECIAL_I', 'UnicodeFilter', 'UnicodeFunctor', 'UnicodeMatcher', 'UnicodeSet', 'UnicodeSetIterator', 'UnicodeString', 'VERSION', '__builtins__', '__doc__', '__file__', '__name__', '__package__']
>>>

Kubernetes & Python 3

tools.bd808-test@tools-bastion-02:~$ webservice --backend=kubernetes python shell
If you don't see a command prompt, try pressing enter.
tools.bd808-test@interactive:~$
tools.bd808-test@interactive:~$ python3 -m venv ven
venv-k8s-py2/     venv-python2-icu/
tools.bd808-test@interactive:~$ python3 -m venv venv-python3-icu
tools.bd808-test@interactive:~$ venv-python3-icu/bin/pip
pip*    pip3*   pip3.4*
tools.bd808-test@interactive:~$ venv-python3-icu/bin/pip install --upgrade pip
Downloading/unpacking pip from https://pypi.python.org/packages/b6/ac/7015eb97dc749283ffdec1c3a88ddb8ae03b8fad0f0e611408f196358da3/pip-9.0.1-py2.py3-none-any.whl#md5=297dbd16ef53bcef0447d245815f5144
  Downloading pip-9.0.1-py2.py3-none-any.whl (1.3MB): 1.3MB downloaded
Installing collected packages: pip
  Found existing installation: pip 1.5.6
    Uninstalling pip:
      Successfully uninstalled pip
Successfully installed pip
Cleaning up...
tools.bd808-test@interactive:~$ venv-python3-icu/bin/pip install pyicu
Collecting pyicu
  Using cached PyICU-1.9.5.tar.gz
Installing collected packages: pyicu
  Running setup.py install for pyicu ... done
Successfully installed pyicu-1.9.5
tools.bd808-test@interactive:~$ venv-python3-icu/bin/pyt
python@  python3@
tools.bd808-test@interactive:~$ venv-python3-icu/bin/python3
Python 3.4.2 (default, Oct  8 2014, 10:45:20)
[GCC 4.9.1] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import icu
>>> dir(icu)
['BreakIterator', 'Calendar', 'CanonicalIterator', 'Char', 'CharacterIterator', 'CharsetDetector', 'CharsetMatch', 'ChoiceFormat', 'CollationElementIterator', 'CollationKey', 'Collator', 'CompactDecimalFormat', 'CurrencyAmount', 'CurrencyPluralInfo', 'CurrencyUnit', 'DateFormat', 'DateFormatSymbols', 'DateInterval', 'DateIntervalFormat', 'DateIntervalInfo', 'DateTimePatternGenerator', 'DecimalFormat', 'DecimalFormatSymbols', 'DictionaryBasedBreakIterator', 'FLOATING_TZNAME', 'FieldPosition', 'FilteredNormalizer2', 'FloatingTZ', 'Format', 'Formattable', 'ForwardCharacterIterator', 'GregorianCalendar', 'ICUError', 'ICU_VERSION', 'ICUtzinfo', 'InvalidArgsError', 'LEFontInstance', 'LanguageCode', 'LayoutEngine', 'ListFormatter', 'Locale', 'LocaleData', 'Measure', 'MeasureFormat', 'MeasureUnit', 'MessageFormat', 'Normalizer', 'Normalizer2', 'NumberFormat', 'ParsePosition', 'PluralFormat', 'PluralRules', 'RegexMatcher', 'RegexPattern', 'Replaceable', 'ResourceBundle', 'RuleBasedBreakIterator', 'RuleBasedCollator', 'RuleBasedNumberFormat', 'Script', 'ScriptCode', 'SearchIterator', 'SelectFormat', 'Shape', 'SimpleDateFormat', 'SimpleTimeZone', 'SpoofChecker', 'StringCharacterIterator', 'StringEnumeration', 'StringSearch', 'TimeUnitFormat', 'TimeZone', 'Transliterator', 'UBlockCode', 'UCalendarAMPMs', 'UCalendarDateFields', 'UCalendarDaysOfWeek', 'UCalendarMonths', 'UCharCategory', 'UCharCharacterIterator', 'UCharDirection', 'UCharNameChoice', 'UCollAttribute', 'UCollAttributeValue', 'UCollationResult', 'UCurrencySpacing', 'UDateTimePatternConflict', 'UDateTimePatternField', 'UDateTimePatternMatchOptions', 'UDisplayContext', 'UDisplayContextType', 'UIDNA_ALLOW_UNASSIGNED', 'UIDNA_DEFAULT', 'UIDNA_USE_STD3_RULES', 'ULocDataLocaleType', 'ULocaleDataDelimiterType', 'ULocaleDataExemplarSetType', 'UMatchDegree', 'UMeasurementSystem', 'UNICODE_VERSION', 'UNormalizationCheckResult', 'UNormalizationMode', 'UNormalizationMode2', 'UNumberCompactStyle', 'UObject', 'UProperty', 
'UPropertyNameChoice', 'URBNFRuleSetTag', 'URegexpFlag', 'UResType', 'URestrictionLevel', 'USET_ADD_CASE_MAPPINGS', 'USET_CASE_INSENSITIVE', 'USET_IGNORE_SPACE', 'UScriptCode', 'UScriptUsage', 'USearchAttribute', 'USearchAttributeValue', 'USetSpanCondition', 'USpoofChecks', 'UTimeUnitFormatStyle', 'UTransDirection', 'UTransPosition', 'U_COMPARE_CODE_POINT_ORDER', 'U_FOLD_CASE_DEFAULT', 'U_FOLD_CASE_EXCLUDE_SPECIAL_I', 'UnicodeFilter', 'UnicodeFunctor', 'UnicodeMatcher', 'UnicodeSet', 'UnicodeSetIterator', 'UnicodeString', 'VERSION', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__']
>>>
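
Beyond `dir(icu)`, a quick way to confirm which build ended up in a virtualenv is to read the two version attributes that appear in the listings above, `VERSION` (PyICU itself) and `ICU_VERSION` (the underlying C++ library). A minimal sketch, with `pyicu_versions` as a made-up helper name:

```python
def pyicu_versions():
    """Return PyICU and ICU version strings, or None if icu is not importable."""
    try:
        import icu
    except ImportError:
        return None
    return {"pyicu": icu.VERSION, "icu": icu.ICU_VERSION}


if __name__ == "__main__":
    versions = pyicu_versions()
    if versions is None:
        print("icu is not importable in this environment")
    else:
        print("PyICU %(pyicu)s built against ICU %(icu)s" % versions)
```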

Commit referenced above abandoned. Task resolved.

Change 337603 abandoned by BryanDavis:
Install libicu52 on python & python2 base images

Reason:
Obsoleted by I0c4c15eda9f1c2ff72fcecc3ee918d8d1af1cbc8

https://gerrit.wikimedia.org/r/337603