Test Thumbor OpenCL smart cropping on stat1005
Open, Normal, Public

Description

Since Analytics currently has a GPU server in testing, now is a good time to check whether that particular hardware would work for Thumbor smart cropping and, if it does, how it performs.

Event Timeline

Gilles created this task. Apr 12 2019, 12:14 PM
Restricted Application added a subscriber: Aklapper. Apr 12 2019, 12:14 PM
Gilles renamed this task from Test Thumbor OpenCL on stat1005 to Test Thumbor OpenCL smart cropping on stat1005. Apr 12 2019, 12:14 PM
Gilles triaged this task as Low priority.
Gilles added a subscriber: elukey. Apr 22 2019, 2:47 PM

@elukey can you please install the "python-thumbor-wikimedia" debian package on that host? I believe it should pull all the required dependencies, including thumbor itself.

jijiki added a subscriber: jijiki. Apr 22 2019, 4:01 PM

@Gilles we need to build the package for Debian Buster first:

root@install1002:/srv/wikimedia# reprepro lsbycomponent python-thumbor-wikimedia
python-thumbor-wikimedia |        2.2-1 |  jessie-wikimedia |              main | amd64, i386, source
python-thumbor-wikimedia | 2.2-1+deb9u1 | stretch-wikimedia |              main | amd64, i386, source
python-thumbor-wikimedia | 2.4-1+deb9u1 | stretch-wikimedia | component/thumbor | amd64, i386, source

Not sure if this is already in progress under another task; maybe @jijiki or @MoritzMuehlenhoff know?

Miriam added a subscriber: Miriam. Apr 26 2019, 11:31 AM

@Gilles do you plan to test Thumbor's face detection functionalities too?

Yes, both 3D rendering of STL files and "smart cropping" (face & feature detection).

Miriam added a comment (edited). Apr 30 2019, 10:52 AM

Great, let me know if you need any sort of accuracy evaluation, happy to set up a small experiment to make sure that the detectors work as expected (i.e. they correctly identify *all* types of faces)

jijiki moved this task from Backlog/Radar to In Progress on the User-jijiki board. May 13 2019, 8:36 AM

gilles@stat1005:~$ apt-cache policy python-thumbor-wikimedia
python-thumbor-wikimedia:
  Installed: (none)
  Candidate: 2.5-1+deb10u1
  Version table:
     2.5-1+deb10u1 1001
       1001 http://apt.wikimedia.org/wikimedia buster-wikimedia/main amd64 Packages

@elukey can you please install this package on stat1005 now that it's available on Buster? It should pull all the dependencies I need, including Thumbor itself.

@Gilles done, let me know if you need more help.

Gilles raised the priority of this task from Low to Normal. May 27 2019, 5:06 AM
jijiki moved this task from In Progress to St on the User-jijiki board. Jun 18 2019, 9:56 PM

@jijiki I also need python-opencv installed on that host, thanks :)

Mentioned in SAL (#wikimedia-operations) [2019-06-21T06:44:21Z] <moritzm> installed python-opencv on stat1005 (T220811)

First smart crop test on stat1005 successful (face detection):

Now I need to figure out how to a) check that the GPU is actually being used and b) run thumbor/opencv with and without the GPU to see what the difference is.
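
For point b), one possible approach is OpenCV's process-wide OpenCL switch. The sketch below assumes an OpenCV 3.x or later build with OpenCL enabled, where the Python bindings expose `cv2.ocl.haveOpenCL()` and `cv2.ocl.setUseOpenCL()`. The helper names `time_call`, `compare_opencl` and `demo_with_opencv` are hypothetical, and the blur workload is only a stand-in: whether Thumbor's detectors actually go through OpenCV's OpenCL path is not established in this task.

```python
import time


def time_call(work, repeats=5):
    """Return the best wall-clock time in seconds over `repeats` calls of work()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        work()
        best = min(best, time.perf_counter() - start)
    return best


def compare_opencl(work, set_use_opencl, repeats=5):
    """Time work() with the OpenCL switch off, then on.

    set_use_opencl is any callable taking a bool; with OpenCV it would be
    cv2.ocl.setUseOpenCL.
    """
    results = {}
    for label, flag in (("cpu", False), ("opencl", True)):
        set_use_opencl(flag)
        work()  # warm-up run, so one-off kernel compilation is not timed
        results[label] = time_call(work, repeats)
    return results


def demo_with_opencv():
    """Stand-in workload (assumption): a Gaussian blur on a random image."""
    import cv2
    import numpy as np

    if not cv2.ocl.haveOpenCL():
        raise RuntimeError("this OpenCV build reports no OpenCL support")
    # OpenCV only dispatches to OpenCL for UMat inputs, so wrap the array.
    pixels = np.random.randint(0, 255, (2048, 2048, 3), dtype=np.uint8)
    image = cv2.UMat(pixels)
    return compare_opencl(
        lambda: cv2.GaussianBlur(image, (9, 9), 0).get(),
        cv2.ocl.setUseOpenCL,
    )
```

Calling `demo_with_opencv()` on the host returns a dict with a `cpu` and an `opencl` timing. If the two are indistinguishable while radeontop shows no activity, that would suggest the OpenCL path is not being taken at all.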

Mentioned in SAL (#wikimedia-operations) [2019-06-21T06:54:44Z] <moritzm> installed radeontop on stat1005 to diagnose GPU usage (T220811)

Change 518210 had a related patch set uploaded (by Muehlenhoff; owner: Muehlenhoff):
[operations/puppet@production] Allow gpu-testers to run radeontop

https://gerrit.wikimedia.org/r/518210

There is also /opt/rocm/bin/rocm_smi.py but apparently it needs sudo (sigh).

After ROCm 2.5 is released (we should be running 2.4), T220784 will be unblocked and we'll be able to write a simple Prometheus exporter and publish metrics.

@elukey I have Thumbor configured to do smart cropping, but with @MoritzMuehlenhoff we verified using both radeontop and rocm_smi and it's not hitting the GPU.

Thumbor relies on the python-opencv library for that stuff. I imagine that maybe python-opencv and/or dependencies may have to be compiled from source to leverage the GPU?

I'm not familiar with OpenCL, but I noticed that /etc/OpenCL/vendors contains an amdocl64.icd, which points to libamdocl64.so, which is present as /opt/opencl/lib/x86_64/libamdocl64.so.

Maybe we need to configure OpenCL (or PyOpenCL) to specifically use that icd, or maybe there's some PATH issue and it doesn't look up /opt/opencl/lib/x86_64/?
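
The path question can be checked without touching OpenCL itself. On Linux the ICD loader reads each `.icd` file under /etc/OpenCL/vendors and opens the library named inside it, so a bare library name only works if the dynamic linker can find it. The sketch below is a rough check under that assumption; the function names are hypothetical, and it only searches `LD_LIBRARY_PATH` plus any directories passed in, not the ld.so cache that the real loader also consults.

```python
import glob
import os


def read_icd_entries(vendors_dir="/etc/OpenCL/vendors"):
    """Map each .icd file to the library name or path on its first non-empty line."""
    entries = {}
    for icd_path in sorted(glob.glob(os.path.join(vendors_dir, "*.icd"))):
        with open(icd_path) as handle:
            lines = [line.strip() for line in handle if line.strip()]
        entries[icd_path] = lines[0] if lines else ""
    return entries


def locate_library(library, search_dirs):
    """Return the first existing path for `library`, or None if not found.

    Absolute paths are checked directly; bare names are looked up only in
    search_dirs.
    """
    if not library:
        return None
    if os.path.isabs(library):
        return library if os.path.exists(library) else None
    for directory in search_dirs:
        candidate = os.path.join(directory, library)
        if os.path.exists(candidate):
            return candidate
    return None


def check_icds(vendors_dir="/etc/OpenCL/vendors", extra_dirs=()):
    """Return {icd file: (library entry, resolved path or None)}."""
    env_path = os.environ.get("LD_LIBRARY_PATH", "")
    search_dirs = [d for d in env_path.split(":") if d]
    search_dirs.extend(extra_dirs)
    return {
        icd: (library, locate_library(library, search_dirs))
        for icd, library in read_icd_entries(vendors_dir).items()
    }
```

Running `check_icds()` and then `check_icds(extra_dirs=["/opt/opencl/lib/x86_64"])` would show whether the library only resolves once that directory is added, which would point to a search-path problem rather than a missing file.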

elukey added a comment (edited). Jun 21 2019, 1:26 PM

@Gilles no idea what Thumbor needs in order to use the GPU, but in theory I'd expect to find some way to configure it, or some logs indicating that it has recognized a GPU.

Also, hsa-ext-rocr-dev might be important: as stated in the other task, https://github.com/RadeonOpenCompute/ROCm/issues/761 gives a good hint about what the package contains:

As of ROCm 1.9, the remaining mechanisms that are available in these closed-source libraries are the hsa_amd_image_*, hsa_ext_image_* and hsa_ext_sampler_* functions (libhsa-ext-image64.so.1) and hsa_ext_tools_* functions (libhsa-runtime-tools64.so).

The former are used by, for example, our OpenCL runtime for "image" types. If you don't have libhsa-ext-image64.so.1 installed on your system, our OpenCL runtime will not offer image support for ROCm GPU devices. For example, clGetDeviceInfo() will return CL_FALSE for CL_DEVICE_IMAGE_SUPPORT.

So we could try to install hsa-ext-rocr-dev again and see how it goes!

Note: the package is the only non-open-source one, and we are trying to avoid using it if possible.
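
The `CL_DEVICE_IMAGE_SUPPORT` check described in the quoted issue could be scripted before and after installing the package. The sketch below assumes PyOpenCL is available on the host, which is not confirmed anywhere in this task. It relies on PyOpenCL exposing device info as attributes (`device.image_support` corresponds to `CL_DEVICE_IMAGE_SUPPORT`); the function names are hypothetical.

```python
def image_support_report(platforms):
    """Return (platform name, device name, image_support) tuples.

    `platforms` is any iterable of objects shaped like PyOpenCL platforms:
    a .name attribute and a get_devices() method whose devices have .name
    and .image_support attributes.
    """
    report = []
    for platform in platforms:
        for device in platform.get_devices():
            report.append(
                (platform.name, device.name, bool(device.image_support))
            )
    return report


def query_with_pyopencl():
    """Run the report against real hardware; needs the pyopencl package."""
    import pyopencl as cl

    return image_support_report(cl.get_platforms())
```

If `query_with_pyopencl()` lists the GPU with `False` in the last field, that would match the behaviour the issue describes when libhsa-ext-image64.so.1 is absent.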