Test Thumbor OpenCL smart cropping on stat1005
Closed, Declined (Public)

Description

Since Analytics currently has a GPU server in testing, now is a good time to check if that particular hardware would work for Thumbor smart cropping and if it does, how it performs.

Event Timeline

Gilles renamed this task from Test Thumbor OpenCL on stat1005 to Test Thumbor OpenCL smart cropping on stat1005. Apr 12 2019, 12:14 PM
Gilles triaged this task as Low priority.

@elukey can you please install the "python-thumbor-wikimedia" debian package on that host? I believe it should pull all the required dependencies, including thumbor itself.

@Gilles we need to build the package for Debian Buster first:

root@install1002:/srv/wikimedia# reprepro lsbycomponent python-thumbor-wikimedia
python-thumbor-wikimedia |        2.2-1 |  jessie-wikimedia |              main | amd64, i386, source
python-thumbor-wikimedia | 2.2-1+deb9u1 | stretch-wikimedia |              main | amd64, i386, source
python-thumbor-wikimedia | 2.4-1+deb9u1 | stretch-wikimedia | component/thumbor | amd64, i386, source

Not sure if this is already in progress for another task, maybe @jijiki or @MoritzMuehlenhoff know?

@Gilles do you plan to test Thumbor's face detection functionalities too?

Yes, both 3D rendering of STL files and "smart cropping" (face & feature detection).

Great, let me know if you need any sort of accuracy evaluation, happy to set up a small experiment to make sure that the detectors work as expected (i.e. they correctly identify *all* types of faces)

gilles@stat1005:~$ apt-cache policy python-thumbor-wikimedia
python-thumbor-wikimedia:
  Installed: (none)
  Candidate: 2.5-1+deb10u1
  Version table:
     2.5-1+deb10u1 1001
       1001 http://apt.wikimedia.org/wikimedia buster-wikimedia/main amd64 Packages

@elukey can you please install this package on stat1005 now that it's available on Buster? It should pull all the dependencies I need, including Thumbor itself.

@Gilles done, let me know if you need more help.

Gilles raised the priority of this task from Low to Medium. May 27 2019, 5:06 AM

@jijiki I also need python-opencv installed on that host, thanks :)

Mentioned in SAL (#wikimedia-operations) [2019-06-21T06:44:21Z] <moritzm> installed python-opencv on stat1005 (T220811)

First smart crop test on stat1005 successful (face detection):

smart.jpg (200×300 px, 14 KB)

Now I need to figure out how to a) check that the GPU is actually being used and b) run thumbor/opencv with and without the GPU to see what the difference is.
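For the with/without comparison, a minimal timing harness could look like the sketch below. It is not Thumbor code: `compare_opencl`, `time_runs`, and the `detect` and `set_use_opencl` parameters are hypothetical names introduced here, and the harness only measures wall-clock time, so it cannot by itself prove that the GPU was used.

```python
import statistics
import time


def time_runs(func, repeats=5):
    """Return the median wall-clock time in seconds of func() over `repeats` runs."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def compare_opencl(detect, set_use_opencl, repeats=5):
    """Time `detect` with OpenCL switched off and then on.

    `detect` is a zero-argument callable that runs one detection pass
    (hypothetical; supplied by the caller). `set_use_opencl` is a callable
    taking a bool that toggles OpenCL in the underlying library.
    """
    results = {}
    for label, enabled in (("opencl_off", False), ("opencl_on", True)):
        set_use_opencl(enabled)
        detect()  # warm-up run, excluded from the timings
        results[label] = time_runs(detect, repeats)
    return results
```

With OpenCV's Python bindings, `cv2.ocl.setUseOpenCL` could be passed as the toggle and `cv2.ocl.haveOpenCL()` checked first. Whether a given detection call is dispatched to OpenCL at all depends on how the images are passed to OpenCV, so near-identical timings would be a hint to look at that and to cross-check with radeontop.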

Mentioned in SAL (#wikimedia-operations) [2019-06-21T06:54:44Z] <moritzm> installed radeontop on stat1005 to diagnose GPU usage (T220811)

Change 518210 had a related patch set uploaded (by Muehlenhoff; owner: Muehlenhoff):
[operations/puppet@production] Allow gpu-testers to run radeontop

https://gerrit.wikimedia.org/r/518210

There is also /opt/rocm/bin/rocm_smi.py but apparently it needs sudo (sigh).

After ROCm 2.5 is released (we should be running 2.4), T220784 will be unblocked and we'll be able to write a simple Prometheus exporter and publish metrics.

@elukey I have Thumbor configured to do smart cropping, but with @MoritzMuehlenhoff we verified using both radeontop and rocm_smi and it's not hitting the GPU.

Thumbor relies on the python-opencv library for that stuff. I imagine that maybe python-opencv and/or dependencies may have to be compiled from source to leverage the GPU?

I'm not familiar with OpenCL, but I noticed that /etc/OpenCL/vendors contains an amdocl64.icd, which points to libamdocl64.so, which is present as /opt/opencl/lib/x86_64/libamdocl64.so

Maybe we need to configure OpenCL (or PyOpenCL) to specifically use that ICD, or maybe there's some PATH issue so that it doesn't look up /opt/opencl/lib/x86_64/?
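One way to test the lookup theory is to read each .icd file and try to load the library it names, roughly what an ICD loader does. The sketch below uses only the Python standard library; `read_icd_entries`, `can_dlopen`, and `check_icds` are hypothetical helper names, and it assumes each .icd file holds a single library name or path.

```python
import ctypes
import glob
import os


def read_icd_entries(vendors_dir="/etc/OpenCL/vendors"):
    """Map each .icd file name to the library name or path it contains."""
    entries = {}
    for icd_path in sorted(glob.glob(os.path.join(vendors_dir, "*.icd"))):
        with open(icd_path) as handle:
            entries[os.path.basename(icd_path)] = handle.read().strip()
    return entries


def can_dlopen(library):
    """Return True if the dynamic linker can load `library` in this process."""
    try:
        ctypes.CDLL(library)
    except OSError:
        return False
    return True


def check_icds(vendors_dir="/etc/OpenCL/vendors", loader=can_dlopen):
    """Return {icd file: (library, loadable)} for every registered vendor."""
    return {
        name: (library, bool(library) and loader(library))
        for name, library in read_icd_entries(vendors_dir).items()
    }


if __name__ == "__main__":
    for name, (library, ok) in check_icds().items():
        status = "loadable" if ok else "NOT loadable"
        print(f"{name}: {library} -> {status}")
```

A bare library name in an .icd file is resolved through the normal dynamic linker search (LD_LIBRARY_PATH, the ld.so cache, default directories), so a "NOT loadable" result for the AMD entry would be consistent with the directory not being on that search path.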

@Gilles no idea what Thumbor needs in order to use the GPU, but in theory I'd expect to find some way to configure it, or some logs indicating that it has recognized a GPU.

Also, hsa-ext-rocr-dev might be important: as stated in the other task, https://github.com/RadeonOpenCompute/ROCm/issues/761 gives a good hint about what the package contains:

As of ROCm 1.9, the remaining mechanisms that are available in these closed-source libraries are the hsa_amd_image_*, hsa_ext_image_* and hsa_ext_sampler_* functions (libhsa-ext-image64.so.1) and hsa_ext_tools_* functions (libhsa-runtime-tools64.so).

The former are used by, for example, our OpenCL runtime for "image" types. If you don't have libhsa-ext-image64.so.1 installed on your system, our OpenCL runtime will not offer image support for ROCm GPU devices. For example, clGetDeviceInfo() will return CL_FALSE for CL_DEVICE_IMAGE_SUPPORT.

So we could try to install hsa-ext-rocr-dev again and see how it goes!

Note: the package is the only non-open-source one, and we are trying to avoid using it if possible.

I don't have time to pursue this anymore, nor do I have the Linux wrangling skills to troubleshoot this issue anyway. I thought it would be a straightforward thing to try out, but it's clear now that it's a small project of its own to get Thumbor to leverage the GPU properly. This is something that can be revisited by whoever inherits Thumbor maintenance.

Change 518210 abandoned by Elukey:
Allow gpu-testers to run radeontop

Reason:
Created https://gerrit.wikimedia.org/r/#/c/587726/

https://gerrit.wikimedia.org/r/518210