Since Analytics currently has a GPU server in testing, now is a good time to check if that particular hardware would work for Thumbor smart cropping and if it does, how it performs.
|operations/puppet : production||Allow gpu-testers to run radeontop|
|Open||Miriam||T215413 Image Classification Working Group|
|Resolved||elukey||T148843 Remove computational bottlenecks in stats machine via adding a GPU that can be used to train ML models|
|Declined||Gilles||T220811 Test Thumbor OpenCL smart cropping on stat1005|
|Resolved||jijiki||T221562 Build Thumbor packages for buster|
@Gilles we need to build the package for Debian Buster first:
root@install1002:/srv/wikimedia# reprepro lsbycomponent python-thumbor-wikimedia
python-thumbor-wikimedia | 2.2-1 | jessie-wikimedia | main | amd64, i386, source
python-thumbor-wikimedia | 2.2-1+deb9u1 | stretch-wikimedia | main | amd64, i386, source
python-thumbor-wikimedia | 2.4-1+deb9u1 | stretch-wikimedia | component/thumbor | amd64, i386, source
Great, let me know if you need any sort of accuracy evaluation, happy to set up a small experiment to make sure that the detectors work as expected (i.e. they correctly identify *all* types of faces)
gilles@stat1005:~$ apt-cache policy python-thumbor-wikimedia
python-thumbor-wikimedia:
  Installed: (none)
  Candidate: 2.5-1+deb10u1
  Version table:
     2.5-1+deb10u1 1001
       1001 http://apt.wikimedia.org/wikimedia buster-wikimedia/main amd64 Packages
@elukey can you please install this package on stat1005 now that it's available on Buster? It should pull in all the dependencies I need, including Thumbor itself.
First smart crop test on stat1005 successful (face detection):
Now I need to figure out how to a) check that the GPU is correctly leveraged and b) somehow find a way to run thumbor/opencv with and without the GPU to see what the difference is.
Thumbor relies on the python-opencv library for that stuff. I imagine that maybe python-opencv and/or dependencies may have to be compiled from source to leverage the GPU?
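A rough sketch of how a) and b) could be approached from Python, assuming the python-opencv build on the host is OpenCV 3.x or later and exposes the cv2.ocl module. The helper names best_time and compare_opencl are made up for illustration, and `work` stands for whatever callable runs the detection step. As far as I understand, OpenCV only dispatches to OpenCL when inputs are passed as cv2.UMat, so the callable would probably need to wrap its image that way for the comparison to mean anything.

```python
from timeit import default_timer

try:
    import cv2  # python-opencv; assumed to expose cv2.ocl (OpenCV 3.x+)
except ImportError:
    cv2 = None


def best_time(func, repeats=5):
    """Return the best wall-clock time in seconds over `repeats` calls."""
    best = float("inf")
    for _ in range(repeats):
        start = default_timer()
        func()
        best = min(best, default_timer() - start)
    return best


def compare_opencl(work, repeats=5):
    """Time `work` with OpenCV's OpenCL path disabled, then enabled."""
    if cv2 is None:
        raise RuntimeError("cv2 (python-opencv) is not importable")
    previous = cv2.ocl.useOpenCL()
    results = {"have_opencl": cv2.ocl.haveOpenCL()}
    try:
        for enabled in (False, True):
            cv2.ocl.setUseOpenCL(enabled)
            results[enabled] = {
                # Reports whether the OpenCL path is actually active now.
                "use_opencl": cv2.ocl.useOpenCL(),
                "seconds": best_time(work, repeats),
            }
    finally:
        # Restore whatever setting was active before the comparison.
        cv2.ocl.setUseOpenCL(previous)
    return results
```

If have_opencl comes back False, or use_opencl stays False even after enabling it, OpenCV is not seeing the GPU at all and the timings will only compare CPU against CPU.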
I'm not familiar with OpenCL, but I noticed that /etc/OpenCL/vendors contains an amdocl64.icd, which points to libamdocl64.so, which is present as /opt/opencl/lib/x86_64/libamdocl64.so
Maybe we need to configure OpenCL (or PyOpenCL) to specifically use that icd, or maybe there's some PATH issue and it doesn't look up /opt/opencl/lib/x86_64/?
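To narrow down the path question, here is a minimal sketch that reads every .icd file in /etc/OpenCL/vendors and tries to load the library it names through the normal dynamic linker search. The function name check_icd_files is made up for illustration, and it assumes each .icd file holds a single library name or path, which is the usual layout. A library that fails to load here would hint at a search-path problem rather than an OpenCV one.

```python
import ctypes
import glob
import os


def check_icd_files(vendors_dir="/etc/OpenCL/vendors"):
    """Return a list of (icd_file, library, loadable) tuples."""
    results = []
    for icd_path in sorted(glob.glob(os.path.join(vendors_dir, "*.icd"))):
        with open(icd_path) as handle:
            # Assumption: the file contains one library name or path.
            library = handle.read().strip()
        if not library:
            results.append((icd_path, library, False))
            continue
        try:
            # Bare names go through the regular linker search
            # (ld.so cache, LD_LIBRARY_PATH); absolute paths load directly.
            ctypes.CDLL(library)
            loadable = True
        except OSError:
            loadable = False
        results.append((icd_path, library, loadable))
    return results


if __name__ == "__main__":
    for icd_path, library, loadable in check_icd_files():
        status = "OK" if loadable else "NOT LOADABLE"
        print("%s -> %s: %s" % (icd_path, library, status))
```

Note that this actually loads the vendor library into the Python process, so it is best run as a one-off check and not from inside a service.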
@Gilles no idea what Thumbor needs in order to use the GPU, but in theory I'd expect to find some setting to configure it, or some logs indicating that it has recognized a GPU.
Also, hsa-ext-rocr-dev might be important: as stated in the other task, https://github.com/RadeonOpenCompute/ROCm/issues/761 gives a good hint about what the package contains:
As of ROCm 1.9, the remaining mechanisms that are available in these closed-source libraries are the hsa_amd_image_*, hsa_ext_image_* and hsa_ext_sampler_* functions (libhsa-ext-image64.so.1) and hsa_ext_tools_* functions (libhsa-runtime-tools64.so). The former are used by, for example, our OpenCL runtime for "image" types. If you don't have libhsa-ext-image64.so.1 installed on your system, our OpenCL runtime will not offer image support for ROCm GPU devices. For example, clGetDeviceInfo() will return CL_FALSE for CL_DEVICE_IMAGE_SUPPORT.
So we could try to install hsa-ext-rocr-dev again and see how it goes!
Note: the package is the only non-open-source one, and we are trying to avoid using it if possible.
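If someone does retry with and without hsa-ext-rocr-dev, the image-support flag mentioned in the quote can be read from Python, assuming PyOpenCL is installed on the host. The helper name image_support_report is made up for illustration; it only reads the standard name and image_support attributes that PyOpenCL exposes for platforms and devices.

```python
try:
    import pyopencl as cl  # assumption: PyOpenCL is installed
except ImportError:
    cl = None


def image_support_report(platforms):
    """Return (platform name, device name, image support) per device."""
    report = []
    for platform in platforms:
        for device in platform.get_devices():
            # image_support mirrors CL_DEVICE_IMAGE_SUPPORT.
            report.append(
                (platform.name, device.name, bool(device.image_support))
            )
    return report


if __name__ == "__main__":
    if cl is None:
        print("pyopencl is not importable")
    else:
        for row in image_support_report(cl.get_platforms()):
            print(row)
```

Running it before and after installing the package would show whether the closed-source library changes what the OpenCL runtime reports for the GPU.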
I don't have time to pursue this anymore, nor do I have the Linux wrangling skills to troubleshoot this issue anyway. I thought it would be a straightforward thing to try out, but it's clear now that getting Thumbor to leverage the GPU properly is a small project of its own. This is something that can be revisited by whoever inherits Thumbor maintenance.