Host a logo detection model for Commons images
Open, In Progress, Needs Triage · Public · 2 Estimated Story Points

Assigned To

Authored By

	mfossati
	Feb 28 2024, 3:11 PM

Description

From https://wikitech.wikimedia.org/wiki/Machine_Learning/LiftWing#Hosting_a_model:

What use case is the model going to support/resolve?

Detection of potentially copyrighted image uploads on Commons. T340546: [XL] Analysis of deletion requests on Commons highlighted that logos account for a significant chunk of files that undergo a deletion request, then get deleted from Commons. See also T340546#9583762.

Do you have a model card? If you don't know what it is, please check https://meta.wikimedia.org/wiki/Machine_learning_models.

Not yet.

What team created/trained/etc.. the model?

Structured Content, main developer @mfossati .

What tools and frameworks have you used?

Keras 3.0.4
TensorFlow 2.15.0 backend
KerasCV 0.8.1
EfficientNet V2 backbone, B0 variant pre-trained on ImageNet. See https://keras.io/api/keras_cv/models/#backbone-presets

What kind of data was the model trained with, and what kind of data the model is going to need in production (for example, calls to internal/external services, special datasources for features, etc..) ?

Commons images. The model will require an image file and will output whether the given file is a logo or not.

If you have a minimal codebase that you used to run the first tests with the model, could you please share it?

Training;
evaluation;
demo classification.

State what team will own the model and please share some main point of contacts (see more info in Ownership of a model).

Structured Content, main point of contact @mfossati, tech lead @Cparle.

What is the current latency and throughput of the model, if you have tested it? We don't need anything precise at this stage, just some ballpark numbers to figure out how the model performs with the expected inputs. For example, does the model take ms/seconds/etc. to respond to queries? How does it react when 1/10/20/etc. requests are made in parallel? If you don't have these numbers, don't worry: open the task and we'll figure something out while we discuss next steps!

Not there yet.

Is there an expected frequency in which the model will have to be retrained with new data?

To be discussed. Note that current performance is already high, which may lower the priority of re-training. See T352748: [SPIKE] Image classifier prototype.

What are the resources required to train the model and what was the dataset size?

Training on CPUs with 36 threads was reasonable: I haven't measured the training time exactly, but it took a few hours.
Scaling it up to GPUs would be nice to have.
The dataset size is 1.1 GB with 23,325 training samples.

Have you checked if the output of your model is safe from a human rights point of view? Is there any risk of it being offensive for somebody? Even if you have any slight worry or corner case, please tell us!

The output is a confidence score of an image being a logo, so I think it's safe. As a side note, we discussed something similar with Legal as part of T350020: Access request to deleted image files in the production Swift cluster.

Details

Other Assignee: mfossati

Title	Reference	Author	Source Branch	Dest Branch
lw_prototype: add LogoDetectionModel class	mfossati/scriptz!9	kevinbazira	lw_prototype_LogoDetectionModel_class	main
lw_prototype: image download error handling	mfossati/scriptz!8	kevinbazira	lw_prototype_image_download_error_handling	main
lw_prototype: validate input data	mfossati/scriptz!7	kevinbazira	lw_prototype_validate_input_data	main
Improve functionality	mfossati/scriptz!6	mfossati	T358676	main

Related Objects

Status	Assigned	Task
Open	None	T349641 [EPIC] MVP Logo machine detection in Upload Wizard
In Progress	kevinbazira	T358676 Host a logo detection model for Commons images
Resolved	kevinbazira	T361803 Create logo-detection model-server to be hosted on LiftWing
Resolved	kevinbazira	T362598 Prepare docker image for hosting the logo-detection model-server on LiftWing
Open	kevinbazira	T362749 Deploy logo-detection model-server to LiftWing staging
Open	kevinbazira	T363449 Configure the logo-detection model-server hosted on LiftWing to process images from Wikimedia Commons
Resolved	kevinbazira	T363294 Support building and running of logo-detection model-server via Makefile
Open	None	T363503 Ignored exception in the logo detection prototype
Open	kevinbazira	T363505 Pass the maximum number of uploads to the logo detection service
Open	kevinbazira	T363506 Pass image objects to the logo detection service
Open	kevinbazira	T367962 Return response time as part of the logo-detection response object

Event Timeline

mfossati created this task. Feb 28 2024, 3:11 PM

Restricted Application added a subscriber: Aklapper. Feb 28 2024, 3:11 PM

calbon assigned this task to kevinbazira. Mar 5 2024, 3:55 PM

calbon set the point value for this task to 2.

calbon moved this task from Unsorted to Ready To Go on the Machine-Learning-Team board. Mar 5 2024, 3:59 PM

mfossati moved this task from Triage to Tracking on the Structured-Data-Backlog board. Mar 6 2024, 3:00 PM

Thank you for providing details about the logo detection project, @mfossati! The ML team is excited to explore hosting it on LiftWing.

We have reviewed the demo you provided and would like to ask a few questions to help us effectively build, host, and provide an API to query the logo detection model server:

1. API input and preprocessing:
In the demo, the model accepts an image directory as input and preprocesses the images within specific subdirectories like 'logo' and 'out_of_domain' (see directory structure in screenshot below).

Could you please clarify how the images will be sent to the API? Will they be sent as one image or multiple images? Will you provide image links or serialized image objects? Additionally, what is the expected size of the images that will be sent?

2. API output:
The demo currently visualizes prediction results in a plotted grid as shown in the screenshot below.

LiftWing typically returns API responses as JSON objects (see example). Could you please specify the expected response format from the API?

In T358676#9629389, @kevinbazira wrote:

Thank you for providing details about the logo detection project, @mfossati! The ML team is excited to explore hosting it on LiftWing.

Cool cool @kevinbazira , thanks for picking this up!
To give you more context, the interaction with the model will happen within Commons Upload Wizard, most likely at its Upload step.

Could you please clarify how the images will be sent to the API? Will they be sent as one image or multiple images?

Multiple images. Users can upload multiple images, but typically it's only one.

Will you provide image links or serialized image objects?

Serialized image objects. More specifically, the Upload Wizard currently uses chunked uploading.

Additionally, what is the expected size of the images that will be sent?

The image size is arbitrary: it depends on what users upload through Upload Wizard. The current upper bound is 5.37 GB, but this usually applies to video files.
In any case, the model requires inputs of size 224 x 224 pixels, and the pre-processing step will take care of rescaling to that size.

LiftWing typically returns API responses as JSON objects (see example). Could you please specify the expected response format from the API?

The essentials are predictions with their probability scores. I think that a JSON array of per-file objects is a good fit:

[
    {
        "filename": "my_uploaded_file.jpg",
        "target": "logo",
        "prediction": 0.999,
        "out_of_domain": 0.001
    },

    ...

]

In the example you provided, I also see the latency key, which looks like a nice to have.
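
As a minimal sketch of how the per-file objects above could be assembled, the snippet below pairs filenames with per-class probabilities. The function name `build_response`, the `LABELS` tuple, and the rounding to three decimals are assumptions made for illustration; they are not taken from the prototype.

```python
import json

# Assumed class order; the thread only names "logo" and "out_of_domain".
LABELS = ("logo", "out_of_domain")


def build_response(filenames, probabilities, target="logo"):
    """Pair each filename with its per-class probabilities (hypothetical helper)."""
    results = []
    for filename, probs in zip(filenames, probabilities):
        scores = dict(zip(LABELS, probs))
        results.append({
            "filename": filename,
            "target": target,
            # Probability that the file belongs to the target class.
            "prediction": round(scores[target], 3),
            "out_of_domain": round(scores["out_of_domain"], 3),
        })
    return results


response = build_response(["my_uploaded_file.jpg"], [(0.999, 0.001)])
print(json.dumps(response, indent=4))
```

A latency key, if wanted, could be added next to the array by wrapping it in an outer object; that layout is left open here.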

Thank you for providing more context @mfossati. I shared this information with the team, and they have a few more questions to clarify the implementation details for the logo detection API:

In T358676#9637065, @mfossati wrote:

In T358676#9629389, @kevinbazira wrote:

Could you please clarify how the images will be sent to the API? Will they be sent as one image or multiple images?

Multiple images. Users can upload multiple images, but typically it's only one.

To prevent potential DoS vulnerabilities, we need to establish a limit on the number of images that can be sent to the API in a single request. Currently, the Upload Wizard restricts uploads to 50 files. Would you like us to maintain this limit for the API as well?

To give you more context, the interaction with the model will happen within Commons Upload Wizard, most likely at its Upload step.

Will you provide image links or serialized image objects?

Serialized image objects. More specifically, the Upload Wizard currently uses chunked uploading.

The Upload Wizard documentation warns against processing images from the Upload Stash during the upload step due to potential security risks. Will you implement security checks for serialized image objects before sending them to the API?

LiftWing typically returns API responses as JSON objects (see example). Could you please specify the expected response format from the API?

The essentials are predictions with their probability scores. I think that a JSON array of per-file objects is a good fit.
...
In the example you provided, I also see the latency key, which looks like a nice to have.

Great. Could you please provide a sample API input that specifies the expected parameters and the encoding format for serialized images?

kevinbazira mentioned this in P58864 logo-detection: prototype for JSON response. Mar 21 2024, 11:59 AM

In T358676#9645058, @kevinbazira wrote:

To prevent potential DoS vulnerabilities, we need to establish a limit on the number of images that can be sent to the API in a single request. Currently, the Upload Wizard restricts uploads to 50 files. Would you like us to maintain this limit for the API as well?

Yes, please.

Will you provide image links or serialized image objects?

Serialized image objects. More specifically, the Upload Wizard currently uses chunked uploading.

The Upload Wizard documentation warns against processing images from the Upload Stash during the upload step due to potential security risks. Will you implement security checks for serialized image objects before sending them to the API?

I spoke with my team and we think that consuming the stash URL doesn't pose security risks. As a result, instead of sending image objects to the LiftWing API, we'll send URLs.

Great. Could you please provide a sample API input that specifies the expected parameters and the encoding format for serialized images?

I think we can send a POST request with the following JSON body:

[
    {
        "filename": "my_uploaded_file.jpg",
        "url": "https://commons.wikimedia.org/wiki/Special:UploadStash/file/my_stash_filekey.png",
        "target": "logo"
    },

    ...

]

Please let me know if you prefer form-encoded data instead.
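
For illustration, a client could build such a POST request with the Python standard library as sketched below. The endpoint URL is a placeholder (the thread does not name the LiftWing URL for this model), and `build_request` is a hypothetical helper; the request is only constructed, not sent.

```python
import json
import urllib.request

# Placeholder endpoint: an assumption, not the real LiftWing URL.
ENDPOINT = "https://example.org/v1/models/logo-detection:predict"


def build_request(items, endpoint=ENDPOINT):
    """Serialize the per-file objects as a JSON body and wrap them in a POST request."""
    body = json.dumps(items).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


payload = [
    {
        "filename": "my_uploaded_file.jpg",
        "url": "https://commons.wikimedia.org/wiki/Special:UploadStash/file/my_stash_filekey.png",
        "target": "logo",
    }
]
request = build_request(payload)
# urllib.request.urlopen(request) would send it; omitted because the endpoint is a placeholder.
```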

Thank you for sharing this information, @mfossati. Based on the requirements you've shared so far, we have worked on a first pass of the prototype that takes the input JSON you specified, preprocesses it similar to the way you did in the demo, and returns the output JSON you specified (see P58917#237712). Please test it and let us know whether we've captured the key requirements correctly before we proceed working on input validation and sanitization, image limits, error handling, etc.
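
The input validation and image limit mentioned above could look roughly like the following sketch. It assumes the JSON array format proposed earlier in this thread and the 50-file limit agreed on; the function name, the error messages, and the HTTPS check are illustrative assumptions, not the code that was later merged.

```python
MAX_IMAGES = 50  # Upload Wizard limit agreed on in this thread
REQUIRED_KEYS = ("filename", "url", "target")


def validate_payload(payload):
    """Return a list of error strings; an empty list means the payload is acceptable."""
    if not isinstance(payload, list):
        return ["payload must be a JSON array"]
    if not payload:
        return ["payload is empty"]
    errors = []
    if len(payload) > MAX_IMAGES:
        errors.append(f"too many images: {len(payload)} > {MAX_IMAGES}")
    for index, item in enumerate(payload):
        if not isinstance(item, dict):
            errors.append(f"item {index}: expected an object")
            continue
        for key in REQUIRED_KEYS:
            value = item.get(key)
            if not isinstance(value, str) or not value:
                errors.append(f"item {index}: missing or empty '{key}'")
        url = item.get("url")
        # Assumed policy: only accept HTTPS URLs.
        if isinstance(url, str) and url and not url.startswith("https://"):
            errors.append(f"item {index}: url must use https")
    return errors
```

Returning a list of messages instead of raising on the first problem lets the service report every issue in a single response.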

mfossati updated https://gitlab.wikimedia.org/mfossati/scriptz/-/merge_requests/6

Improve functionality

Hey @kevinbazira , I went through P58917, took the liberty of versioning it, and added some changes. Please have a look at https://gitlab.wikimedia.org/mfossati/scriptz/-/merge_requests/6.
Key changes involve:

JSON output
input dataset, which shouldn't have labels
label mode, which shouldn't be binary, since the model is actually multiclass

@kevinbazira: I'm hitting this ignored exception when running the code:

Exception ignored in: <function AtomicFunction.__del__ at 0x157402660>
Traceback (most recent call last):
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/cnn/lib/python3.11/site-packages/tensorflow/python/eager/polymorphic_function/atomic_function.py", line 291, in __del__
TypeError: 'NoneType' object is not subscriptable

Not harmful, but worth a check: are you hitting that, too?

Thank you for versioning the liftwing_prototype and making changes @mfossati! I tested the changes locally and got results that I shared in P58917#237822. Please have a look whenever you get a minute.

In T358676#9662292, @mfossati wrote:
@kevinbazira: I'm hitting this ignored exception when running the code:
Exception ignored in: <function AtomicFunction.__del__ at 0x157402660>
Traceback (most recent call last):
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/cnn/lib/python3.11/site-packages/tensorflow/python/eager/polymorphic_function/atomic_function.py", line 291, in __del__
TypeError: 'NoneType' object is not subscriptable
Not harmful, but worth a check: are you hitting that, too?

I am not able to reproduce this issue. I've tried running the prototype in two environments:

Google Colab with python 3.10.12, keras==3.0.4, keras-cv==0.8.1, and tensorflow==2.15.0.
local venv with python 3.9.2, keras==3.0.4, keras-cv==0.8.1, and tensorflow==2.16.1.

Could this be caused by an incompatibility with the python version shown in the error message: python3.11?

In T358676#9664492, @kevinbazira wrote:

Thank you for versioning the liftwing_prototype and making changes @mfossati! I tested the changes locally and got results that I shared in P58917#237822. Please have a look whenever you get a minute.

See answer in P58917#237851.

I am not able to reproduce this issue. I've tried running the prototype in two environments:

Google Colab with python 3.10.12, keras==3.0.4, keras-cv==0.8.1, and tensorflow==2.15.0.

local venv with python 3.9.2, keras==3.0.4, keras-cv==0.8.1, and tensorflow==2.16.1.

Could this be caused by an incompatibility with the python version shown in the error message: python3.11?

Interesting: it looks like I'm hitting it with both Python 3.10.12 and 3.12.2 in a mamba environment, on both a Linux and a Mac machine. I'm also hitting it with 3.11.5 in a venv environment on a Mac machine.
Not a blocker, but I suggest resolving this before going to production.

mfossati changed the task status from Open to In Progress. Mar 27 2024, 6:57 PM

mfossati edited projects, added Structured-Data-Backlog (Current Work); removed Structured-Data-Backlog.

mfossati moved this task from Incoming to Doing on the Structured-Data-Backlog (Current Work) board.

mfossati updated Other Assignee, added: mfossati.

mfossati merged https://gitlab.wikimedia.org/mfossati/scriptz/-/merge_requests/6

Improve functionality

kevinbazira opened https://gitlab.wikimedia.org/mfossati/scriptz/-/merge_requests/7

lw_prototype: validate input data

Maintenance_bot removed a project: Patch-For-Review. Mar 28 2024, 11:30 AM

mfossati merged https://gitlab.wikimedia.org/mfossati/scriptz/-/merge_requests/7

lw_prototype: validate input data

kevinbazira opened https://gitlab.wikimedia.org/mfossati/scriptz/-/merge_requests/8

lw_prototype: image download error handling

Maintenance_bot removed a project: Patch-For-Review. Mar 29 2024, 11:30 AM

mfossati merged https://gitlab.wikimedia.org/mfossati/scriptz/-/merge_requests/8

lw_prototype: image download error handling

kevinbazira opened https://gitlab.wikimedia.org/mfossati/scriptz/-/merge_requests/9

lw_prototype: add LogoDetectionModel class

Maintenance_bot removed a project: Patch-For-Review. Apr 3 2024, 10:30 AM

Hi @mfossati ! Thanks a lot for all this great work!
I was wondering if you had tried to train the same model using pytorch as a keras backend instead of tensorflow. The reason I'm asking is totally unrelated to the model itself but has to do with technical challenges of maintaining multiple images and backends. There is ongoing work on our side to provide better support for pytorch (related task).
This is more of a question so we can provide better support and not a request from our side as we'd be supporting keras/tensorflow models as well.

mfossati merged https://gitlab.wikimedia.org/mfossati/scriptz/-/merge_requests/9

lw_prototype: add LogoDetectionModel class

In T358676#9683661, @isarantopoulos wrote:

I was wondering if you had tried to train the same model using pytorch as a keras backend instead of tensorflow.

Hey @isarantopoulos: no, I haven't.

kevinbazira mentioned this in P58917 logo-detection: prototype for JSON input, preprocess with Keras, and return JSON output. Apr 4 2024, 7:14 AM

The prototype looks good to me, I'm excited to see this effort move to the next level!
@kevinbazira, I've especially appreciated the tightness of our development iterations 😄 .

mfossati awarded a token. Apr 4 2024, 8:57 AM

kevinbazira mentioned this in T361803: Create logo-detection model-server to be hosted on LiftWing. Apr 4 2024, 9:28 AM

Thanks @mfossati! <3
It's great to hear you're excited about moving to the next milestone.
Rest assured, in T361803, we'll maintain the tight development iterations and ensure you're kept in the loop at every key milestone as we work towards hosting the logo-detection model-server on LiftWing.

mfossati mentioned this in T363503: Ignored exception in the logo detection prototype. Apr 25 2024, 5:02 PM

kevinbazira closed subtask T363294: Support building and running of logo-detection model-server via Makefile as Resolved. Apr 26 2024, 7:31 AM

mfossati added a parent task: T349641: [EPIC] MVP Logo machine detection in Upload Wizard. Apr 29 2024, 2:17 PM

mfossati added a subtask: T363503: Ignored exception in the logo detection prototype.

mfossati added a subtask: T363505: Pass the maximum number of uploads to the logo detection service. Apr 29 2024, 2:21 PM

mfossati added a subtask: T363506: Pass image objects to the logo detection service.

kevinbazira closed subtask T361803: Create logo-detection model-server to be hosted on LiftWing as Resolved. Apr 30 2024, 4:46 AM

kevinbazira closed subtask T362598: Prepare docker image for hosting the logo-detection model-server on LiftWing as Resolved. May 2 2024, 9:23 AM

AUgolnikova-WMF mentioned this in T362749: Deploy logo-detection model-server to LiftWing staging. May 13 2024, 10:03 AM

	F42616696: logo-detection-prediction.png
	Mar 14 2024, 9:22 AM

	F42616638: logo-detection-image-dir-structure.png
	Mar 14 2024, 9:22 AM
