
Model Development for NSFW Classifier
Closed, ResolvedPublic


Design and implement a machine learning model that classifies images uploaded to Wikimedia Commons as Safe for Work (SFW) or Not Safe for Work (NSFW). Frameworks such as TensorFlow or PyTorch may be used.

The model should not be computationally intensive, and it should be well tested.
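To make the requirement concrete, here is a minimal sketch of what such a lightweight binary classifier could look like in PyTorch. The architecture, layer sizes, and input resolution below are illustrative assumptions for this task description, not the model that was actually built:

```python
# Illustrative sketch only: a deliberately small SFW/NSFW image classifier
# in PyTorch. All layer sizes here are assumptions, not the real model.
import torch
import torch.nn as nn

class TinyNSFWClassifier(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        # A small convolutional backbone keeps inference cheap.
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),  # global average pooling
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x).flatten(1)
        return self.classifier(x)

model = TinyNSFWClassifier()
model.eval()
with torch.no_grad():
    # One dummy 224x224 RGB image stands in for an uploaded Commons file.
    logits = model(torch.randn(1, 3, 224, 224))
probs = torch.softmax(logits, dim=1)  # one probability per class (SFW, NSFW)
print(tuple(logits.shape))
```

A global-average-pooling head keeps the parameter count low, which matters for the "not computationally intensive" requirement.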

Event Timeline

Why are we trying to build a new model instead of using the one discussed at T214201? I have been using it for years and can attest to its accuracy. I do not think there is a need to reinvent the wheel.

Hi @MusikAnimal! I am the Outreachy intern currently working on this project. There are four reasons why we decided to build a new model instead:

  1. The Caffe models were built by and for Yahoo, not for Wikimedia. And since we do not have access to the dataset Yahoo curated, we have to rely blindly on the published network weights.
  2. Past comments on its implementation suggest it is moderately difficult to integrate, which is a drawback for a tool that will likely go through multiple revisions and updates.
  3. Because we would depend on a network trained on someone else's dataset, conflicts could arise between what we classify as NSFW and what other organizations classify as NSFW.
  4. The model we are working on is much smaller: after compression it should be around 35,000 KB (roughly 35 MB).
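The size figure in the last point can be sanity-checked with back-of-the-envelope arithmetic. The parameter counts below are illustrative assumptions, not measurements of the actual model:

```python
# Rough model-size arithmetic (illustrative numbers, not the real model).
def model_size_mb(num_params: int, bytes_per_param: int) -> float:
    """Approximate on-disk size of a network's weights in megabytes."""
    return num_params * bytes_per_param / 1e6

# ~9 million float32 parameters (4 bytes each) -> ~36 MB uncompressed,
# in the same ballpark as the ~35 MB compressed figure above.
print(model_size_mb(9_000_000, 4))  # 36.0

# 8-bit quantization would cut that to a quarter of the size.
print(model_size_mb(9_000_000, 1))  # 9.0
```

This is why weight compression or quantization is usually the first lever for shrinking a model's download size.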
Chtnnh closed this task as Resolved. (Edited Sat, Apr 17, 11:04 AM)
Chtnnh assigned this task to Harshineesriram.

This task has been successfully completed during the Outreachy Round 21 Internship by @Harshineesriram!

Kindly comment here if you would like further details.