
Proposal (GSoC 2020): Implement an NSFW image classifier with open_nsfw
Closed, Declined · Public

Description

Profile Information

Name: Chaitanya Mittal
IRC nickname on Freenode: chtnnh
Web Profile: https://www.github.com/chtnnh
Resume


Location: Dubai, AE
Typical working hours: 18:00 - 02:00 (UTC+4)

Synopsis

  • Short summary describing your project and how it will benefit Wikimedia projects

The proposal describes a scoring system based on Yahoo!'s open_nsfw model. By optimizing the memory usage of the deep neural network, the goal is to classify images as NSFW or SFW at upload time, against a score threshold that can be configured by wiki admins.

The immediate use cases for this model are as follows:

  1. Allow patrollers to curb vandalism in the form of NSFW images uploaded to articles where they do not belong.
  2. Allow end users to choose whether or not they want to see an NSFW image even when it is relevant to the article.

The model would also enable better moderation of, and further work on, NSFW/SFW content on wikis across Wikimedia.
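To make the threshold idea concrete, here is a minimal sketch. The score_image helper is a placeholder for whatever open_nsfw integration is ultimately used, and the 0.9 default is only an assumed starting point that wiki admins would be able to override.

```python
# Minimal sketch of the scoring idea (illustrative only).
# score_image() is a placeholder for the eventual open_nsfw integration,
# whether the model is called directly or through an HTTP service.

def score_image(image_bytes: bytes) -> float:
    """Return an NSFW probability in [0, 1] for the given image."""
    raise NotImplementedError("stands in for the open_nsfw model call")

def classify(image_bytes: bytes, threshold: float = 0.9) -> str:
    """Label an image against a wiki-configurable threshold."""
    return "NSFW" if score_image(image_bytes) >= threshold else "SFW"
```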

  • Possible Mentor(s): @Halfak
  • Have you contacted your mentors already? Yes!

Deliverables

| Dates | Milestone / Deadline / Subtask |
| --- | --- |
| Apr 27 - May 17 | Community bonding period: spend time interacting with the analytics team at Wikimedia, understand common practices and norms |
| May 18 - May 24 | Experiment with the API implementation of open_nsfw, documenting results in a format agreed with the mentor |
| May 25 - May 31 | Benchmark API performance and streamline POST requests to achieve higher throughput |
| Jun 1 - Jun 7 | Investigate existing machine vision code at Wikimedia to help streamline integration of the API into the codebase |
| Jun 8 - Jun 14 | Begin integrating the API code into the existing machine vision code at Wikimedia |
| Jun 15 - Jun 19 | Phase 1 evaluations |
| Jun 22 - Jun 28 | Complete integrating the API with the Wikimedia code |
| Jun 29 - Jul 5 | Deploy in a test environment; testing and performance benchmarking |
| Jul 6 - Jul 12 | Discuss test feedback and performance improvements with the mentor |
| Jul 13 - Jul 17 | Phase 2 evaluations |
| Jul 18 - Jul 19 | Re-integrate API code incorporating test feedback and performance improvements |
| Jul 20 - Jul 26 | Deploy to the test environment; performance benchmarking and testing |
| Jul 27 - Aug 2 | Deploy to production |
| Aug 3 - Aug 9 | Performance monitoring, documentation |
| Aug 10 - Aug 24 | Final evaluation |

In addition to code, I plan to start a blog on my portfolio website where I will write about my work on this project once every two weeks. This will help with documentation and also give some exposure to Wikimedia AI projects.

Participation

In terms of participation, I plan to communicate mainly through four channels: Phabricator for documented information, IRC for general queries, Zulip for task-specific queries, and email for official communication regarding progress.

As far as source code is concerned, I have learnt that the best way to share code is through commits. But in cases where this is not the best option, services like https://codeshare.io could be handy.

About Me

Hi! I am Chaitanya Mittal, an undergraduate in Computer Science and Engineering, currently in my first year. I am an algorithmic coder and machine learning enthusiast, and I have the distinction of qualifying for the Asia Regionals of the ACM ICPC 2018. I have previously worked with the Mozilla Foundation and the Mifos Foundation, though only for short periods of time. I am an open source enthusiast and truly believe in the power it holds to influence the world.

In particular, though, I have fallen in love with Wikimedia's vision, "Imagine a world where we can all share freely in the sum of all knowledge", and the fact that it stays true to it. In the spirit of free knowledge and collaborative code, I believe Wikimedia leads by example.

The time frame for the project is from mid-May to mid-August. My summer break runs from June until the end of August, so I will only have minor college commitments during the first two weeks of the project, and I will not let them affect my enthusiasm for the project in any way.

I am eligible for GSoC and am applying with this project.

What does making this project happen mean to you?

Having relied on Wikimedia since childhood, often without even realizing it, I understand the role that Wikimedia plays, and has long played, in shaping how knowledge is shared around the world. The successful completion of this project would help improve the way the world obtains information, and make readers' experience safer and a little more pleasant.

It would help a 19 year old realize that collaboration can lead to great things. This is what making this project happen means to me.

Past Experience

Having actively worked in open source for a year now, I have been looking for a welcoming community working towards a cause I could relate to. In the process, I have encountered multiple projects (Mozilla, Mifos), developers, and tasks. Although it is difficult to describe this experience quantitatively, I can affirm that it has helped me become a better developer. I have helped with some tasks here in the Wikimedia community as well!

T245068 is the first task that I have completed.
T246438, T246663 are tasks I am currently working on with @Halfak and have made significant progress in, as of the writing of this proposal.

At a personal level, I actively program competitively and keep myself up to date on the latest machine learning algorithms. I love both Python and C, although competitive programming makes me use C++ quite often. I am a native Linux and Bash user and prefer coding in vim or Visual Studio Code.

Any Other Info

References: T214201

Related Projects/Microtasks:

  1. Implement UI for showing NSFW images when relevant to an article
  2. Add a user preference for storing whether the user wants wikis to display NSFW images

Relevant links:

  1. API implementation of open_nsfw
  2. Yahoo! implementation
  3. Caffe, the deep learning framework employed by open_nsfw

This is a first draft and all feedback/suggestions are welcome!

Event Timeline

Chtnnh created this task. Mar 13 2020, 2:35 PM
srishakatux added a comment (edited). Mar 13 2020, 10:04 PM

@Chtnnh Hi! Thanks for your proposal. I hope that you are already in touch with potential mentors of this project. It would be nice if you could get early feedback from them on the technical details of your proposal, the feasibility of the approach you have mentioned, and whether or not it fits the scope of a 3-month project (especially as you are proposing it). If for some reason you are not able to reach out to them, please let me know! Ideally, two mentors support a GSoC project; both needn't be involved in the same capacity or have a technical background.

I also noticed that you are considering submitting this to Outreachy too? I don't think you can propose your own ideas via Outreachy, but maybe let's double check?

@srishakatux Yes I am already in touch with @Halfak who has agreed to mentor me on this project. I think we can have a discussion on the technical details here, although since I have based this proposal on an existing task, the feasibility and the scope can be easily verified. Could you also get me in touch with another mentor for the project?

Yes, I think we can submit our own ideas via Outreachy; do you have any conflicting information on this?

I think it makes the most sense to focus on page_edit as our target event. Essentially, we would be using the NSFW classifier to flag page edits that add images with NSFW content for review. I'm not sure we really need to flag NSFW content as it is contributed to commons (someone please correct me if I'm wrong).

An important first step would be to find out how much memory and CPU are required to load up the NSFW model and make a prediction about an image. If it has a small footprint, maybe we can use the model directly. If not, we might need to get clever about how we use it in production.
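A rough way to get that first estimate is sketched below. load_model and predict are placeholders for the actual open_nsfw/Caffe calls, so treat this only as an outline of the measurement, not working integration code.

```python
# Rough benchmark of model-load and per-image prediction cost (sketch).
# load_model()/predict() are placeholders for the open_nsfw Caffe calls.
import resource
import time

def benchmark(load_model, predict, image_path):
    t0 = time.time()
    model = load_model()                 # one-off cost of loading the weights
    load_s = time.time() - t0

    t1 = time.time()
    score = predict(model, image_path)   # recurring per-image cost
    predict_s = time.time() - t1

    # ru_maxrss is reported in kilobytes on Linux.
    peak_mb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024
    print(f"load: {load_s:.2f}s  predict: {predict_s:.2f}s  "
          f"peak RSS: {peak_mb:.0f} MB  score: {score:.3f}")
```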

@Chtnnh I confirmed Outreachy doesn't allow applicants to propose their own internship project. You can submit to GSoC still. If/when you move forward in the process, @Halfak might be able to suggest a second mentor.

@srishakatux My bad, but I am passionate about this project. I will be going forward with my GSoC application.

Thank you for the confirmation.

Chtnnh renamed this task from Proposal (GSoC / Outreachy 2020): Implement an NSFW image classifier with open_nsfw to Proposal (GSoC 2020): Implement an NSFW image classifier with open_nsfw. Mar 18 2020, 5:08 PM
Chtnnh updated the task description.
MusikAnimal added subscribers: Mholloway, brion (edited). Mar 20 2020, 9:22 PM

Amazing to see more interest in this! I have been fighting image vandalism for quite some time now using open_nsfw, so maybe I can offer some insight.

I think it makes the most sense to focus on page_edit as our target event. Essentially, we would be using the NSFW classifier to flag page edits that add images with NSFW content for review. I'm not sure we really need to flag NSFW content as it is contributed to commons (someone please correct me if I'm wrong).

The diagram at F31679112 explains it perfectly. We definitely do want the images scored at upload time, otherwise we wouldn't be able to use them in AbuseFilter, right? You'd need the scores before the edit is saved, and while open_nsfw is impressively fast, I think it would be too slow for the end user when they're trying to save their edit. If the scores are already stored, ORES or what have you can do a quick lookup to add a score for the page edit, so you'd get the same benefit of being able to flag edits for manual review, or preemptively stop the image from being inserted altogether.

Also, from this task it seems we are setting up the possibility of implementing something like T198550. This I assume would also require that the images themselves be scored, and not the edits.

I think we should be storing scores for local uploads, too, not just Commons.

An important first step would be to find out how much memory and CPU are required to load up the NSFW model and make a prediction about an image. If it has a small footprint, maybe we can use the model directly. If not, we might need to get clever about how we use it in production.

We have it up and running at https://nsfw.wmflabs.org, which is an m1.medium VPS instance with 4 GB RAM and 2 VCPUs. Every image added by new users on English Wikipedia gets sent to this (triggered by Special:AbuseFilter/1001), and to my knowledge it has not once hiccuped in the ~1 year of service. Some metrics are available at https://tools.wmflabs.org/nagf/?project=commtech&range=week#h_commtech-nsfw (commtech-nsfw specifically, not the other instances).

Also, if we score the images at the time of upload, then we don't need any additional CPU on every edit, right? It would just involve a quick db lookup.

I wanted to say, open_nsfw is amazingly accurate out of the box. It even picks up drawings, paintings, etc. (though they usually aren't scored as highly as live imagery). The threshold of > 0.9 is, in my experience, spot-on.
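For reference, a client for a service like this could look roughly as follows. The request and response shape shown here is an assumption for illustration, not the documented interface of nsfw.wmflabs.org; the > 0.9 threshold is the one mentioned above.

```python
# Illustrative client for an open_nsfw scoring service. The request and
# response format here is assumed, not the documented nsfw.wmflabs.org API.
import requests

def nsfw_score(image_url: str, endpoint: str = "https://nsfw.wmflabs.org") -> float:
    """Ask the scoring service for an NSFW probability for image_url."""
    resp = requests.post(endpoint, json={"url": image_url}, timeout=10)
    resp.raise_for_status()
    return float(resp.json()["score"])

if __name__ == "__main__":
    score = nsfw_score("https://example.org/some-image.jpg")
    print("flag for review" if score > 0.9 else "ok", round(score, 3))
```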


All of this said, I'm very happy to help with mentoring, too! @Mholloway, @brion and I met several times to discuss implementing this system, but it eventually lost traction. Some notes on our plans can be found at:

They may have some insight to share as well.

Chtnnh added a comment (edited). Mar 21 2020, 10:44 AM

Thank you so much for your response @MusikAnimal!

The diagram at F31679112 explains it perfectly. We definitely do want the images scored at upload time, otherwise we wouldn't be able to use them in AbuseFilter, right? You'd need the scores before the edit is saved, and while open_nsfw is impressively fast, I think it would be too slow for the end user when they're trying to save their edit.

What do you suggest we do for local uploads? As far as I understand the situation, we don't have NSFW probabilities for images on Commons, hence we need to score those too. Also, if an image is directly uploaded to a wiki page, is it not then stored on Commons?

Yes we are planning to score the images themselves, just that the trigger for the scoring of the images is page_edit.

Also, from this task it seems we are setting up the possibility of implementing something like T198550

Actually, @Halfak and I had a discussion on this and concluded that it is a little premature, considering that no wiki has vocally requested this feature and some have even outright rejected it. Also, since the discussion is subjective, its conclusion might take a significant time to reach.

We have it up and running at https://nsfw.wmflabs.org, which is an m1.medium VPS instance with 4 GB RAM and 2 VCPUs. Every image added by new users on English Wikipedia gets sent to this (triggered by Special:AbuseFilter/1001), and to my knowledge it has not once hiccuped in the ~1 year of service.

That is certainly impressive, considering the specs of the VPS. But can you give me an idea of the number of requests the service has handled in the time that it has been running? I did check out the metrics you have linked above, but couldn't get a holistic view of the same.

I just checked out @Mholloway 's fork of open-nsfw--; it seems to be doing almost exactly what I had in mind. It would give this project a solid head start!

Also, if we score the images at the time of upload, then we don't need any additional CPU on every edit, right? It would just involve a quick db lookup.

Exactly! That's the plan that I have in mind.


Thank you so much for your interest in mentoring this project! I would be glad to have you as a second mentor with @Halfak
Maybe we can get @srishakatux to guide us on the mentoring details?

What do you suggest we do for local uploads? As far as I understand the situation, we don't have NSFW probabilities for images on Commons, hence we need to score those too. Also, if an image is directly uploaded to a wiki page, is it not then stored on Commons?

The idea we had was for an image_filter_scoring table that maps directly to the image table (more info). I don't know how the media repository stuff works in MediaWiki, but the idea would be the same as if you're requesting to view an image, where it first checks the local wiki and if it's not present then it checks Commons.

Yes we are planning to score the images themselves, just that the trigger for the scoring of the images is page_edit.

So would the scores still be available to AbuseFilter (before the edit is saved)? That's really my only concern, from a counter-vandalism standpoint.

Also, from this task it seems we are setting up the possibility of implementing something like T198550

Actually, @Halfak and I had a discussion on this and concluded that it is a little premature, considering that no wiki has vocally requested this feature and some have even outright rejected it. Also, since the discussion is subjective, its conclusion might take a significant time to reach.

Indeed. There was a recent discussion on enwiki about this, which is how I found out about this task. I don't think T198550 is the main motivation for this anyway, but it's good to think ahead that this is a possibility. For instance, if we made the pre-stored scores available via an API, the community could easily create a user script if they wanted to.

That is certainly impressive, considering the specs of the VPS. But can you give me an idea of the number of requests the service has handled in the time that it has been running? I did check out the metrics you have linked above, but couldn't get a holistic view of the same.

The request rate should be identical to the hit rate of filter 1001, which at quick glance is only about one every few minutes. So I guess the fact that the VPS never went down isn't very telling. I can say though that processing individual images doesn't seem to require much CPU.

I should also mention that the model can fail for very large images, returning a 500 error. You can get around this by requesting smaller sizes (see the resolution links below the image on the file page). In my experience the algorithm performs very well even with small-ish images, provided there's enough detail, so you could probably safely downsize to around 800 pixels.
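One way to sidestep that limit, sketched below, is to ask MediaWiki for a bounded-width thumbnail via the public imageinfo API before scoring. The 800px width follows the suggestion above; treat this as illustrative rather than the exact approach the existing service uses.

```python
# Fetch a bounded-width thumbnail URL for a file before scoring it, so the
# model never receives a very large original (uses the public imageinfo API).
import requests

API = "https://commons.wikimedia.org/w/api.php"

def thumb_url(file_title: str, width: int = 800) -> str:
    """Return a thumbnail URL no wider than `width` for File:<file_title>."""
    params = {
        "action": "query",
        "format": "json",
        "titles": f"File:{file_title}",
        "prop": "imageinfo",
        "iiprop": "url",
        "iiurlwidth": width,
    }
    data = requests.get(API, params=params, timeout=10).json()
    page = next(iter(data["query"]["pages"].values()))
    return page["imageinfo"][0]["thumburl"]
```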

Thank you so much for your interest in mentoring this project! I would be glad to have you as a second mentor with Halfak
Maybe we can get srishakatux to guide us on the mentoring details?

I am new to this as well so just let me know what you need from me. I should be upfront that I'm not by any means an expert with machine learning or AI, but I am quite familiar with MediaWiki development and I have a lot of experience fighting image vandalism, so hopefully I'll be of some use :)

Best wishes and cordial prayers to the project and projectholders! I cordially thank Chaitanya for having this project. Love and prayer...❤

The idea we had was for an image_filter_scoring table that maps directly to the image table (more info). I don't know how the media repository stuff works in MediaWiki, but the idea would be the same as if you're requesting to view an image, where it first checks the local wiki and if it's not present then it checks Commons.

I think we will need someone who has an idea of how the MediaWiki repository is structured to guide us through designing these tables.
Conceptually speaking, the tables you and others have come up with work fine, but I am unsure if there are any norms we must follow for this to go into production.

So would the scores still be available to AbuseFilter (before the edit is saved)? That's really my only concern, from a counter-vandalism standpoint.

Maybe not before the edit is made, given that the model takes about 2 seconds to process an image of decent size. But certainly AbuseFilter can access the scores after the edit, to alert patrollers of potential vandalism.

Indeed. There was a recent discussion on enwiki about this, which is how I found out about this task. I don't think T198550 is the main motivation for this anyway, but it's good to think ahead that this is a possibility. For instance, if we made the pre-stored scores available via an API, the community could easily create a user script if they wanted to.

Exactly what I thought about the situation! Once we implement this model in production for one use case, it becomes available to wikis for other use cases as well. I agree that such user scripts would be easy to write, and I have no problem providing a template or pseudocode for them as part of this project either.

The request rate should be identical to the hit rate of filter 1001, which at quick glance is only about one every few minutes.

Ouch, that's a lot less impressive now. But having run the API locally as well, I can agree that the compute and memory it requires are surprisingly modest considering it is a deep neural net. The problem comes with scaling this processing to the number of edits; how many edits per second we need to handle is the first question in this regard.

I should also mention that the model can fail for very large images, returning a 500 error.

Yes, I saw that in the task discussion on T225664. It should be a minor problem to overcome, I believe.

I am new to this as well so just let me know what you need from me.

As a mentor you are required to dedicate 4 - 5 hours of your time per week to the project, in the form of helping me out or just generally guiding me on what to do next. To read about this in detail, please head over to https://google.github.io/gsocguides/mentor/


Once again, thank you so much for your interest, @MusikAnimal! Thank you @Lazy-restless for your best wishes 😄

Isaac added a subscriber: Miriam. Mar 23 2020, 5:35 PM

https://www.mediawiki.org/wiki/Manual:Image_table

@MusikAnimal I found the link describing the image table! Do check it out.

I think we will need someone who has an idea of how the MediaWiki repository is structured to guide us through designing these tables.
Conceptually speaking, the tables you and others have come up with work fine, but I am unsure if there are any norms we must follow for this to go into production.

I found the link describing the image table! Do check it out.

The image table is for whatever is uploaded to that specific wiki. Commons has its own image table, but the "repository" thing I was talking about is the logic that checks whether the image is in the local image table and, if not, falls back to Commons. This is the part I don't know much about, but Brion Vibber certainly does, and he designed the schema and implementation plan at https://www.mediawiki.org/wiki/Wikimedia_Product/NSFW_image_classifier/Storage_notes.

So would the scores still be available to AbuseFilter (before the edit is saved)? That's really my only concern, from a counter-vandalism standpoint.

Maybe not before the edit is made, given that the model takes about 2 seconds to process an image of decent size. But certainly AbuseFilter can access the scores after the edit, to alert patrollers of potential vandalism.

That defeats the purpose, then :( AbuseFilter is run just before the edit is saved (and can prevent it from being saved). Retroactive scoring is something that is available now via a bot (though not very discoverable or intuitive). Thanks to the bot and immense human effort, we have most of the long-term abuse involving image vandalism under control now at English Wikipedia. However more will certainly come (and drive-by vandals happen all the time), and other wikis have not been as fortunate. To really put a stop to it, you need the scores ahead of time to prevent the edit from being saved. We have one filter that blocks addition of images matching NSFW keywords, but this has many false positives. If we had the scores upfront, the false positive rate would be close to zero. I highly recommend we score at upload time, per the original implementation plan. It might come back to bite us later that we didn't do this from the beginning.

As a mentor you are required to dedicate 4 - 5 hours of your time per week to the project, in the form of helping me out or just generally guiding me on what to do next. To read about this in detail, please head over to https://google.github.io/gsocguides/mentor/

Thanks! I am committed :) Though as you probably know, everything has slowed down a bit due to real-life challenges. I assume this affects GSoC as a whole, to some extent, but I personally have a vested interest in fighting image vandalism so I'm happy to go the extra mile on this project.

Brion Vibber certainly does, and he designed the schema and implementation plan at https://www.mediawiki.org/wiki/Wikimedia_Product/NSFW_image_classifier/Storage_notes.

I did go through the page when you first linked to it. I thought that was a suggested design of the database for the implementation of the task. Is it not? Can you please shed some light on the repository/database structure for Commons?

AbuseFilter is run just before the edit is saved (and can prevent it from being saved). Retroactive scoring is something that is available now via a bot (though not very discoverable or intuitive). Thanks to the bot and immense human effort, we have most of the long-term abuse involving image vandalism under control now at English Wikipedia.

This model will not only reduce the human effort, but will also help improve response times to such acts of vandalism. We could enable AbuseFilter to access the scores before the edit is saved, at the cost of the end user's time, but I think making that decision will require further discussion with the community as well.

However more will certainly come (and drive-by vandals happen all the time), and other wikis have not been as fortunate. To really put a stop to it, you need the scores ahead of time to prevent the edit from being saved.

I see the point that you are making and do agree with you, but we cannot deny that despite its limitations the model will serve as an improvement to the current infrastructure. The challenge, in my opinion, is to push those limitations further to achieve even greater improvements.


I assume this affects GSoC as a whole, to some extent, but I personally have a vested interest in fighting image vandalism so I'm happy to go the extra mile on this project.

Thank you so much 😄 I don't know how the situation on the ground will affect the program, but I am sure we can work things out. Thank you so much for your interest and support.

I did go through the page when you first linked to it. I thought that was a suggested design of the database for the implementation of the task. Is it not? Can you please shed some light on the repository/database structure for Commons?

Correct, this explains the schema of the new table. Commons would have the same table, too. So if I type [[File:Example.jpg]] on French Wikipedia, it will reference the local Example.jpg only if it exists; otherwise it uses the file with the same name on Commons. Fetching scores for images should work in the same way.
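So the score lookup would presumably mirror that fallback, along these lines (a sketch only; the lookup functions and the proposed image_filter_scoring table they would read from are not existing MediaWiki code):

```python
# Sketch of the local-wiki-then-Commons fallback for score lookups.
# The accessors passed in would read the proposed image_filter_scoring
# table on the local wiki and on Commons respectively; neither exists yet.
from typing import Callable, Optional

def get_score(file_name: str,
              local_lookup: Callable[[str], Optional[float]],
              commons_lookup: Callable[[str], Optional[float]]) -> Optional[float]:
    """Return the stored NSFW score, preferring the local wiki's table."""
    score = local_lookup(file_name)        # local image table first
    if score is None:
        score = commons_lookup(file_name)  # fall back to Commons
    return score
```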

This model will not only reduce the human effort, but will also help improve response times to such acts of vandalism. We could enable AbuseFilter to access the scores before the edit is saved, at the cost of the end user's time, but I think making that decision will require further discussion with the community as well.

I don't think there's a need for community consultation on integrating with AbuseFilter. All we'd be doing is making a variable (containing the scores) available to AbuseFilter, and it's completely up to each community whether they want to use it.

I think any user-facing surfacing of the scores, such as we do with ORES, is more risky. For instance, if I see "NSFW" (it would probably be a different name) in the filters at Special:RecentChanges, that might raise eyebrows, since the definition of what's offensive varies by culture. If the values are simply stored somewhere and accessible via an API and/or AbuseFilter, they are invisible to users, and AbuseFilter managers, gadget developers, etc. can use them at will or not at all.

A few of the long-term abusers start at English Wikipedia and move on to other wikis, often small wikis that aren't regularly monitored. If we had image scores in AbuseFilter, we could create a global filter to monitor this activity. That would be amazing and replace a lot of manual work with automation.

Fetching scores for images should work in the same way.

I understand now, but I am quite intrigued as to how this would be implemented. Hopefully we can leverage someone's expertise here to simplify implementing it as part of the project.

I think any user-facing surfacing of the scores, such as we do with ORES, is more risky

I see your point. What I am thinking now is that images are queued to be scored as they are uploaded, with the scores becoming available to AbuseFilter as soon as they are calculated. The API implementation can be included in the scope of the project to encourage widespread adoption of the service across wikis. We would also need to inform AbuseFilter managers of the meaning of the NSFW scores and their relevance as an anti-vandalism tool.
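Roughly, the flow I am describing looks like this (a conceptual Python sketch only; in practice this would be a MediaWiki upload hook plus a job queue, and all the names here are placeholders):

```python
# Conceptual flow: defer scoring to a queue at upload time, then persist the
# score so AbuseFilter or an API can read it later. All names are placeholders.
import queue

scoring_queue: "queue.Queue[str]" = queue.Queue()

def on_upload_complete(file_name: str) -> None:
    """Upload hook: enqueue the file so the upload itself is not slowed down."""
    scoring_queue.put(file_name)

def scoring_worker(nsfw_score, save_score) -> None:
    """Background worker: score queued files and store the result."""
    while True:
        file_name = scoring_queue.get()
        save_score(file_name, nsfw_score(file_name))  # score becomes queryable
        scoring_queue.task_done()
```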

If we had image scores in AbuseFilter, we could create a global filter to monitor this activity. That would be amazing and replace a lot of manual work with automation.

Of course, that'd be the end goal of the project. I am glad we agree that the project could potentially save a lot of human effort.

A lot of necessary, useful, and effective planning progress is visible in the project now! Best wishes once again to Chaitanya and MusikAnimal, and lots of love and heartiest, deepest prayers... ❤❤❤

Thank you so much for your support @Lazy-restless 😄☮

A Gerrit repository has been created for the open-nsfw service.

https://gerrit.wikimedia.org/g/mediawiki/services/open-nsfw

Pavithraes closed this task as Declined. May 5 2020, 7:10 PM

@Chtnnh Congratulations on getting selected for T247847!!! ^_^