
Update open_nsfw-- for Wikimedia production deployment
Closed, Resolved · Public

Description

This task covers the work needed to make the open_nsfw-- service discussed in T214201, currently running at https://nsfw.wmflabs.org, available internally to Wikimedia software components in our production environment. There is no plan to expose this service publicly from production.

open_nsfw evaluates an image against an AI-derived model of NSFW likelihood and responds with a single floating-point value between 0.0 (least likely NSFW) and 1.0 (most likely NSFW). The open_nsfw-- project exposes this model via an HTTP API deployable with Docker.

Example:

$ curl -d 'url=https://upload.wikimedia.org/wikipedia/commons/thumb/b/b3/Jimmy_Wales_in_August_2006.jpg/319px-Jimmy_Wales_in_August_2006.jpg' https://nsfw.wmflabs.org
0.006064464803785086
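
For illustration, here is the same call from Python. This is a minimal sketch, assuming the third-party requests library; the endpoint and the url form parameter are taken from the curl example above.

# Minimal client sketch, equivalent to the curl example above.
# Assumes the third-party "requests" library is installed.
import requests

IMAGE_URL = (
    "https://upload.wikimedia.org/wikipedia/commons/thumb/b/b3/"
    "Jimmy_Wales_in_August_2006.jpg/319px-Jimmy_Wales_in_August_2006.jpg"
)

# POST the image URL as form data; the service replies with the bare score.
response = requests.post("https://nsfw.wmflabs.org", data={"url": IMAGE_URL})
response.raise_for_status()
print(float(response.text))  # e.g. 0.006064464803785086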

This functionality is urgently needed by the Community Tech team to prevent abuse, and @MusikAnimal is currently working on integrating the output of this service into AbuseFilter.

Todo

  • Replace existing Dockerfile with a Blubberfile
  • OpenAPI spec
  • GET endpoint for health checks
  • Metrics
  • Logging
  • Tests

Service name: nsfwoid
Timeline: As soon as feasible
Technologies: Python, Docker
Repository: https://github.com/mdholloway/nsfwoid
Point person: @Mholloway, @MusikAnimal

Event Timeline

Restricted Application added projects: Operations, Services. Jun 12 2019, 7:10 PM
Restricted Application added a subscriber: Aklapper.
Mholloway updated the task description. Jun 12 2019, 7:11 PM
mobrovac edited subscribers, added: Joe, akosiaris; removed: Aklapper.
ArielGlenn triaged this task as Normal priority. Jun 13 2019, 7:32 AM
Joe added a comment (edited). Jun 17 2019, 10:04 AM

Hi! A very quick skim of the upstream project suggests to me that there is no storage need; is this correct?

Besides this, I would assume we should treat this like any other service we want to deploy in production. At a minimum, a service should be observable and monitorable.

So we will need to add at least:

  • Metrics reporting (possibly in Prometheus format)
  • Logging
  • A GET endpoint (/healthz) to serve as a readiness/liveness probe, since we're going to deploy this on Kubernetes (see the sketch after this list)
  • The ability to accept the image in the request [1]
  • An OpenAPI spec (possibly including an example)
  • A .pipeline directory, basing the build on Blubber rather than the Dockerfile contained in the project, which is, to put it mildly, questionable
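
To make the health-check item concrete, here is a minimal sketch of such a probe endpoint, assuming the service keeps using aiohttp (the framework visible in the stack trace later in this task); the handler and port are illustrative, not the project's actual code.

# Minimal /healthz sketch for aiohttp; illustrative only.
from aiohttp import web

async def healthz(request):
    # Return 200 once the app can serve traffic; Kubernetes polls this
    # endpoint for its readiness/liveness checks.
    return web.Response(text="OK")

app = web.Application()
app.add_routes([web.get("/healthz", healthz)])

if __name__ == "__main__":
    web.run_app(app, port=8080)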

In general, the project seems like a stub of an API around the model, and in its current state would not pass the bar for deployment by us. That said, I don't think it will require much work to get into a deployable state.

Once this is done, you can follow the usual procedure for a new service request.

[1] I don't know the specifics of how this service will be used, but I can see all sorts of race conditions between this service being called and an image being available on upload.wikimedia.org, and in what revision/cached state.

fsero moved this task from Backlog to Goal tasks on the serviceops board. Jun 20 2019, 2:11 PM
akosiaris moved this task from Goal tasks to Backlog on the serviceops board. Jun 21 2019, 8:25 AM
Mholloway moved this task from Backlog to NSFW scoring on the Machine vision board.

Warning: Minefield ahead.

Classifiers are generally quite good when it comes to objective questions: is there a person in this image or not?
Whether something is safe for work, however, is very subjective and heavily dependent on cultural background. Things considered safe for work in one culture might be completely unsafe for work in another. The model can't take that into account and will be biased towards the dominant culture.

On Commons we try to find a balance between not showing unexpected results and not being censored. Previous attempts that disrupted this balance caused quite a bit of drama. Some pointers:

Please be aware of this.

MusikAnimal added a comment (edited). Jul 8 2019, 5:58 PM

Hi! A very quick skim of the upstream project suggests to me that there is no storage need; is this correct?

We'd like to store the scores in a database as images are uploaded. This way we can fetch them very quickly, say for use in AbuseFilter.

Things considered safe for work in one culture might be completely not safe for work in another culture.

Indeed. I think the idea is to store the NSFW scores in the database, making them available to AbuseFilter and perhaps also to an API endpoint. The scores represent a literal interpretation of NSFW, e.g. something graphic that might be deemed offensive (think SafeSearch on Google image results). It will be up to the individual communities to decide how to make use of this data. For Commons, it probably wouldn't be used at all. On content wikis, however, AbuseFilter allows us to use various heuristics to determine context and whether or not the image makes sense for the given article. For English Wikipedia specifically, this would effectively put a stop to our ongoing issue with image vandalism.
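
To make that flow concrete, here is a hypothetical sketch of scoring on upload and storing the result for later lookup. The nsfw_score table, its columns, and the SQLite backend are all invented for illustration; the real store would be a regular production database.

# Hypothetical score-on-upload flow; table and column names are invented.
import sqlite3
import requests

SCORING_URL = "https://nsfw.wmflabs.org"

def store_score(db, image_url):
    # Ask the scoring service for the NSFW probability of the image.
    response = requests.post(SCORING_URL, data={"url": image_url})
    response.raise_for_status()
    score = float(response.text)
    db.execute(
        "INSERT OR REPLACE INTO nsfw_score (image_url, score) VALUES (?, ?)",
        (image_url, score),
    )
    db.commit()
    return score

def get_score(db, image_url):
    # Fast lookup path, e.g. for an AbuseFilter check.
    row = db.execute(
        "SELECT score FROM nsfw_score WHERE image_url = ?", (image_url,)
    ).fetchone()
    return row[0] if row else None

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE nsfw_score (image_url TEXT PRIMARY KEY, score REAL)")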

To add to what MusikAnimal said: for SDC we're mainly looking at using this information to prevent "unexpected" images from showing up in an upcoming tool that will ask users to confirm (or reject) depicts statements suggested by a machine vision system. There are no planned Commons use cases beyond that.

CDanis added a subscriber: CDanis. Jul 12 2019, 4:00 PM

@Joe Thanks (belatedly) for the comments. I've got a fork at https://github.com/mdholloway/nsfwoid where I'm working on making the updates you suggested. I still need to add metrics and logging.

Mholloway renamed this task from "Internal deployment of open_nsfw-- image scoring service" to "Update open_nsfw-- for Wikimedia production deployment". Jul 12 2019, 8:47 PM
Mholloway updated the task description.
Mholloway updated the task description.
Mholloway added a comment (edited). Jul 12 2019, 8:54 PM

On the subject of race conditions and accepting raw image data: I imagine the scoring request for newly uploaded images would most likely occur in a deferred update kicked off by the UploadComplete hook, which is only triggered if the upload completes successfully and the image has an upload.wikimedia.org URL. That said, if need be, it should be easy enough to support another POST body param for accepting raw image data rather than a URL.
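
A rough sketch of what that could look like in an aiohttp handler follows. The image field name and the score_image() stub are hypothetical, standing in for the project's actual parameter naming and Caffe scoring code.

# Illustrative handler accepting either url= or raw image bytes.
import aiohttp
from aiohttp import web

async def score_image(image_bytes):
    # Stub standing in for the Caffe model call in classify_nsfw.py.
    return 0.0

async def post(request):
    data = await request.post()
    if "image" in data:
        # Raw image bytes posted as a multipart file field (hypothetical).
        image_bytes = data["image"].file.read()
    elif "url" in data:
        # Existing behavior: fetch the image from the given URL.
        async with aiohttp.ClientSession() as session:
            async with session.get(data["url"]) as resp:
                image_bytes = await resp.read()
    else:
        raise web.HTTPBadRequest(text="expected 'url' or 'image' param")
    return web.Response(text=str(await score_image(image_bytes)))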

Mholloway updated the task description. Jul 16 2019, 4:38 PM

I should mention that open_nsfw fails for some images, e.g.

$ curl -d 'url=https://upload.wikimedia.org/wikipedia/commons/b/b5/2009symfiddleii.jpg' https://nsfw.wmflabs.org
500 Internal Server Error

Server got itself in trouble

https://upload.wikimedia.org/wikipedia/commons/6/66/Thol._Thirumavalavan.jpg is another example.

Mholloway added a comment (edited). Jul 16 2019, 6:52 PM

@MusikAnimal Since I'm also working on this at the moment, I grabbed a stack trace for the image you linked:

Error handling request
Traceback (most recent call last):
  File "/home/mholloway/.local/lib/python3.6/site-packages/aiohttp/web_protocol.py", line 418, in start
    resp = await task
  File "/home/mholloway/.local/lib/python3.6/site-packages/aiohttp/web_app.py", line 458, in _handle
    resp = await handler(request)
  File "/home/mholloway/.local/lib/python3.6/site-packages/aiohttp/web_urldispatcher.py", line 890, in _iter
    resp = await method()
  File "api.py", line 54, in post
    raise e
  File "api.py", line 46, in post
    nsfw_prob = await score(image)
  File "api.py", line 23, in score
    output_layers=["prob"]
  File "/home/mholloway/code/mdholloway/nsfwoid/classify_nsfw.py", line 57, in caffe_preprocess_and_compute
    img_data_rs = resize_image(pimg, size=(256, 256))
  File "/home/mholloway/code/mdholloway/nsfwoid/classify_nsfw.py", line 30, in resize_image
    imr = im.resize(size, resample=Image.BILINEAR)
  File "/home/mholloway/.local/lib/python3.6/site-packages/PIL/Image.py", line 1890, in resize
    self.load()
  File "/home/mholloway/.local/lib/python3.6/site-packages/PIL/ImageFile.py", line 249, in load
    "(%d bytes not processed)" % len(b)
OSError: image file is truncated (3 bytes not processed)

Edit: The above trace is for the second image. The first produces the same error, except with 7 bytes not processed instead of 3.

@MusikAnimal I noticed that both of the images you linked were full-size originals, so I tried thumbnailed versions, and interestingly, thumbnails work fine in both cases.

$ curl -d 'url=https://upload.wikimedia.org/wikipedia/commons/thumb/b/b5/2009symfiddleii.jpg/640px-2009symfiddleii.jpg' https://nsfw.wmflabs.org
0.0001238075492437929

$ curl -d 'url=https://upload.wikimedia.org/wikipedia/commons/thumb/6/66/Thol._Thirumavalavan.jpg/397px-Thol._Thirumavalavan.jpg' https://nsfw.wmflabs.org
0.003234797390177846

As for what the underlying problem could be, one clue I noticed is that both images are transfers/cross-wiki uploads from enwiki.

Heh, there's an easy fix for the truncated image error:

from PIL import ImageFile
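# Tell Pillow to load truncated image files, padding the missing data,
# instead of raising OSError.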
ImageFile.LOAD_TRUNCATED_IMAGES = True

(https://stackoverflow.com/a/23575424)
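
For reference, the behavior is easy to verify in isolation, assuming Pillow is installed: truncating a JPEG in memory reproduces the exact error from the trace above, and the flag makes it load.

# Self-contained check of Pillow's LOAD_TRUNCATED_IMAGES flag.
from io import BytesIO
from PIL import Image, ImageFile

# Create a small JPEG in memory and cut off its last few bytes.
buf = BytesIO()
Image.new("RGB", (64, 64), "gray").save(buf, format="JPEG")
truncated = buf.getvalue()[:-7]

ImageFile.LOAD_TRUNCATED_IMAGES = False
try:
    Image.open(BytesIO(truncated)).load()
except OSError as err:
    print("without flag:", err)  # "image file is truncated ..."

ImageFile.LOAD_TRUNCATED_IMAGES = True
Image.open(BytesIO(truncated)).load()  # succeeds, padding the missing data
print("with flag: loaded OK")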

Updated https://github.com/mdholloway/nsfwoid.

In T225664#5338710, Mholloway wrote:

Heh, there's an easy fix for the truncated image error

Nice! I was going to say we could just use a thumbnail if the image is too big, say over 1000px. But an actual fix is better :) Thanks for looking into it!

Mholloway updated the task description. Jul 17 2019, 8:54 PM
Mholloway updated the task description. Jul 17 2019, 10:55 PM
Mholloway added a comment (edited). Jul 17 2019, 11:19 PM

@Joe I've updated the fork at https://github.com/mdholloway/nsfwoid according to your comments and generally to more closely resemble the repo structure of our other services. Please let me know if it looks reasonable, and I'll request a repo to move it into Gerrit.

(Anticipating one concern: I know that the production image should copy resources from the build image rather than being based on it. I just haven't figured out the combination of resources to copy over that results in a working numpy; a specific .so isn't found even when it's present on the resulting image, in the same location as in the build variant.)

Tgr added a subscriber: Tgr. Jul 19 2019, 10:35 AM

How is this related to T214201: Implement NSFW image classifier using Open NSFW? It seems unnecessary to do both.

(To memorialize our discussion on IRC:) In meetings, discussion shifted from incorporating the model into ORES (as reflected in that ticket) to running it as a standalone web service, since getting it to work in ORES would be a nontrivial amount of work that no one has time for at the moment, and CommTech needs the functionality urgently. I guess that shift isn't reflected anywhere here on Phabricator.

In the event that the model, or something similar, were integrated into ORES, then yes, we could shut down this service.

Joe added a comment. Jul 22 2019, 8:08 AM

@Joe Thanks (belatedly) for the comments. I've got a fork at https://github.com/mdholloway/nsfwoid where I'm working on making the updates you suggested. I still need to add metrics and logging.

@Mholloway I took a brief peek, and indeed the project now looks to be in much better shape :)

I have a few questions now:

  • How is this service going to be accessed? Only via some async job, or also by the public via its API?
  • If it's public, will it be exposed via RESTBase?

Besides that, we will need more information once you're ready for deployment.

P.S. I guess you've already contacted the Security team for a review of the code, right? That's mandatory to get a service into production.

Mholloway moved this task from NSFW scoring to Backlog on the Machine vision board. Aug 1 2019, 8:24 PM
Mholloway moved this task from Backlog to Done on the Machine vision board. Aug 1 2019, 9:29 PM
Mholloway closed this task as Resolved. Tue, Oct 15, 10:14 PM

The work encompassed by this task (updating the open_nsfw-- service) is complete, so I'm going to resolve this ticket. The larger initiative is stalled. @Joe, I will be sure to address your questions in a new ticket about production deployment if and when we decide to move forward.