Newcomer tasks: focus images of people on faces
Open, Needs Triage, Public

Description

The suggested edits module shows a preview image for the articles it suggests. When this image is of a person, it frequently cuts off the person's face. We want to improve the cropping/centering logic so that images of people show their faces. See below for examples of how it currently behaves.

image.png (603×406 px, 138 KB)

image.png (591×418 px, 113 KB)

image.png (596×419 px, 90 KB)

Event Timeline

I spoke with @Miriam about this problem, since her expertise is image recognition. She says this is a comparatively easy problem from a machine learning perspective, and she'll post here some of her thoughts.

Probably not easy enough to do it in real-time though, right? So where would we store it? AIUI storing and accessing structured data is a bit inconvenient now for Commons files, and entirely impossible for non-Commons files. (Although in theory images of living people should all be in Commons, if the wiki follows the non-free content guidelines...)
Probably something to coordinate with the Structured Data team.

On Arabic Wikipedia, images of living people are on Commons, but fair use is permitted, so some images (of deceased people) exist only on arwiki.

Right, not sure why I assumed this is about living people...

@Tgr face detection is pretty fast. I can compute the average detection time on a stat machine, but it would be useful to get an idea of how fast "real time" needs to be.
In that case, would something like a simple API work for this purpose? Something that, given an image name, returns the bounding box of the face, if one is detected?

Also - could Image Annotations be useful for storing bounding box values? I know how to add image annotations, but where/how these are stored is a bit unclear to me.

@Tgr face detection is pretty fast. I can compute the average detection time on a stat machine, but it would be useful to get an idea of how fast "real time" needs to be.

The AQS API takes around 150ms so if it is faster than that, we can spend a parallel request without slowing navigation down.
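As a rough illustration of the "parallel request" idea, here is a minimal sketch that runs a face lookup alongside another request and gives up on the face lookup once a time budget is spent. The function names, the fake lookups, and the returned values are all hypothetical stand-ins; only the 150 ms figure comes from the comment above.

```python
import asyncio
from typing import Awaitable, Optional, Tuple

BBox = Tuple[int, int, int, int]  # x, y, width, height in pixels


async def face_box_within_budget(lookup: Awaitable[Optional[BBox]],
                                 budget_s: float = 0.15) -> Optional[BBox]:
    """Wait for the lookup, but give up once the time budget is spent."""
    try:
        return await asyncio.wait_for(lookup, timeout=budget_s)
    except asyncio.TimeoutError:
        return None  # caller falls back to the default crop


async def fake_pageviews(delay_s: float) -> int:
    """Stand-in for the existing request (hypothetical value)."""
    await asyncio.sleep(delay_s)
    return 1234


async def fake_face_lookup(delay_s: float) -> Optional[BBox]:
    """Stand-in for a call to a hypothetical face-detection API."""
    await asyncio.sleep(delay_s)
    return (120, 40, 80, 100)


async def load_card(pageviews_delay_s: float, face_delay_s: float):
    """Issue both requests at once; total time is bounded by the slower one."""
    return await asyncio.gather(
        fake_pageviews(pageviews_delay_s),
        face_box_within_budget(fake_face_lookup(face_delay_s)),
    )
```

With this shape, a slow face lookup costs nothing beyond the budget: the card still renders, just with the default crop.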

In that case, would something like a simple API work for this purpose? Something that, given an image name, returns the bounding box of the face, if one is detected?

Image URL, probably, as converting an image name to URL can be annoying. Sounds good otherwise.
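To make the intended use of such a bounding box concrete, here is a minimal sketch of how a client could pick a crop for the preview once it has one. The `Box` type, the function name, and the choice to centre on the middle of the face are assumptions for illustration, not the module's actual cropping logic.

```python
from typing import NamedTuple, Optional


class Box(NamedTuple):
    """Axis-aligned rectangle in pixels: top-left corner plus size."""
    x: int
    y: int
    width: int
    height: int


def crop_for_preview(image_w: int, image_h: int,
                     target_w: int, target_h: int,
                     face: Optional[Box] = None) -> Box:
    """Return the largest crop with the target aspect ratio, centred on the
    face when one is given, otherwise on the image centre."""
    # Largest window with the target aspect ratio that fits in the image.
    scale = min(image_w / target_w, image_h / target_h)
    crop_w = round(target_w * scale)
    crop_h = round(target_h * scale)

    # Point to keep in the middle of the crop.
    if face is not None:
        cx = face.x + face.width / 2
        cy = face.y + face.height / 2
    else:
        cx, cy = image_w / 2, image_h / 2

    # Centre the window on that point, then clamp it inside the image.
    x = min(max(round(cx - crop_w / 2), 0), image_w - crop_w)
    y = min(max(round(cy - crop_h / 2), 0), image_h - crop_h)
    return Box(x, y, crop_w, crop_h)
```

For a 400×600 portrait shown in a 2:1 preview, the default crop is the middle band of the image, which is where faces near the top get cut off; passing the face box moves the band up to include it.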

Also - could Image Annotations be useful for storing bounding box values? I know how to add image annotations, but where/how these are stored is a bit unclear to me.

The image annotator gadget stores data as a horrible jumble of wikitext templates, definitely not something we want to use. FileAnnotations would be the nice replacement for it in theory, but I don't think it's anywhere near usable.

We could either use SDoC (a depicts statement with a qualifier like relative position within image, probably) or make our own storage. The second seems more feasible to me, but worth discussing with the structured data team probably.
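Whichever storage is chosen, the box would probably need to be stored relative to the image size, so that it stays valid for any thumbnail width. A minimal sketch of that conversion follows; the `pct:x,y,w,h` string is an assumption modelled on the IIIF Image API region syntax, and the thread does not confirm that a qualifier value would take this form.

```python
def to_relative_region(x: int, y: int, width: int, height: int,
                       image_w: int, image_h: int) -> str:
    """Express a pixel bounding box as percentages of the image size.

    The "pct:" string format is an assumption for illustration.
    """
    def pct(value: int, total: int) -> str:
        # One decimal place, without a trailing ".0".
        return f"{round(100 * value / total, 1):g}"

    parts = [pct(x, image_w), pct(y, image_h),
             pct(width, image_w), pct(height, image_h)]
    return "pct:" + ",".join(parts)
```

A consumer would multiply the percentages by the dimensions of whatever thumbnail it is displaying to recover a pixel box.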

@MMiller_WMF - can this task be discussed for prioritisation, as it may be exacerbated by the fix for T244210?

I can bring this up in our meeting with @Miriam next week.

Just wanted to +1 this task and say that iOS could also benefit from an API like this. We currently do client-side face detection in the app, but unfortunately our home screen widgets are too memory-constrained to take advantage. We have also kicked around the idea of saving the bounding boxes of our in-app face detection to the server, so that other users can take advantage of any on-device computing that has already happened (see T98637 for slightly more info).