[EPIC] Image-positioning service for storing and retrieving image focal points
Closed, Declined · Public

Assigned To

None

Authored By

	Maryana
	Feb 5 2015, 12:34 AM

Description

(Moving our mobile tech discussion to Phab to let the Services folks see it and chime in...)

Readers love the new lead banner images in the latest Android release of the Wikipedia app[0] – but currently, if we wanted to port this design change to the mobile or desktop site, images would be cropped randomly and most of the images of people would have their heads chopped off. No bueno.

The Apps team has used native face-detection libraries to avoid the chopped-off-head problem, and Max from the mobile web team found a library[1][2] that we could use to build a generalizable face-detection service for everyone (apps, mobile web, desktop). In the short term, this would unblock the mobile web team from releasing a design update to give our users more parity between our two mobile experiences (apps and mobile web). This would also unlock our ability to evolve the design of desktop lead images, too, replacing the templates that projects like WikiVoyage are using to create banner images (which are static and pretty broken on mobile).[3]

Face detection would get us 9/10ths of the way to a good user experience, but even in the apps there are still some edge-case issues with cropping and positioning, so folks from the mobile teams have also discussed a more general image-positioning service. I'm not one to let the perfect be the enemy of the good, though, so if we could just request help from the Services team to conquer the chopped-off-head problem, I'd be very grateful ;)

[0] https://play.google.com/store/apps/details?id=org.wikipedia&hl=en
[1] https://android.googlesource.com/platform/external/neven/+/master
[2] https://github.com/lqs/neven
[3] https://en.wikivoyage.org/wiki/Template:Pagebanner

Related Objects

Status	Subtype	Assigned	Task
Declined		None	T88633 [EPIC] Image-positioning service for storing and retrieving image focal points
Open		None	T91683 Allow editors control of the page image
Open		None	T95026 PageImages should be able to pick Wikidata's P18 as chosen image
Resolved		Jdlrobson	T152252 PageImages maintenance script should be more flexible
Resolved	BUG REPORT	Jdlrobson	T301588 Allow exclusion of certain page images
Open		simon04	T372221 Exclusion of certain page images does not work – notpageimage does not work on <figure>
Resolved	Spike	KSiebert	T319559 [8 hours] Create list of next steps and blockers on select PageImage

Event Timeline

Maryana created this task. Feb 5 2015, 12:34 AM

Maryana raised the priority of this task from to Needs Triage.

Maryana updated the task description.

Maryana added projects: Services, Web-Team-Backlog.

Maryana added subscribers: Maryana, MaxSem, Dbrant and 5 others.

Restricted Application added a subscriber: Aklapper. Feb 5 2015, 12:34 AM

Jdforrester-WMF subscribed. Feb 5 2015, 12:36 AM

@GWicke @Jdouglas @mobrovac Thoughts from you guys? Does this seem like a use case that a Node.js service and RESTBase could support?

Definitely! Let's put it in the hopper.

Is there a way to designate this task as a backlog item for RESTBase, other than to just tag it as RESTBase?

To answer my own question: yes.

https://phabricator.wikimedia.org/project/board/833/

Additional use-case to consider: sometimes the lead image is really not appropriate as a banner image (think scary medical conditions you don't want to see close up). It would be great if this service could support having new images pushed to it, so users could select a better lead image from all the ones in the article or on Commons without having to edit the page :)

My dream scenario
This data would be associated with the images themselves (via CommonsData?) rather than being tied to a service or specific use case.

Here's what I imagine it would look like:
Assume Commons images had a "focal area" concept which comprised a rectangle* and an optional article title (omitting lang to simplify the example):

{"rect": [0.20, 0.20, 0.12, 0.12], "title": "dog"}

( * Unit rectangle coordinates mean they can be easily applied to any size variant of the image. Note the rect coordinates above denote [origin x, origin y, width, height] )
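To illustrate why unit coordinates are convenient, here is a minimal Python sketch of scaling one focal area to any size variant. The function name `to_pixels` and the variant sizes are assumptions made up for this example; nothing like it exists in MediaWiki today.

```python
def to_pixels(rect, width, height):
    """Scale a unit rectangle [x, y, w, h] to pixels for one size variant.

    Hypothetical helper: the name and signature are illustrative only.
    """
    x, y, w, h = rect
    return (round(x * width), round(y * height),
            round(w * width), round(h * height))


dog = {"rect": [0.20, 0.20, 0.12, 0.12], "title": "dog"}

# The same stored rectangle serves a large variant and a thumbnail.
print(to_pixels(dog["rect"], 1000, 500))  # (200, 100, 120, 60)
print(to_pixels(dog["rect"], 200, 100))   # (40, 20, 24, 12)
```

Only the variant's pixel dimensions change between calls; the stored focal area stays the same.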

A given image can, of course, have more than one of these focal areas, so imagine a picture of Obama watching a dog chase a cat:

[
     {"rect": [0.20, 0.20, 0.12, 0.12], "title": "dog"}, 
     {"rect": [0.60, 0.20, 0.22, 0.30], "title": "cat"}, 
     {"rect": [0.10, 0.80, 0.12, 0.30], "title": "Barack Obama"} 
 ]

So a get/set-able "focalareas" array property would be super cool:

"focalareas": [
    {"rect": [0.20, 0.20, 0.12, 0.12], "title": "dog"},
    {"rect": [0.60, 0.20, 0.22, 0.30], "title": "cat"},
    {"rect": [0.10, 0.80, 0.12, 0.30], "title": "Barack Obama", "isFace": true, "isMainFocus": true}
]

( Note: "rect" could instead be named "region" and "focalareas" could be "subregions", if people like that better. )

With "isFace" faces would be easy to distinguish.

With "isMainFocus" an image's primary focal area would be easy to specify - it may or may not be a face, of course, but only one focal area per image can have "isMainFocus" set to true.
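The "only one main focus" rule could be enforced wherever the data is written. A rough sketch in Python follows; `main_focus` is a hypothetical helper name, and raising an error on a violation is just one possible policy.

```python
def main_focus(areas):
    """Return the focal area flagged isMainFocus, or None if there is none.

    Hypothetical helper. Raises ValueError when more than one area is
    flagged, since only one main focus per image is allowed.
    """
    mains = [a for a in areas if a.get("isMainFocus")]
    if len(mains) > 1:
        raise ValueError("only one focal area may set isMainFocus")
    return mains[0] if mains else None


focalareas = [
    {"rect": [0.20, 0.20, 0.12, 0.12], "title": "dog"},
    {"rect": [0.60, 0.20, 0.22, 0.30], "title": "cat"},
    {"rect": [0.10, 0.80, 0.12, 0.30], "title": "Barack Obama",
     "isFace": True, "isMainFocus": True},
]

print(main_focus(focalareas)["title"])  # Barack Obama
```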

Ease of use
With this approach it would be super easy for desktop, mobile or apps to clip (CSS "clip" property for web) any size variant of an image intelligently, without the need to maintain variant-specific data or literally cropped image binaries.
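As a sketch of what a client might do with a focal area, the Python below centers a fixed-size crop window on the focal area and clamps it to the image bounds. `crop_window` and all the sizes are assumptions for illustration; a web client could apply the resulting offsets through whatever clipping or positioning mechanism it prefers.

```python
def crop_window(rect, img_w, img_h, crop_w, crop_h):
    """Return (left, top) in pixels of a crop_w x crop_h window.

    The window is centered on the focal area's center, then shifted as
    needed so it stays inside the image. Hypothetical helper.
    """
    if crop_w > img_w or crop_h > img_h:
        raise ValueError("crop window must fit inside the image")
    x, y, w, h = rect
    cx = (x + w / 2) * img_w  # focal center in pixels
    cy = (y + h / 2) * img_h
    left = min(max(cx - crop_w / 2, 0), img_w - crop_w)
    top = min(max(cy - crop_h / 2, 0), img_h - crop_h)
    return round(left), round(top)


# A full-width 1000x300 banner cut from a 1000x800 image.
cat = [0.60, 0.20, 0.22, 0.30]
print(crop_window(cat, 1000, 800, 1000, 300))  # (0, 130)
```

Because the banner is as wide as the image, only the vertical offset moves; a focal area near the bottom edge would be clamped to the last possible window rather than centered.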

If the "focalareas" array could be stored such that entries were sorted in area-descending order, it would be even better - you'd always know the biggest focal area is first in the array and the smallest is last.
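A quick sketch of the area-descending ordering, in Python; `area` and `sort_by_area` are names invented for this example, and whether sorting happens at write time or read time is left open.

```python
def area(focal_area):
    """Area of a focal area's unit rectangle [x, y, w, h]."""
    _, _, w, h = focal_area["rect"]
    return w * h


def sort_by_area(areas):
    """Return a new list with the biggest focal area first."""
    return sorted(areas, key=area, reverse=True)


focalareas = [
    {"rect": [0.20, 0.20, 0.12, 0.12], "title": "dog"},
    {"rect": [0.60, 0.20, 0.22, 0.30], "title": "cat"},
    {"rect": [0.10, 0.80, 0.12, 0.30], "title": "Barack Obama"},
]

print([a["title"] for a in sort_by_area(focalareas)])
# ['cat', 'Barack Obama', 'dog']
```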

Edge cases
An optional qualitative flag (such as "isBadCoverImage") could be used to address edge cases such as surgery images, but this is a property of the overall image and as such is probably a candidate for a CommonsData property of the image, not a flag on an image sub-region.

Apps interfaces for editing regions or curating recently modified regions
Apps specifically could very quickly mock up super simple interfaces for experimenting with user editing of these focal areas if we had a way to store/retrieve such data. Think simple pinch-zoom-drag adjustment of translucent focal area overlays with one-tap to search and select an article "title" associated with the tapped focal area. An app interface for quickly curating and reviewing recently modified image focal areas would be easy too.

Web interfaces too
Web interfaces for the same, especially if informed by learnings from app proofs-of-concept, should be fairly simple.

Better Commons search
Commons image search could also be enhanced to (optionally) search against this richer dataset for better matches. Think an option to restrict searches to matching regions in images - i.e. against titles associated with image focal areas - "dog cat obama", from the example above.
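To make the search idea concrete, here is a toy Python sketch that matches a query against focal-area titles. The in-memory `images` dictionary, the file names and `match_images` are all made up for illustration; a real implementation would live in the search index, not in client code.

```python
def match_images(images, query):
    """Return names of images whose focal-area titles cover every query term.

    `images` maps a file name to its list of focal areas. Matching is a
    simple case-insensitive word match. Hypothetical helper.
    """
    terms = query.lower().split()
    hits = []
    for name, areas in images.items():
        words = set()
        for focal_area in areas:
            title = focal_area.get("title")  # title is optional
            if title:
                words.update(title.lower().split())
        if all(term in words for term in terms):
            hits.append(name)
    return hits


images = {
    "Example_chase.jpg": [
        {"rect": [0.20, 0.20, 0.12, 0.12], "title": "dog"},
        {"rect": [0.60, 0.20, 0.22, 0.30], "title": "cat"},
        {"rect": [0.10, 0.80, 0.12, 0.30], "title": "Barack Obama"},
    ],
    "Example_dog_only.jpg": [
        {"rect": [0.30, 0.30, 0.40, 0.40], "title": "dog"},
    ],
}

print(match_images(images, "dog cat obama"))  # ['Example_chase.jpg']
```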

I see several requirements here:

- some service or library for face detection / other alignment inference
- a place to store image positions, in a way that
  - is updated when images are re-uploaded
  - makes it convenient to retrieve alignment along with other image info (prop=imageinfo?)

From a logical grouping perspective I would think that whatever code does our image scaling would be best placed to integrate the image position stuff at a higher level. The calculation itself could well happen in a discrete service, but I'm hesitant to store alignment information separately from the other image information without a good reason & a good plan for keeping it up to date. The move to content hash-based image urls can help with the update problem even if stored externally, but the query part still remains.

For the lead image, we currently don't have a great way to store page properties persistently in a way that survives a re-parse. This is an area where RESTBase can potentially help, but we would again need to make sure that this information is properly updated / degrades well if the stored lead image preference is removed from the page or outright deleted.

More generally, I am a bit hesitant to start storing random bits of separate metadata per page without having a better idea of how we plan to organize page metadata in the longer term. Maybe a collection of random blobs is fine, but it might also pay off to think a bit about the general update and query requirements. There is also the idea to move wikitext-encoded page properties like categories, behavior switches etc. to their own separate blob (see T55508).

If we don't get a commons-wikibase setup or something to store the data in, I recommend entering the data into the pages via a parser function, just as we do with coordinates in GeoData.

This could store to page_props and would survive reparses, and would already give you versioning, ability to revert, etc.

Could then migrate it along with other things to wikibase or whatever Commons ends up with for versioned media metadata...

bmansurov moved this task from Incoming to Upcoming on the Web-Team-Backlog board. Feb 6 2015, 6:12 PM

GWicke moved this task from Backlog to Unnamed Column on the Services board. Mar 17 2015, 8:07 PM

... and makes it convenient to retrieve alignment along with other image info (prop=imageinfo?)

IMO this is a separate concern and we should stick to solving the root problem: helping the user frame the article's lead image (for a given platform/device?). There also seems to be a desire for the user to select (or upload?) a better lead image.

Maybe there's a separate service which suggests focal areas to assist user cropping, but AFAICT we've determined that tweaking will need to happen regardless. Also, we can already do some of this on (native) clients. Any aggregation w/ articles or imageinfo data can happen downstream.

KHammerstein subscribed. Mar 25 2015, 6:06 PM

I like Brion's idea of implementing this as a parser function (at least until Commons has some solution for storing image metadata).

@Mhurd I think having a method to just get a single aggregated rectangle, which represents the outer bounds of potentially multiple smaller focal areas (plus some amount of padding), would be good. No need to transfer all the data to the client if it's not interested in the details just to crop a lead image. Let's keep it simple and light-weight for the clients.
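A minimal Python sketch of the aggregated rectangle, assuming the unit-rectangle format [origin x, origin y, width, height] from earlier in the thread. `bounding_rect` and the default padding of 0.05 are assumptions for illustration, not a proposed API.

```python
def bounding_rect(areas, padding=0.05):
    """Return one unit rectangle [x, y, w, h] enclosing all focal areas.

    The outer bounds are grown by `padding` on every side and clamped to
    the image (0.0 to 1.0). Hypothetical helper.
    """
    if not areas:
        raise ValueError("need at least one focal area")
    rects = [a["rect"] for a in areas]
    left = min(x for x, _, _, _ in rects)
    top = min(y for _, y, _, _ in rects)
    right = max(x + w for x, _, w, _ in rects)
    bottom = max(y + h for _, y, _, h in rects)
    # Add padding, then keep the result inside the image.
    left = max(left - padding, 0.0)
    top = max(top - padding, 0.0)
    right = min(right + padding, 1.0)
    bottom = min(bottom + padding, 1.0)
    return [left, top, right - left, bottom - top]


areas = [
    {"rect": [0.20, 0.20, 0.10, 0.10], "title": "dog"},
    {"rect": [0.60, 0.40, 0.20, 0.30], "title": "cat"},
]
print(bounding_rect(areas))  # roughly [0.15, 0.15, 0.70, 0.60]
```

A client that only needs to crop a lead image would receive this single rectangle instead of the full list.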

Jdlrobson moved this task from Upcoming to 2015-16 Q4 on the Web-Team-Backlog board. Apr 7 2015, 9:47 PM

Jdlrobson moved this task from 2015-16 Q4 to Upcoming on the Web-Team-Backlog board.

Jdlrobson moved this task from Upcoming to 2016-17 Q4 on the Web-Team-Backlog board.

Jdlrobson moved this task from 2016-17 Q4 to 2015-16 Q4 on the Web-Team-Backlog board. Apr 10 2015, 6:09 PM

Jdlrobson moved this task from 2015-16 Q4 to Triaged but Future on the Web-Team-Backlog board. Apr 10 2015, 6:36 PM

bd808 mentioned this in T98147: Create an API to set page props. May 5 2015, 6:56 PM

GWicke mentioned this in T102306: Services team roadmap July - September 2015 (Q1 2015/16). Jun 12 2015, 11:27 PM

Spage mentioned this in T91683: Allow editors control of the page image. Jun 25 2015, 12:13 AM

In T88633#1018899, @brion wrote:

I recommend entering the data into the pages via a parser function, just as we do with coordinates in GeoData.

That's T91683: Allow editors control of the page image, I'll make it a blocking task for this.

Spage added a subtask: T91683: Allow editors control of the page image. Jun 25 2015, 1:07 AM

So many people struggling because readers want lead images...
When will real issues get the attention they deserve?

Wikivoyage now has lead images and supports an origin parameter:
https://www.mediawiki.org/wiki/Extension:WikidataPageBanner#Parameters_to_the_.7B.7BPAGEBANNER.7D.7D_function

A service would go a long way toward making its banners more mobile-friendly.

Jdlrobson added a project: Wikidata-Page-Banner. Sep 16 2015, 6:56 PM

Jdlrobson set Security to None.

Restricted Application added a project: Wikidata. Sep 16 2015, 6:56 PM

MaxSem unsubscribed. Sep 16 2015, 7:18 PM

JanZerebecki moved this task from incoming to monitoring on the Wikidata board. Sep 18 2015, 1:19 PM

Jdlrobson renamed this task from Image-positioning service to [EPIC] Image-positioning service for storing and retrieving image focal points. Sep 18 2015, 8:41 PM

MOUAD2001 awarded a token. Jan 5 2016, 12:00 PM

MZMcBride subscribed. Jan 26 2016, 7:27 PM

MBinder_WMF moved this task from Triaged but Future to Incoming on the Web-Team-Backlog board. Apr 27 2016, 4:36 PM

Danny_B added a project: Epic. May 6 2016, 7:48 PM

jhobs moved this task from Incoming to Epics/Goals on the Web-Team-Backlog board. Jul 21 2016, 8:51 PM

GWicke edited projects, added Services (later); removed Services. Oct 12 2016, 3:36 PM

GWicke edited projects, added Services (watching); removed Services (later). Oct 12 2016, 8:03 PM

Jdlrobson moved this task from In discussion to Tracking on the Wikidata-Page-Banner board. Nov 14 2016, 12:13 PM

Jdlrobson moved this task from Epics/Goals to 2014-15 Q4 on the Web-Team-Backlog board. Apr 13 2017, 11:08 PM