[Story] Image thumbnail urls should be included where applicable in wikidata API response for commonsMedia
Open, Medium, Public

Description

On mobile, we are experimenting with generating infoboxes via the Wikidata API.

Essentially we run this API request:

https://www.wikidata.org/wiki/Special:ApiSandbox#action=wbgetentities&format=json&ids=Q937&sites=enwiki&props=info%7Csitelinks%7Csitelinks%2Furls%7Caliases%7Clabels%7Cdescriptions%7Cclaims%7Cdatatype&normalize=&sitefilter=enwiki
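The sandbox link above maps onto a plain action API call. As a minimal sketch, the same parameters can be assembled in Python; the `build_entity_request` helper name is made up for illustration, and the sketch only builds the URL without sending it:

```python
from urllib.parse import urlencode

# Wikidata action API endpoint (the link above opens the ApiSandbox UI instead).
API_ENDPOINT = "https://www.wikidata.org/w/api.php"


def build_entity_request(item_id: str, site: str = "enwiki") -> str:
    """Return a wbgetentities URL carrying the parameters from the sandbox link.

    Hypothetical helper for illustration; it does not perform the request.
    """
    params = {
        "action": "wbgetentities",
        "format": "json",
        "ids": item_id,
        "sites": site,
        # Multi-value parameters are pipe-separated in the action API.
        "props": "|".join([
            "info", "sitelinks", "sitelinks/urls", "aliases",
            "labels", "descriptions", "claims", "datatype",
        ]),
        "normalize": "",
        "sitefilter": site,
    }
    return API_ENDPOINT + "?" + urlencode(params)


print(build_entity_request("Q937"))
```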

The response contains the value Albert Einstein signature 1934.svg for P109.
It also contains other images, e.g. Albert Einstein Head.jpg and Albert Einstein (Nobel).png.
However, it is not obvious how to turn all these titles into thumbnail URLs without additional API requests.
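To make the starting point concrete, here is a rough sketch of how a client might collect the file titles from the claims in such a response. The `commons_media_files` name and the `sample` fragment are invented for illustration; the fragment is hand-written to resemble the response shape and is not real API output.

```python
from typing import Any


def commons_media_files(entity: dict[str, Any]) -> dict[str, list[str]]:
    """Map property id -> file titles for every commonsMedia main snak."""
    files: dict[str, list[str]] = {}
    for prop, statements in entity.get("claims", {}).items():
        for statement in statements:
            snak = statement.get("mainsnak", {})
            if snak.get("datatype") != "commonsMedia":
                continue
            if snak.get("snaktype") != "value":
                continue  # "novalue"/"somevalue" snaks carry no datavalue
            files.setdefault(prop, []).append(snak["datavalue"]["value"])
    return files


# Hand-written fragment shaped like an entity from the response (assumption).
sample = {
    "claims": {
        "P18": [{"mainsnak": {
            "snaktype": "value", "property": "P18", "datatype": "commonsMedia",
            "datavalue": {"value": "Albert Einstein Head.jpg", "type": "string"},
        }}],
        "P109": [{"mainsnak": {
            "snaktype": "value", "property": "P109", "datatype": "commonsMedia",
            "datavalue": {"value": "Albert Einstein signature 1934.svg", "type": "string"},
        }}],
        "P569": [{"mainsnak": {
            "snaktype": "value", "property": "P569", "datatype": "time",
            "datavalue": {"value": {"time": "+1879-03-14T00:00:00Z"}, "type": "time"},
        }}],
    }
}

print(commons_media_files(sample))
```

This only yields titles; each title still has to be turned into a thumbnail URL, which is the gap this task is about.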

Would it be possible to return a thumbnail URL in the response via an additional API value, e.g. like so?

  {
      "mainsnak": {
          "snaktype": "value",
          "property": "P18",
          "datatype": "commonsMedia",
          "datavalue": {
              "value": "Albert Einstein Head.jpg",
              "thumb": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/d3/Albert_Einstein_Head.jpg/160px-Albert_Einstein_Head.jpg",
              "url": "http://commons.wikimedia.org/wiki/File:Albert_Einstein_Head.jpg",
              "type": "image"
          }
      }
  }

Event Timeline

Jdlrobson renamed this task from Image urls should be included where applicable in wikidata API response for commonsMedia to Image thumbnail urls should be included where applicable in wikidata API response for commonsMedia.
Jdlrobson raised the priority of this task from to Needs Triage.
Jdlrobson updated the task description.
Jdlrobson changed Security from none to None.
Jdlrobson subscribed.

I seem to remember there being an easy way to get the thumbnail url from the full size image url. I'd rather not have that in the api if that is the case. @daniel: IIRC we talked about this?

Currently I have a huge md5 library to do this and I'd rather not resort to that as the code is not reusable enough.

Would be interested to hear other clever ideas.

There's thumb.php: https://commons.wikimedia.org/w/thumb.php?f=Albert_Einstein_Head.jpg&w=30. But that takes a filename, not a full URL.
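For reference, a thumb.php link like the one above can be built from a file title with a couple of lines. This is a sketch only; `thumb_php_url` is a hypothetical helper name, and it just formats the URL without requesting it:

```python
from urllib.parse import urlencode


def thumb_php_url(filename: str, width: int) -> str:
    """Build a thumb.php URL from a file title and a width in pixels."""
    # File titles use underscores in place of spaces.
    params = {"f": filename.replace(" ", "_"), "w": width}
    return "https://commons.wikimedia.org/w/thumb.php?" + urlencode(params)


print(thumb_php_url("Albert Einstein Head.jpg", 30))
```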

I think there is another API module that can provide the thumb URL, but the question is if/how to combine these results. I don't know that we want this in the core Wikibase serialization format (e.g. stored in the database that way), and I'm not sure whether that should be an option in getentities itself or done another way.

I'm having some trouble finding how to get the thumb URL, other than using the pageimages prop (provided by https://www.mediawiki.org/wiki/Extension:PageImages).

I'm convinced there must be a way, but it may not be as easy.

For example https://upload.wikimedia.org/wikipedia/commons/thumb/c/c9/Fuerteventura_sunset_leftrocks.JPG/1024px-Fuerteventura_sunset_leftrocks.JPG ???

$uploadUrl/$site/$lang/thumb/$hash1/$hash2/$filename/${width}px-$filename

Where $hash1 is the first character of the MD5 hash of the filename (with spaces replaced by underscores) and $hash2 is the first two characters of that hash.

Or are we looking for something else? :D
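The pattern above can be sketched in Python as follows. This assumes the hash is MD5 over the file title with spaces replaced by underscores, that the file lives on Commons, and that the thumbnail name carries the `NNNpx-` prefix seen in the example URL. The `commons_thumb_url` name is invented for illustration, and the sketch ignores cases where the thumbnail name differs from the original (for instance SVG files, which are rendered to PNG) or where the requested width exceeds the original.

```python
import hashlib
from urllib.parse import quote

# Assumed base for files hosted on Commons ($uploadUrl/$site/$lang in the pattern).
UPLOAD_BASE = "https://upload.wikimedia.org/wikipedia/commons"


def commons_thumb_url(filename: str, width: int) -> str:
    """Derive a thumbnail URL from a file title without an API request."""
    name = filename.strip().replace(" ", "_")
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    encoded = quote(name)  # percent-encode characters that are unsafe in a path
    # $hash1 is the first hex character, $hash2 the first two.
    return f"{UPLOAD_BASE}/thumb/{digest[0]}/{digest[:2]}/{encoded}/{width}px-{encoded}"


print(commons_thumb_url("Albert Einstein Head.jpg", 160))
```

For the file in the task description this should reproduce the thumb URL shown there, but it hard-codes knowledge of the storage layout into the client, which is the kind of non-reusable MD5 code mentioned earlier in the thread.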

I am going to close this, assuming the previous comments address the use case.

Jdlrobson changed the task status from Declined to Invalid. Feb 24 2015, 10:53 PM

Not sure whether thumb.php scales (I assume it caches?) but this will work for the time being. Thanks!

Jdlrobson changed the task status from Invalid to Declined. Feb 24 2015, 10:53 PM
Jdlrobson added a subscriber: MaxSem.

According to @MaxSem use of thumb.php will fry the cluster.

Note that the thumbnail size will need to be selectable through some input variable if you stuff it into one API req like this -- suitable size will depend on the device and how the client software chooses to show thumbs.

(Note another workaround is to go ahead and do a prop=imageinfo request on all the given files as a second API hit. That's two round trips though which is sad, but at least you don't have to do one per image!)
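A sketch of what that second, batched request could look like, using the `prop=imageinfo` module with `iiurlwidth` so that every file comes back with a scaled URL. The `imageinfo_request` helper is hypothetical and only builds the URL; in the JSON response the scaled URL should appear as `thumburl` inside each page's `imageinfo` entry.

```python
from urllib.parse import urlencode

COMMONS_API = "https://commons.wikimedia.org/w/api.php"


def imageinfo_request(filenames: list[str], width: int) -> str:
    """Build one imageinfo query covering all given file titles."""
    params = {
        "action": "query",
        "format": "json",
        "prop": "imageinfo",
        "iiprop": "url",
        "iiurlwidth": width,  # requested thumbnail width in pixels
        # One request for all files: titles are pipe-separated.
        "titles": "|".join("File:" + name for name in filenames),
    }
    return COMMONS_API + "?" + urlencode(params)


print(imageinfo_request(
    ["Albert Einstein Head.jpg", "Albert Einstein signature 1934.svg"], 160
))
```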

I'm trying to avoid any roundtrips. It's too expensive in this use case :(

Lydia_Pintscher triaged this task as Medium priority.

Ping @MaxSem @brion would this really fry the cluster? Could thumb.php not simply be updated to redirect to the image file itself?

Hmm, well if thumb.php redirected you'd have an extra round-trip plus the overhead of hitting the PHP app servers in the first place... might not be ideal either. :(

(If the redirect gets cached that'd still have the roundtrip, though the first request would be cheaper once the thumbnail's created as we'd avoid hitting PHP app servers.)

The thumbnail URL should probably not be part of the data value as such. I propose instead having multiple data values: one for the media ID (the file name), one for the image description page, and one for the actual file URL. Perhaps there could be another one for the thumbnail URL, or for a thumbnail URL pattern in which the desired size can be filled in.
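As an illustration of the "thumbnail URL pattern" idea, a client could fill the size into a placeholder locally. The `{width}` placeholder syntax and the `fill_thumb_pattern` helper are assumptions made for this sketch; nothing in this task specifies what such a pattern would look like.

```python
def fill_thumb_pattern(pattern: str, width: int) -> str:
    """Substitute the desired width into a thumbnail URL pattern."""
    # Plain replace rather than str.format, so stray braces in a URL are harmless.
    return pattern.replace("{width}", str(width))


# Hypothetical pattern as it might be returned alongside the file name.
pattern = (
    "https://upload.wikimedia.org/wikipedia/commons/thumb/d/d3/"
    "Albert_Einstein_Head.jpg/{width}px-Albert_Einstein_Head.jpg"
)

print(fill_thumb_pattern(pattern, 160))
```

With a pattern like this the server decides the path layout and the client only picks the size, which would suit devices that need different thumbnail widths.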

Getting back to the original comment, I think we should be thinking in terms of creating an "export data package" that can be used by any Wikimedia project or any external party. Think "infobox" for Wikipedia and "image template" for Commons. Each Wikimedia project should have its own import and export wizard for such data packages. In this scenario, the sizing of the image is something the consuming project can tweak in its own "import data package" wizard. Magnus has sort of created an export data package with his "prepbio" tool, but it would be best if this were a two-step process: 1) export the Wikidata item to a data package, then 2) import the data package into English Wikipedia (or Dutch Wikipedia, or, for painting items, the Wikimedia Commons artwork template, etc.). Going the other way, it would be nice to have a Commons exporter to use for item creation on Wikidata.

Jonas renamed this task from Image thumbnail urls should be included where applicable in wikidata API response for commonsMedia to [Story] Image thumbnail urls should be included where applicable in wikidata API response for commonsMedia. Nov 2 2015, 5:16 PM