
Provide access to image sizes from WDQS SPARQL
Open, Low, Public

Description

A user has posted in Wikidata Project Chat that it would be useful to be able to access the sizes of images that are the objects of image-valued properties such as P18 (image), for example to identify particularly small ones for quality-control purposes.

Clearly, the ultimate resolution of this problem is likely to depend on the final design of Structured Data for Commons; but in the meantime, perhaps it would be possible to create a SERVICE, invoked at a similar stage in the query to the label look-up service, that could retrieve such information via the MediaWiki API (T148245: Explore making WDQS access mediawiki API).

Event Timeline

Restricted Application added a subscriber: Aklapper.

(Got the diff wrong, but the next paragraph shows why Alexmar983 would find this useful).

Images are not separate entities right now, just links. If we want to add information about them, we can (as information about the entity identified by the Commons URL), but I think we'd need to look into existing ontologies. I'm pretty sure somebody has already used RDF to capture media information, so we don't need to reinvent the wheel here.

So as not to create a separate task just to ask a question: if this is possible, do I understand correctly that support for things like the sizes of pages behind sitelinks, and access to their other metadata, could be added too?

do I understand correctly that support for things like the sizes of pages behind sitelinks, and access to their other metadata, could be added too?

It is technically easy to add anything that is in page_props (on Wikidata), and slightly harder, but still not too hard, to add any information that is associated with the page entity and immediately available when exporting to RDF. Adding information about other pages (e.g. pages on another wiki, not Wikidata) is harder because we'd have to get the info from a different wiki, which is a whole different can of worms.

@Abit is this something that is now supported on Commons via Structured Data?

You can also do this via MWAPI now:

SELECT * WHERE {
  BIND(wd:Q42 AS ?item)
  ?item wdt:P18 ?image.
  # Strip the Special:FilePath prefix and percent-decode the rest to get the bare file name.
  BIND(STRAFTER(wikibase:decodeUri(STR(?image)), "http://commons.wikimedia.org/wiki/Special:FilePath/") AS ?fileTitle)

  SERVICE wikibase:mwapi {
    bd:serviceParam wikibase:endpoint "commons.wikimedia.org";
                    wikibase:api "Generator";
                    wikibase:limit "once";
                    mwapi:generator "allpages";
                    mwapi:gapfrom ?fileTitle;
                    mwapi:gapnamespace 6; # NS_FILE
                    mwapi:gaplimit 1;
                    mwapi:prop "imageinfo";
                    mwapi:iiprop "dimensions".
    # Pull the file size and the pixel dimensions out of the API response.
    ?size wikibase:apiOutput "imageinfo/ii/@size".
    ?width wikibase:apiOutput "imageinfo/ii/@width".
    ?height wikibase:apiOutput "imageinfo/ii/@height".
  }
}

Though it requires one API call per file (it abuses the allpages generator since MWAPI doesn’t directly let you specify the titles parameter), so it’s not very efficient. (Depending on where you get the images from, you may be able to use a better generator for the API request.)
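For comparison, here is a rough sketch of what batching with the titles parameter could look like when calling the Commons API directly from a client, outside WDQS. It is only an illustration under stated assumptions: the helper names (file_title, build_request_urls, parse_sizes) are hypothetical, it assumes the image URIs have already been collected from a query like the one above, it uses iiprop=size with formatversion=2, and it assumes the usual limit of 50 titles per request for ordinary clients. It only builds the request URLs and parses an already-decoded JSON response; actually fetching the URLs is left out.

```python
from urllib.parse import unquote, urlencode

API_ENDPOINT = "https://commons.wikimedia.org/w/api.php"
FILEPATH_PREFIX = "http://commons.wikimedia.org/wiki/Special:FilePath/"
MAX_TITLES_PER_REQUEST = 50  # assumption: usual limit for non-bot clients


def file_title(image_uri):
    """Turn an image value URI (as returned by WDQS) into a 'File:...' title."""
    if not image_uri.startswith(FILEPATH_PREFIX):
        raise ValueError("not a Special:FilePath URI: " + image_uri)
    return "File:" + unquote(image_uri[len(FILEPATH_PREFIX):])


def build_request_urls(image_uris):
    """Build one API URL per batch of titles instead of one per file."""
    titles = [file_title(uri) for uri in image_uris]
    urls = []
    for start in range(0, len(titles), MAX_TITLES_PER_REQUEST):
        batch = titles[start:start + MAX_TITLES_PER_REQUEST]
        params = {
            "action": "query",
            "format": "json",
            "formatversion": "2",
            "prop": "imageinfo",
            "iiprop": "size",  # returns size (bytes), width and height
            "titles": "|".join(batch),
        }
        urls.append(API_ENDPOINT + "?" + urlencode(params))
    return urls


def parse_sizes(response):
    """Map page title -> (width, height, size in bytes) from a decoded response."""
    sizes = {}
    for page in response.get("query", {}).get("pages", []):
        info = page.get("imageinfo")
        if info:  # missing files have no imageinfo entry
            first = info[0]
            sizes[page["title"]] = (first["width"], first["height"], first["size"])
    return sizes
```

Note that the API may normalize titles (for example, underscores become spaces), so the keys returned by parse_sizes are the normalized titles and may not match the input strings exactly.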