Many files on Wikimedia Commons that have structured data claims cannot be found in the Wikimedia Commons Query Service (WCQS).
One example is https://commons.wikimedia.org/wiki/File:Dux-Markt-1.jpg. Its concept URI is http://commons.wikimedia.org/entity/M47869727, and it has had structured data since 30 September 2020. Yet this query in WCQS returns no results:
SELECT * WHERE { sdc:M47869727 ?predicate ?object. }
This is not an isolated case. The following query fetches a set of images used on Wikidata and then tries to find the corresponding MediaInfo entities in WCQS:
SELECT (COUNT(DISTINCT ?image) AS ?images) (COUNT(DISTINCT ?file) AS ?files)
WITH {
  SELECT ?image ?contentUrl WHERE {
    SERVICE <https://query.wikidata.org/sparql> {
      ?item wdt:P31 wd:Q5153359 .
      ?item wdt:P18 ?image .
    }
    BIND (REPLACE(wikibase:decodeUri(SUBSTR(STR(?image), 52)), " ", "_") AS ?filename)
    BIND (MD5(?filename) AS ?MD5)
    BIND (URI(CONCAT("https://upload.wikimedia.org/wikipedia/commons/", SUBSTR(?MD5, 1, 1), "/", SUBSTR(?MD5, 1, 2), "/", ?filename)) AS ?contentUrl)
  }
} AS %get_some_images_from_Wikidata
WHERE {
  INCLUDE %get_some_images_from_Wikidata
  OPTIONAL { ?file schema:contentUrl ?contentUrl . }
}
The content URLs are constructed as described in https://www.mediawiki.org/wiki/Manual:$wgHashedUploadDirectory, and every URL I tested was correct. Nevertheless, the query gives this result:
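For anyone who wants to check individual content URLs outside SPARQL, here is a minimal Python sketch of the same construction the query performs: replace spaces with underscores, take the MD5 of the file name, and use the first one and first two hex digits as directory names. The function name `hashed_content_url` and the constant `UPLOAD_BASE` are made up for this illustration. Like the query, the sketch does not percent-encode the file name, which is an assumption about how the URLs are stored in WCQS.

```python
import hashlib

# Base of the Commons upload directory, as used in the query above.
UPLOAD_BASE = "https://upload.wikimedia.org/wikipedia/commons/"


def hashed_content_url(filename: str) -> str:
    """Build the hashed upload URL for a Commons file name (hypothetical helper)."""
    # Commons stores file names with underscores instead of spaces.
    name = filename.replace(" ", "_")
    # MD5 of the UTF-8 encoded name, as lowercase hex.
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    # Directory layout: <first hex digit>/<first two hex digits>/<name>
    return f"{UPLOAD_BASE}{digest[0]}/{digest[:2]}/{name}"


if __name__ == "__main__":
    print(hashed_content_url("Dux-Markt-1.jpg"))
```

The printed URL can be opened in a browser to confirm that it resolves to the file, which is how the URLs built by the query can be spot-checked.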
| images | files |
|---|---|
| 6328 | 1013 |
So only 1013 of the 6328 files used in these Wikidata claims (about 16%) can be found in WCQS.