SPARQL queries involving sitelinks are very slow, to the point that it is often impossible to write one without WDQS timing out.
For example, this query, which tries to count the Wikidata category items that have Commons sitelinks, does not complete:
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX schema: <http://schema.org/>
SELECT (COUNT(DISTINCT ?sitelink) AS ?count) WHERE {
?item wdt:P31 wd:Q4167836 .
?sitelink schema:about ?item .
?sitelink schema:inLanguage "en" .
FILTER (STRSTARTS(str(?sitelink), "https://commons.wikimedia.org/")) .
}

In contrast, a similar-sized query that does not involve sitelinks completes without trouble:
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX schema: <http://schema.org/>
SELECT (COUNT(DISTINCT ?commonscat) AS ?count) WHERE {
?item wdt:P31 wd:Q4167836 .
?item wdt:P373 ?commonscat
}

It would seem that the issue could be resolved by adding new statements to the triplestore, of the form
?item wikibase:hasSitelinkTo wd:Q565
where, in this case, Q565 is the item for Wikimedia Commons.
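
To illustrate, here is a sketch of how the first query might look if such statements existed. The predicate wikibase:hasSitelinkTo is the hypothetical one proposed above, not something the query service provides, and the wikibase: prefix URI is an assumption. Because an item can have at most one sitelink per site, counting distinct items should give the same number as counting distinct sitelinks.

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
# Assumed prefix URI for the Wikibase ontology
PREFIX wikibase: <http://wikiba.se/ontology#>

SELECT (COUNT(DISTINCT ?item) AS ?count) WHERE {
  # Items that are instances of "Wikimedia category"
  ?item wdt:P31 wd:Q4167836 .
  # Hypothetical proposed predicate: the item has a sitelink to Wikimedia Commons (Q565)
  ?item wikibase:hasSitelinkTo wd:Q565 .
}
```

With this shape, the query becomes a plain join on two triple patterns, like the P373 query above, with no string FILTER over every sitelink URL.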