The WDQS SPARQL service is fantastic, but it is a shame that it seems to sit in such a separate silo from the category information that is used so heavily on Wikipedia (including for maintenance, and to mark topics of interest to particular WikiProjects).
Yes, there are tools like Magnus's PetScan, which allow one to combine the results of searches in the two silos.
But it would be nice if one could access category information more directly, from SPARQL itself.
Something I think would be a great addition to WDQS would be a SERVICE that takes a category name (or perhaps a Wikidata category item, or a list of them) and a Wikipedia language code, and replaces the call with a VALUES list of all the items in that category on that Wikipedia.
I could imagine this running like a preprocessor directive, doing a straight substitution before the query is passed on to Blazegraph (or at whatever stage would be the most efficient for passing in such a list).
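To make the preprocessor idea concrete, here is a minimal Python sketch of the substitution step. Everything in it is an assumption for illustration: the `#CATEGORY_MEMBERS` directive syntax is invented, `expand_category_directives` and `fake_lookup` are hypothetical names, and the lookup is a hard-coded stand-in for whatever would really fetch category members. It is not how WDQS works today.

```python
import re

# Hypothetical directive syntax, invented for this sketch (not a WDQS feature):
#   #CATEGORY_MEMBERS ?variable "Category name" "language code"
DIRECTIVE = re.compile(
    r'^([ \t]*)#CATEGORY_MEMBERS[ \t]+(\?\w+)[ \t]+"([^"]+)"[ \t]+"([^"]+)"[ \t]*$',
    re.MULTILINE,
)


def expand_category_directives(query, lookup):
    """Replace each directive line with a VALUES clause built from lookup()."""

    def substitute(match):
        indent, variable, category, lang = match.groups()
        qids = lookup(category, lang)
        values = " ".join(f"wd:{qid}" for qid in qids)
        return f"{indent}VALUES {variable} {{ {values} }}"

    return DIRECTIVE.sub(substitute, query)


def fake_lookup(category, lang):
    # Stand-in for a real category-membership lookup; the IDs are placeholders.
    sample = {("Physicists", "en"): ["Q937", "Q7186"]}
    return sample.get((category, lang), [])


query = """SELECT ?item WHERE {
  #CATEGORY_MEMBERS ?item "Physicists" "en"
  ?item wdt:P31 wd:Q5 .
}"""

# The directive line is rewritten; the rest of the query is left untouched.
expanded = expand_category_directives(query, fake_lookup)
print(expanded)
```

Doing the substitution on the query text keeps the query engine itself unchanged, at the cost of potentially very long VALUES lists for large categories.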
Refinements could include also looking at the categorisation of talk pages (since e.g. on en-wiki it is the talk pages that are categorised to show the interest of a particular WikiProject in a page), or allowing a specified recursion depth.
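The recursion-depth refinement could look something like the following Python sketch: a breadth-first walk over subcategories that stops at a given depth and guards against category loops. The function name `members_with_depth` and the toy category data are made up for illustration; a real implementation would read the wiki's category links instead.

```python
from collections import deque


def members_with_depth(root, subcategories, pages, max_depth):
    """Collect pages in `root` and its subcategories, down to `max_depth` levels."""
    seen = {root}              # categories already queued, to survive category loops
    queue = deque([(root, 0)])
    found = set()
    while queue:
        category, depth = queue.popleft()
        found.update(pages.get(category, ()))
        if depth == max_depth:
            continue           # do not descend below the requested depth
        for child in subcategories.get(category, ()):
            if child not in seen:
                seen.add(child)
                queue.append((child, depth + 1))
    return found


# Toy data (hypothetical): B is a subcategory of A, and B loops back to A.
subcategories = {"A": ["B"], "B": ["C", "A"], "C": []}
pages = {"A": ["page_a"], "B": ["page_b"], "C": ["page_c"]}

print(sorted(members_with_depth("A", subcategories, pages, 0)))
print(sorted(members_with_depth("A", subcategories, pages, 2)))
```

A depth of 0 means "this category only", which matches the unrefined behaviour described above.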
Other services of a similar nature that I believe would also be useful:
- a SERVICE to return the items for all pages that have a particular template transcluded on them in a particular wiki
- a SERVICE to return a graph of all of the categorisations for a particular item across the different wikis, e.g. ?item some_prefix:has_categorisation ?cat . ?cat some_prefix:category ?item2 . ?cat schema:isPartOf ?wiki
The latter might be used, e.g., as part of a query to try to identify categorisations that cannot (yet) be 'explained' by the properties currently on the item.
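As a rough illustration of the graph shape suggested for the categorisation service, this Python sketch emits the three triples per categorisation using the same placeholder predicates as above. The function name, the blank-node labels and the sample identifiers are all hypothetical; only the triple pattern itself comes from the proposal.

```python
def categorisation_triples(item, categorisations):
    """Yield (subject, predicate, object) triples for one item.

    `categorisations` is a list of (category_item, wiki) pairs.
    """
    for index, (category_item, wiki) in enumerate(categorisations):
        node = f"_:cat{index}"  # one intermediate node per categorisation
        yield (item, "some_prefix:has_categorisation", node)
        yield (node, "some_prefix:category", category_item)
        yield (node, "schema:isPartOf", wiki)


# Placeholder identifiers, for illustration only.
triples = list(
    categorisation_triples(
        "wd:ITEM",
        [
            ("wd:CATEGORY_A", "<https://en.wikipedia.org/>"),
            ("wd:CATEGORY_B", "<https://de.wikipedia.org/>"),
        ],
    )
)

for subject, predicate, obj in triples:
    print(subject, predicate, obj, ".")
```

Keeping the wiki on the intermediate node, rather than on the category item, lets the same category item appear once per wiki in which the page is so categorised.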