Currently we use autolist2 to get a bunch of items in a category, combine that with a WDQS query and work on the intersection. The category might give several hundred items, but the query matches everything with either P31 or P279, so the result is huge (for nlwiki about 1.8 million items). This makes it very slow and heavy, and it times out every once in a while. Take for example https://tools.wmflabs.org/autolist/?language=nl&project=wikipedia&category=Motorfietstechniek&depth=0&wdq=&pagepile=&wdqs=SELECT%20%3Fitem%20%0AWHERE%0A%7B%0A%09%3Fsitelink%20schema%3Aabout%20%3Fitem%20.%20%3Fsitelink%20schema%3AinLanguage%20%22nl%22%20%0A%20%20%20%20.%20%7B%20%3Fitem%20wdt%3AP31%20%3Fp31%20%7D%20UNION%20%7B%20%3Fitem%20wdt%3AP279%20%3F279%20%7D%0A%7D&statementlist=P&run=Run&mode_manual=or&mode_cat=and&mode_wdq=not&mode_wdqs=not&mode_find=or&chunk_size=10000
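For readability, the WDQS query embedded in that link decodes to:

SELECT ?item
WHERE
{
  ?sitelink schema:about ?item .
  ?sitelink schema:inLanguage "nl" .
  { ?item wdt:P31 ?p31 } UNION { ?item wdt:P279 ?279 }
}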
Getting pages in category tree... 263 pages found. Getting corresponding Wikidata items... 263 items found. Getting WDQS data... 1,871,517 items loaded. Combining datasets... After OR : 0 items. After AND : 263 items. After NOT : 251 items. 251 items in combination. Query took 118.65857410431 seconds. 0.5 MB memory used.
The other way around would be better: do a query to get all items that have a sitelink to some wiki but no statements. That query also times out. We discussed this on IRC, and one solution is to add a new triple that stores the number of statements on each item (and the number of sitelinks while we're at it). That way we can just query for that new triple. For that, http://wikiba.se/ontology-1.0.owl needs to be expanded.
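With such a triple the whole thing becomes a single cheap query. A rough sketch, assuming a hypothetical wikibase:statements predicate holding the per-item statement count (and similarly wikibase:sitelinks for the sitelink count); the actual predicate names would depend on how the ontology gets extended:

SELECT ?item
WHERE
{
  ?sitelink schema:about ?item .
  ?sitelink schema:inLanguage "nl" .
  ?item wikibase:statements 0 .          # hypothetical: number of statements on the item
  # ?item wikibase:sitelinks ?links .    # hypothetical: number of sitelinks, if we add that too
}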