HAVING in named subquery results in “non-aggregate variable in select expression” error
Closed, Resolved · Public

Description

Reduced query (not meaningful):

SELECT ?x WITH {
  SELECT ?x (COUNT(*) AS ?count) WHERE {
    ?x a schema:Dataset.
  }
  GROUP BY ?x
  HAVING(?count > 0)
} AS %x WHERE {
  INCLUDE %x.
}

java.lang.IllegalArgumentException: Non-aggregate variable in select expression: x

If you remove the HAVING clause, the query is accepted.

Phrased as a non-named subquery, this works without problems:

SELECT ?x WHERE {
  {
    SELECT ?x (COUNT(*) AS ?count) WHERE {
      ?x a schema:Dataset.
    }
    GROUP BY ?x
    HAVING(?count > 0)
  }
}

It seems the subquery has to specify a triple to trigger the bug; I was unable to reproduce it with a VALUES clause instead. (The triple can also be ?x ?y ?z to make the query completely independent of the database, but then you’ll want a LIMIT on the subquery.)
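
For reference, the database-independent variant mentioned above might look like the following sketch. The placement of the LIMIT (at the end of the named subquery, after HAVING) and its value are assumptions, since the description does not spell them out, and this form has not been re-run against the query service here.

```sparql
# Hypothetical database-independent variant of the reduced query.
# Assumption: the LIMIT sits at the end of the named subquery, after HAVING;
# the value 10 is arbitrary.
SELECT ?x WITH {
  SELECT ?x (COUNT(*) AS ?count) WHERE {
    ?x ?y ?z.
  }
  GROUP BY ?x
  HAVING(?count > 0)
  LIMIT 10
} AS %x WHERE {
  INCLUDE %x.
}
```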

Here’s the original query where I found the bug:

# works with most citations
# runtime: 25-45 s
SELECT ?work ?workLabel ?count WITH {
  SELECT ?work (COUNT(*) AS ?count) WHERE {
    ?work wdt:P2860 [].
  }
  GROUP BY ?work
  HAVING(?count > 100)
  ORDER BY DESC(?count)
  LIMIT 100
} AS %works WHERE {
  INCLUDE %works.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY DESC(?count)

This is probably an upstream issue, but Blazegraph’s Jira seems to be down right now and I don’t have an account there anyway.

Event Timeline

Restricted Application added a project: Discovery. · May 17 2017, 8:50 AM
Restricted Application added a subscriber: Aklapper.
Lydia_Pintscher moved this task from incoming to monitoring on the Wikidata board. · Jun 11 2017, 5:52 PM
Smalyshev triaged this task as Medium priority. · Jun 23 2017, 8:13 PM
Smalyshev added a project: Upstream.

Change 533107 had a related patch set uploaded (by Igor Kim; owner: Igor Kim):
[wikidata/query/blazegraph@master] Fix non-aggregate variable in HAVING of named subquery

https://gerrit.wikimedia.org/r/533107

Change 533107 merged by Smalyshev:
[wikidata/query/blazegraph@master] Fix non-aggregate variable in HAVING of named subquery

https://gerrit.wikimedia.org/r/533107

debt closed this task as Resolved. · Sep 5 2019, 6:45 PM
debt claimed this task.
Lucas_Werkmeister_WMDE reopened this task as Open. · Oct 3 2019, 3:42 PM

I can still reproduce this with the exact query given in the task description (and with this other one, for what it’s worth).

debt removed debt as the assignee of this task. · Mar 2 2020, 6:43 PM
debt added a subscriber: debt.
dcausse closed this task as Resolved. · Mar 4 2020, 1:46 PM
dcausse assigned this task to Igorkim78.
dcausse added a subscriber: dcausse.

Tentatively closing: the example provided in T165559#5544268 seems to work.