Show all columns in the result
Open, Medium, Public

Description

As an editor I want to see all values that influence my query result in order to understand what ends up in my query result.

Problem:
We currently only show the Item ID and Label for the resulting Item in the query results view, no additional information. We should show all Property values that are a part of the query.

Example:

  • When you query for instance of human and occupation poet and has any birthdate -> show columns for Item ID, Label, occupation and birthdate

BDD
GIVEN
AND
WHEN
AND
THEN
AND

Acceptance criteria:

  • A column is added to the query result for each Property that is part of the query

Notes:

  • We are not grouping Items that are matching for multiple OR conditions for now.
  • Removal of columns that have the same value for all results will happen in T271296.

Event Timeline

Restricted Application added a subscriber: Aklapper.
Lydia_Pintscher set the point value for this task to 5.
Lydia_Pintscher moved this task from Incoming to Ready to pick up on the Wikidata Query Builder board.

Example query for literal value. Searches for postal code === 6200:

SELECT DISTINCT ?item ?itemLabel ?value0 WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
  {
    SELECT DISTINCT ?item ?value0 WHERE {
      VALUES ?value0 {
        "6200"
      }
      ?item p:P281 ?statement0.
      ?statement0 (ps:P281) ?value0.  
    }
    LIMIT 100
  }
}

Example Query for any value:

SELECT DISTINCT ?item ?itemLabel ?value0 WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
  {
    SELECT DISTINCT ?item ?value0 WHERE {
      ?item p:P281 ?statement0.
      ?statement0 (ps:P281) ?value0.  
    }
    LIMIT 100
  }
}

The VALUES solution does not adapt easily to queries with multiple conditions, so the team has decided to use BIND instead:

Example searching for items with postal code === 6200 and country === Belgium (Q31):

SELECT DISTINCT ?item ?P281_0 ?P17_1 WHERE {
  BIND("6200" AS ?P281_0)
  ?item p:P281 ?statement0.
  ?statement0 (ps:P281) ?P281_0.
  BIND(wd:Q31 AS ?P17_1)
  ?item p:P17 ?statement1.
  ?statement1 (ps:P17/(wdt:P279*)) ?P17_1.
}
LIMIT 3
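
For reference, here is a rough sketch of how the example from the task description (instance of human, occupation poet, any birthdate) might look with the same BIND pattern. The IDs (P31, Q5, P106, Q49757, P569) and the variable names are my assumptions and should be double-checked. It also adds a column for instance of, following the acceptance criteria rather than the shorter list in the example, and it leaves out the subclass traversal used for country above. This is not the output of the patch.

```sparql
SELECT DISTINCT ?item ?itemLabel ?P31_0 ?P106_1 ?P569_2 WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
  {
    SELECT DISTINCT ?item ?P31_0 ?P106_1 ?P569_2 WHERE {
      # instance of (P31) human (Q5); IDs assumed
      BIND(wd:Q5 AS ?P31_0)
      ?item p:P31 ?statement0.
      ?statement0 (ps:P31) ?P31_0.
      # occupation (P106) poet (Q49757); IDs assumed
      BIND(wd:Q49757 AS ?P106_1)
      ?item p:P106 ?statement1.
      ?statement1 (ps:P106) ?P106_1.
      # date of birth (P569), any value: no BIND, the variable stays free
      ?item p:P569 ?statement2.
      ?statement2 (ps:P569) ?P569_2.
    }
    LIMIT 100
  }
}
```

The "any value" condition needs no BIND, so its column is filled from whatever the statement holds, while the two fixed conditions show the same value on every row.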

Subqueries could be used to search for items 'not matching'

We still need to find out if subqueries are supported by sparqljs

SELECT ?item ?itemLabel ?instance WITH {
  SELECT DISTINCT ?item ?instance WHERE {
    ?item p:P281 ?statement0.
    ?statement0 (ps:P281) ?instance.
    MINUS { ?item (p:P281/ps:P281) "1350". }
  }
  LIMIT 100
}
AS %results WHERE {
  INCLUDE %results.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
}
LIMIT 100

The current solution for 'not matching' queries is causing timeouts. See bug report

Change 677597 had a related patch set uploaded (by Guergana Tzatchkova; author: Guergana Tzatchkova):

[wikidata/query-builder@master] Show all columns in the result

https://gerrit.wikimedia.org/r/677597

We still need to find out if subqueries are supported by sparqljs

Regular subqueries (standard SPARQL) are supported; named subqueries (a Blazegraph extension) aren’t. See SPARQL.js#43.
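
A sketch of how the 'not matching' example above could be written with a regular subquery instead of the named one (WITH ... AS %results / INCLUDE). It assumes the same property and value as that example. I have not checked whether it behaves any differently with respect to the timeouts.

```sparql
SELECT ?item ?itemLabel ?instance WHERE {
  # Regular SPARQL 1.1 subquery in place of the named subquery
  {
    SELECT DISTINCT ?item ?instance WHERE {
      ?item p:P281 ?statement0.
      ?statement0 (ps:P281) ?instance.
      # Exclude items that have postal code "1350"
      MINUS { ?item (p:P281/ps:P281) "1350". }
    }
    LIMIT 100
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
}
LIMIT 100
```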

Example query for union (OR) of two conditions with different properties:

SELECT DISTINCT ?item ?P281 ?P17 WHERE {
  {
    BIND("10777" AS ?P281)
    ?item p:P281 ?statement0.
    ?statement0 (ps:P281) "10777".
  }
  UNION
  {
    BIND(wd:Q147 AS ?P17)
    ?item p:P17 ?statement1.
    ?statement1 (ps:P17) ?P17.
  }
  OPTIONAL { ?item (p:P281/ps:P281) ?P281 . }
  OPTIONAL { ?item (p:P17/ps:P17) ?P17 . }
}
LIMIT 4

Example query for union (OR) of two conditions with the same property:

SELECT DISTINCT ?item ?P281 WHERE {
  {
    BIND("10777" AS ?P281)
    ?item p:P281 ?statement0.
    ?statement0 (ps:P281) ?P281.
  }
  UNION
  {
    BIND("10317" AS ?P281)
    ?item p:P281 ?statement0.
    ?statement0 (ps:P281) ?P281.
  }
  OPTIONAL { ?item (p:P281/ps:P281) ?P281 . }
}
LIMIT 50

Change 678562 had a related patch set uploaded (by Guergana Tzatchkova; author: Guergana Tzatchkova):

[wikidata/query-builder@master] Show all columns in the result

https://gerrit.wikimedia.org/r/678562

The real patch is the original one. This is a mistake.

https://gerrit.wikimedia.org/r/c/wikidata/query-builder/+/677597/

Change 678562 abandoned by Guergana Tzatchkova:

[wikidata/query-builder@master] Show all columns in the result

Reason:

duplicate of I08c1887380c2a1d8764613e65a126604cb48aac0

https://gerrit.wikimedia.org/r/678562

This was picked up in storytime for camp with the estimation removed.
In the TODO column for now as the state in story time was not known (should this be in doing or review etc?)
Waiting for @guergana.tzatchkova etc to be back :)

The patch should be merged since it's getting too big.

I would suggest either adding subsequent patches once the current one is merged, or breaking the rest down into the remaining tasks:

  • Fixing the optionals for unions (conditions joined by OR) when there are more than two conditions.
  • Adjusting the variable names when there are two columns with the same property. Right now they are indexed by property index (with three conditions, one of them repeating, it would be something like P17_0, P281_1, P17_2), but in some(?) cases they need to be unified to share one variable name, e.g. P17_0 (serving both P17_0 and P17_2) and P281_1.
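
To make the two remaining tasks more concrete, here is a sketch of what a three-condition OR query with a repeated property might look like once the variable names are unified. It assumes the repeated property shares a single variable and gets a single OPTIONAL. The country values (Q31 for Belgium, Q183 for Germany) and the postal code are illustrative only. This is not what the current patch generates.

```sparql
SELECT DISTINCT ?item ?P17 ?P281 WHERE {
  {
    # Condition 0: country = Belgium (Q31, assumed ID)
    BIND(wd:Q31 AS ?P17)
    ?item p:P17 ?statement0.
    ?statement0 (ps:P17) ?P17.
  }
  UNION
  {
    # Condition 1: postal code = "10777"
    BIND("10777" AS ?P281)
    ?item p:P281 ?statement1.
    ?statement1 (ps:P281) ?P281.
  }
  UNION
  {
    # Condition 2: country = Germany (Q183, assumed ID), reusing ?P17
    BIND(wd:Q183 AS ?P17)
    ?item p:P17 ?statement2.
    ?statement2 (ps:P17) ?P17.
  }
  # One OPTIONAL per distinct property, not one per condition
  OPTIONAL { ?item (p:P17/ps:P17) ?P17. }
  OPTIONAL { ?item (p:P281/ps:P281) ?P281. }
}
LIMIT 50
```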

I think separate tickets could be good so that they can come into the campsite through the polish & story time process.

@Lydia_Pintscher after some discussion we decided this ticket should go back to Storytime for more clarifications.

Questions we have so far:

  • Should the columns be named after the property label or the property id? (showing the label is more work)
  • What is the limit (if any) of columns we want to show in the result? The UI has no limitation of how many conditions can be added.
  • Should we display a column for the "without" case? If yes, should it indicate it's for a "without" property so it's not confusing for the user which is "with" and which is "without"?
  • We are assuming when there are two occurrences of the same property we combine them in one column. Is that correct?
  • How should we combine properties when we have a complex query with a few conditions, e.g. "and", "and", "or" with a repeated property? Can we have some examples?

All very good questions. @amy_rc and I need to think about this more.