
[EPIC] Refine WDQS queries analysis
Closed, Resolved · Public

Description

The current analysis parses queries and extracts:

  • Operators (list, and map with usage counts)
  • Nodes (variables, URIs, literals, blank nodes), as a map with usage counts
  • Prefixes (map with usage counts)
  • Services (map with usage counts)
  • Wikidata names (URIs whose main value matches the regex "^[QP]\\d+$")
  • Expressions
  • Paths

The values used to identify operators, expressions, paths, or nodes are strings: either the detailed name (for operators or nodes, for instance) or the full printed form of the subtree (for paths or expressions, for instance).
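As an illustration of the "Wikidata names" extraction and the usage-count maps, here is a minimal Python sketch. It is not the code used by the actual analysis job: the helper names `local_name` and `count_wikidata_names` are hypothetical, and it assumes that the "main value" of a URI is the segment after the last `/` or `#`.

```python
import re
from collections import Counter

# Same pattern as quoted in the description, written as a Python raw string.
WIKIDATA_NAME = re.compile(r"^[QP]\d+$")


def local_name(uri: str) -> str:
    """Return the segment after the last '/' or '#'.

    Assumption: this segment is what the description calls the "main value".
    Surrounding angle brackets, if present, are stripped first.
    """
    return re.split(r"[/#]", uri.lstrip("<").rstrip(">"))[-1]


def count_wikidata_names(uris):
    """Build a map from Wikidata name (Q... or P...) to its number of uses."""
    names = (local_name(u) for u in uris)
    return Counter(n for n in names if WIKIDATA_NAME.match(n))


if __name__ == "__main__":
    sample = [
        "http://www.wikidata.org/entity/Q42",
        "http://www.wikidata.org/prop/direct/P31",
        "http://www.wikidata.org/entity/Q42",
        "http://www.w3.org/2000/01/rdf-schema#label",
    ]
    print(count_wikidata_names(sample))
```

The same `Counter` approach would apply to prefixes, services, or node names, since each of those is described as a map from a string identifier to its number of uses.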

One thing we badly miss in our analysis is triple-pattern-matching information: when a triple pattern is encountered, which form it takes (<? - P - O> or <S - P - ?>, for instance) and which defined values it embeds (URIs, literals, etc.). With that information we should be able to be more precise about how triple patterns are used in queries, and possibly get a better feel for which subgraphs are heavily used.
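To make the idea of a triple-pattern "form" concrete, here is a minimal Python sketch. It is only an illustration and not the implementation in the related patch: the function names are hypothetical, and it assumes each triple pattern is already available as a tuple of three term strings (subject, predicate, object).

```python
from collections import Counter

POSITIONS = ("S", "P", "O")


def is_unbound(term: str) -> bool:
    """True for SPARQL variables (?x or $x) and blank nodes (_:b).

    Blank nodes are counted as unbound because they behave like variables
    in SPARQL graph patterns.
    """
    return term.startswith(("?", "$", "_:"))


def pattern_form(triple):
    """Return the form of a triple pattern, e.g. '<? - P - O>'."""
    slots = ["?" if is_unbound(t) else p for t, p in zip(triple, POSITIONS)]
    return "<" + " - ".join(slots) + ">"


def defined_values(triple):
    """Return the defined (non-variable) terms of a pattern, keyed by position."""
    return {p: t for t, p in zip(triple, POSITIONS) if not is_unbound(t)}


if __name__ == "__main__":
    patterns = [
        ("?item", "wdt:P31", "wd:Q5"),
        ("wd:Q42", "wdt:P106", "?job"),
        ("?person", "wdt:P31", "wd:Q5"),
    ]
    # Usage count per form, and per (form, defined values) combination.
    print(Counter(pattern_form(t) for t in patterns))
    print(Counter((pattern_form(t), tuple(defined_values(t).items())) for t in patterns))
```

Counting by form alone shows which pattern shapes dominate. Counting by form together with the defined values points at the specific properties and entities, and so the subgraphs, that queries hit most often.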

Event Timeline

Restricted Application added a subscriber: Aklapper.

Change 684346 had a related patch set uploaded (by AKhatun; author: AKhatun):

[wikidata/query/rdf@master] Analyze sparql triple

https://gerrit.wikimedia.org/r/684346

MPhamWMF renamed this task from Refine WDQS queries analysis to [EPIC] Refine WDQS queries analysis. · Jun 24 2021, 1:39 PM
MPhamWMF moved this task from Analysis to Epics on the Wikidata-Query-Service board.