
Analysis: Property usage by items' P31
Open, Needs Triage (Public)

Description

As a WDQS user, I don't want a large unrelated subgraph to affect the performance of my query when there is a shared property, so that my seemingly simple queries don't time out or take a long time to complete.

It would be useful to understand how properties are used across different content subgraphs (for instance humans, scholarly articles, etc.). This would help us understand how the performance of a property used in one query context can be affected by its usage in other contexts. For instance, queries using the main subject property (P921) for books could suffer because the same property is widely used on scholarly articles (a huge subgraph).
This analysis would use the P31 (instance of) values of items to cluster items into groups (possibly using P279, subclass of, to get better groupings), and then count property usage per group for further analysis.
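The counting step described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the planned implementation: the `ITEMS` dictionary is hypothetical toy data standing in for a real dump or query result, the item IDs are made up, and the helper names `property_usage_by_class` and `share_by_class` are invented for this example.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: item ID -> {property ID: [statement values]}.
# Q571 (book), Q13442814 (scholarly article), P921 (main subject) and
# P50 (author) are used for illustration; item IDs and values are made up.
ITEMS = {
    "book_1": {"P31": ["Q571"], "P921": ["topic_a"], "P50": ["author_x"]},
    "article_1": {"P31": ["Q13442814"], "P921": ["topic_a"]},
    "article_2": {"P31": ["Q13442814"], "P921": ["topic_a", "topic_b"]},
    "article_3": {"P31": ["Q13442814"], "P921": ["topic_c"], "P50": ["author_y"]},
}


def property_usage_by_class(items):
    """Count statements per property, grouped by each item's P31 values."""
    usage = defaultdict(Counter)
    for claims in items.values():
        # Items without P31 go into a catch-all group (an assumption).
        classes = claims.get("P31") or ["(no P31)"]
        for cls in classes:
            for prop, values in claims.items():
                if prop == "P31":
                    continue
                usage[cls][prop] += len(values)
    return usage


def share_by_class(usage, prop):
    """Return each group's fraction of the counted statements for `prop`."""
    total = sum(counts[prop] for counts in usage.values())
    if total == 0:
        return {}
    return {
        cls: counts[prop] / total
        for cls, counts in usage.items()
        if counts[prop]
    }


usage = property_usage_by_class(ITEMS)
print(share_by_class(usage, "P921"))
```

Note that an item with several P31 values is counted once per value in this sketch, so group totals can exceed the number of statements. Grouping via P279 would additionally require walking the subclass hierarchy, which is not shown here.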