[Task] Only query constraints table once per property
Closed, ResolvedPublic

Description

Currently, we query the constraints table for the constraints on a property every time we check constraints for that property. If an entity has multiple statements with the same property, or if an API request checks several entities at once, this can result in several identical, redundant queries. To optimize constraint checking, we should only query the constraints once per property.

Event Timeline

Jonas.keutel raised the priority of this task from to Medium.
Jonas.keutel updated the task description.
Lydia_Pintscher moved this task from incoming to monitoring on the Wikidata board. Feb 16 2015, 5:24 PM
Tamslo set Security to None.
Tamslo moved this task from Backlog to WBQC Backlog on the Wikibase-Quality board. Jun 19 2015, 2:56 PM
Tamslo added a subscriber: Tamslo. Jun 19 2015, 3:42 PM

@Jonas.keutel: Could you clarify this? I don't get the description

Jonaskeutel renamed this task from Group statements to Group statements per property for performance reasons. Jun 20 2015, 8:32 AM
Jonaskeutel updated the task description.
Lydia_Pintscher renamed this task from Group statements per property for performance reasons to [Task] Group statements per property for performance reasons. Aug 17 2015, 4:10 PM

@Lucas_Werkmeister_WMDE Can you check if this task is still valid?

It’s still valid, but I think there are two fairly separate tasks here.

  • We currently query the database for constraints on a property every time we check constraints on a property, which can be several times per request if there are several statements of the same property on an item, or if an API request includes several items. (This currently happens in DelegatingConstraintChecker::checkStatement and ConstraintRepository::queryConstraintsForProperty.) I imagine we could introduce some caching ConstraintLookup.
  • Some constraints, like “has type” (type) or “has statement” (item), are really constraints on the item, not on the statement; if multiple statements all introduce the same constraint on the item (for example, parent, sibling, date of birth, etc. might all require type: human), it only needs to be checked once. This seems like a more fundamental change to me, and unrelated to the first task, which is purely technical.
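The first point could be sketched roughly as follows. This is only an illustration of the caching idea in Python, not the extension's actual code: the class name `CachingConstraintLookup`, the method `query_constraints_for_property`, the stub `query_database`, and the property IDs used are all assumptions made for the example.

```python
from typing import Callable, Dict, List


class CachingConstraintLookup:
    """Hypothetical wrapper that remembers constraints per property ID.

    `query` stands in for whatever actually reads the constraints table.
    """

    def __init__(self, query: Callable[[str], List[dict]]) -> None:
        self._query = query
        self._cache: Dict[str, List[dict]] = {}

    def query_constraints_for_property(self, property_id: str) -> List[dict]:
        # Only hit the underlying lookup the first time a property is seen.
        if property_id not in self._cache:
            self._cache[property_id] = self._query(property_id)
        return self._cache[property_id]


# Stub standing in for the database query; it records every call it receives.
calls: List[str] = []


def query_database(property_id: str) -> List[dict]:
    calls.append(property_id)
    return [{"property": property_id, "type": "example"}]


lookup = CachingConstraintLookup(query_database)

# Four statements across three checks of the same property and one other
# (illustrative property IDs).
for pid in ["P40", "P40", "P569", "P40"]:
    lookup.query_constraints_for_property(pid)

print(calls)  # ['P40', 'P569']
```

In this sketch the cache lives as long as the lookup object, so creating one lookup per request would give the "once per property per request" behaviour; anything longer-lived would need invalidation when constraints change.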
Lucas_Werkmeister_WMDE renamed this task from [Task] Group statements per property for performance reasons to [Task] Only query constraints table once per property. Edited Apr 24 2017, 12:27 PM
Lucas_Werkmeister_WMDE updated the task description.

I’ve created T163683 for the second point and edited this task to reflect the first point.

Change 349949 had a related patch set uploaded (by Lucas Werkmeister (WMDE)):
[mediawiki/extensions/WikibaseQualityConstraints@master] Cache constraints for property

https://gerrit.wikimedia.org/r/349949

Change 349949 merged by jenkins-bot:
[mediawiki/extensions/WikibaseQualityConstraints@master] Cache constraints for property

https://gerrit.wikimedia.org/r/349949

Lucas_Werkmeister_WMDE closed this task as Resolved. Apr 24 2017, 3:09 PM
Lucas_Werkmeister_WMDE claimed this task.