
List of all constraint violations
Open, Normal, Public

Description

We want to provide a global list of constraint violations: a special page where you can ask for all violations (across all entities) of a certain constraint, or of all constraints of a certain property. (The latter case is similar to the current Wikidata:Database reports/Constraint violations created by KrBot.)

The idea is to implement this on top of T179849: Cache all constraint check results per-entity. We cache constraint violations in a database table, along with information about which constraint violations are recorded in each row, and the special page can then retrieve all violations of a given constraint from that table. Initially, this only includes results from entities where constraint checks were requested at some point, but we can probably address that later.
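To make the lookup side of this idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. Everything in it is an assumption for illustration only: the table name `constraint_violation`, its columns, the index names, the helper functions, and the example IDs are all hypothetical, and the real storage from T179849 may look quite different.

```python
import sqlite3


def create_violation_store(conn):
    """Create a hypothetical violations table, indexed for both lookups."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS constraint_violation (
            entity_id     TEXT NOT NULL,  -- entity with the violation
            property_id   TEXT NOT NULL,  -- property the constraint belongs to
            constraint_id TEXT NOT NULL   -- the violated constraint
        )
        """
    )
    # One index per query the special page would need.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS cv_constraint "
        "ON constraint_violation (constraint_id)"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS cv_property "
        "ON constraint_violation (property_id)"
    )


def record_violation(conn, entity_id, property_id, constraint_id):
    """Store one violation found while checking an entity."""
    conn.execute(
        "INSERT INTO constraint_violation VALUES (?, ?, ?)",
        (entity_id, property_id, constraint_id),
    )


def violations_of_constraint(conn, constraint_id):
    """All entities (across the whole table) violating one constraint."""
    rows = conn.execute(
        "SELECT entity_id FROM constraint_violation "
        "WHERE constraint_id = ? ORDER BY entity_id",
        (constraint_id,),
    )
    return [row[0] for row in rows]


def violations_of_property(conn, property_id):
    """All (entity, constraint) pairs for any constraint of one property."""
    rows = conn.execute(
        "SELECT entity_id, constraint_id FROM constraint_violation "
        "WHERE property_id = ? ORDER BY entity_id, constraint_id",
        (property_id,),
    )
    return list(rows)


# Made-up IDs, only to exercise the two queries.
conn = sqlite3.connect(":memory:")
create_violation_store(conn)
record_violation(conn, "Q1", "P10", "C1")
record_violation(conn, "Q2", "P10", "C2")
record_violation(conn, "Q3", "P20", "C3")
```

In this sketch the table would only hold rows for entities that have been checked at some point, which matches the limitation mentioned above.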


Old content of this ticket (to which Lydia’s comment is a response):

Do we want to provide a global list of constraint violations, i.e., something like the existing Wikidata:Database reports/Constraint violations created by KrBot?

If we want to do this, it will probably take a while before we can implement it, but I think this is a factor we need to consider when deciding on how to implement T179839: Cache constraint check results, so we should consider it now. In that case, we should probably store cached check results in the database instead of in memcached, and also index them on the necessary fields so we can find them later.

Event Timeline

Restricted Application added a project: Wikidata. · Nov 15 2017, 11:33 AM
Restricted Application added a subscriber: Aklapper.

Yes, that should be available at some point.

Okay, thanks!

But I’m already starting to second-guess myself on the connection of this task to T179839: Cache constraint check results :D because there’s a huge difference between caching constraint check results and storing constraint violations. If we want to use the same storage backend for them, that would mean storing some fifty or more¹ “compliance” results for every entity, which is probably way too much to be feasible.

¹ ~8.5 average statements per item × ~4 average constraints per property, plus some extra for future growth, the fact that more commonly used properties likely have more constraints than average, and qualifiers and references.
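As a quick check of the footnote's arithmetic, the snippet below computes the baseline from the two averages given above. The variable names are made up for illustration, and the "implied growth factor" is only what would be needed to get from the baseline to fifty; it is not a measured number.

```python
# Rough averages taken from the footnote above.
avg_statements_per_item = 8.5
avg_constraints_per_property = 4

# Baseline number of check results per item, before any of the extras
# (future growth, heavily constrained common properties, qualifiers, references).
baseline_results_per_item = avg_statements_per_item * avg_constraints_per_property

# Factor by which the extras would have to inflate the baseline to reach
# the "fifty or more" estimate.
implied_growth_factor = 50 / baseline_results_per_item

print(baseline_results_per_item, round(implied_growth_factor, 2))
```

So the two averages alone give about 34 results per item, and the remaining gap to fifty is covered by the extras listed in the footnote.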

Lucas_Werkmeister_WMDE renamed this task from "List of all constraint violations?" to "List of all constraint violations". · Nov 20 2017, 5:07 PM
Lucas_Werkmeister_WMDE updated the task description.
thiemowmde triaged this task as Normal priority. · Dec 11 2017, 10:01 AM