Constraints are an integral part of keeping Wikidata's data quality high. We need to understand them better in order to see how they can be improved further.
@abian is looking into how constraints are defined and used right now on Wikidata: https://meta.wikimedia.org/wiki/Grants:Project/Rapid/Abi%C3%A1n/Study_on_Wikidata_property_constraints
As part of this task we want to better understand constraint violations, that is, the cases where data does not conform to the constraint definition. Among other things, we want to learn more about:
- How many violations are there? How does that number develop over time?
- Are violations clustered around certain datatypes or properties?
- Do violations get fixed? By hand or through mass edits?
- What percentage of violations are false alarms (e.g. exceptions or a bad constraint definition), and what percentage really should be fixed (e.g. a genuine error in the data)?
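
To make the questions above more concrete, here is a minimal sketch of the kind of tallies they call for. It assumes violations are already available as simple (date, property, status) records. That record shape, the status labels, and the sample rows are all made up for illustration and do not reflect any existing dump or API.

```python
from collections import Counter

# Hypothetical records: (date, property ID, status), where status is one of
# "open", "fixed", or "false_alarm". None of these rows are real data.
SAMPLE = [
    ("2018-05-01", "P569", "open"),
    ("2018-05-01", "P569", "fixed"),
    ("2018-05-01", "P18", "false_alarm"),
    ("2018-06-01", "P569", "open"),
]


def summarize(records):
    """Tally violations per day and per property, plus the false-alarm share."""
    records = list(records)
    # How many violations are there on each date (development over time)?
    per_day = Counter(day for day, _, _ in records)
    # Are violations clustered around certain properties?
    per_property = Counter(prop for _, prop, _ in records)
    # What fraction of violations are false alarms?
    false_alarms = sum(1 for _, _, status in records if status == "false_alarm")
    share = false_alarms / len(records) if records else 0.0
    return per_day, per_property, share


if __name__ == "__main__":
    per_day, per_property, share = summarize(SAMPLE)
    print(per_day.most_common())
    print(per_property.most_common())
    print(f"false-alarm share: {share:.0%}")
```

Whether a violation counts as a false alarm would most likely need manual classification; the sketch only shows how the counts could be aggregated once such labels exist.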
Notes
- This will probably become easier once T192565 is resolved
- https://angryloki.github.io/wikidata-constraint-violations/ might be useful