- It's not open source
- WMDE puts higher priority on making violations accessible through SPARQL/API (T214362), but that still depends on several tech tasks being completed (eg T201150)
So this task asks to reimplement (something like) KrBot using the new SPARQL/API access.
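As a rough illustration of what the SPARQL side of such a bot could look like, here is a minimal Python sketch that recomputes "single value" violations for one property directly from the public Wikidata Query Service. It does not use the violation data that T214362 is about, and it ignores separators and exceptions declared on the constraint. The function names, the English-only labels and the User-Agent string are assumptions made for the example, and the query may well time out on very large properties.

```python
import json
import re
import urllib.parse
import urllib.request

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"  # public Wikidata Query Service


def single_value_query(prop_id, limit=500):
    """Build a SPARQL query for items with more than one distinct value of prop_id.

    Simplification (assumption): separators and exceptions are ignored.
    """
    if not re.fullmatch(r"P[1-9]\d*", prop_id):
        raise ValueError(f"not a property ID: {prop_id!r}")
    return f"""
SELECT ?item (SAMPLE(?label) AS ?itemLabel)
       (GROUP_CONCAT(DISTINCT STR(?value); separator="|") AS ?values)
WHERE {{
  ?item wdt:{prop_id} ?value .
  OPTIONAL {{ ?item rdfs:label ?label . FILTER(LANG(?label) = "en") }}
}}
GROUP BY ?item
HAVING (COUNT(DISTINCT ?value) > 1)
LIMIT {limit}
""".strip()


def parse_rows(result):
    """Turn a SPARQL JSON result into simple dicts: item ID, label, list of values."""
    rows = []
    for b in result["results"]["bindings"]:
        rows.append({
            "item": b["item"]["value"].rsplit("/", 1)[-1],  # entity URI -> Qnnnn
            "label": b.get("itemLabel", {}).get("value", ""),  # may be missing
            "values": b["values"]["value"].split("|"),
        })
    return rows


def fetch_violations(prop_id):
    """Run the query against WDQS (network call; not executed on import)."""
    params = urllib.parse.urlencode(
        {"query": single_value_query(prop_id), "format": "json"}
    )
    req = urllib.request.Request(
        WDQS_ENDPOINT + "?" + params,
        # Hypothetical User-Agent; a real bot should identify itself properly.
        headers={"User-Agent": "constraint-report-sketch/0.1 (example)"},
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return parse_rows(json.load(resp))
```

Calling `fetch_violations("P2088")` would return one row per item with several CrunchBase values. Once violations are exposed through SPARQL/API, the query body could be swapped for that access path without changing the rest.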
KrBot generates violation reports like Wikidata:Database_reports/Constraint_violations/P2088 that are integrated into Property Discussion pages and are viewed as a core part of WD.
- These pages are the best way to work out data quality problems of specific props. Eg I'm now working through #Single_value%22_violations to remove stale or wrong CrunchBase identifiers
- Even when I can get all violation info with SPARQL, I'd prefer to work from a generated WD page because:
  - all the info is available at a glance,
  - it can be used by non-tech people (eg Getty Vocabulary Program editors will now use ULAN constraint violations to improve their own data)
  - I can use it to generate QS corrections.
- (There is Special:ConstraintReport, eg ConstraintReport/Q389336 shows some P2088 violations of that item, but big-data editors don't fix data problems item by item.)
- An improvement is needed: print the labels of WD items in addition to Qnnnn
- T201150#7351510 discusses potentially useful schedules for when to reprocess (though that's per-item, not per-property)
- A benefit of a SPARQL/API based bot is that violation pages can easily be refreshed on demand
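
To make the points above about labels and QS corrections more concrete, here is a minimal Python sketch of the output side. It assumes violation rows are already available as dicts with `item`, `label` and `values` keys (a hypothetical shape, not KrBot's), renders them as wikitext lines that show the label next to the Qnnnn, and emits removal commands in what I understand to be the QuickStatements v1 tab-separated syntax (a leading `-` removes a statement). All function names and the sample data are made up for illustration.

```python
def wikitext_line(row):
    """One list item: label plus Qnnnn, followed by the conflicting values."""
    label = row["label"] or "no English label"
    values = ", ".join(f"<code>{v}</code>" for v in row["values"])
    return f"* [[{row['item']}|{label} ({row['item']})]]: {values}"


def render_section(prop_id, rows):
    """Wikitext for one report section; could be regenerated on demand."""
    header = f'== "Single value" violations for {prop_id} =='
    return "\n".join([header] + [wikitext_line(r) for r in rows])


def quickstatements_removals(prop_id, removals):
    """QS v1 removal commands for (item_id, value) pairs picked by an editor.

    Assumption: string values are written in double quotes, fields are
    tab-separated, and a leading '-' means "remove this statement".
    """
    return "\n".join(
        f'-{item}\t{prop_id}\t"{value}"' for item, value in removals
    )


# Made-up sample data, only to show the shape of the output.
rows = [
    {"item": "Q389336", "label": "Example label",
     "values": ["example-old", "example-new"]},
]
page_text = render_section("P2088", rows)
commands = quickstatements_removals("P2088", [("Q389336", "example-old")])
```

Deciding which of the conflicting values is stale stays with the editor; the sketch only turns that decision into a batch, which is the part that is tedious to do item by item.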