I realized there can be a problem in the constraint reports that happens for example with all the entities with P449. You can see that the constraint report for Q13407659 detects a warning for the property P449 saying that Q2006738 isn't an instance or a subclass of Q15265344, but it is. You just have to follow the tree and go through it to see that it's an instance of Q1616075, a subclass of 15943455 and finally a subclass of Q15265344. I understand there has to be a limit in the depth of analysis to prevent performance problems, but if the problem is that the limit is reached I don't think the specified depth could be enough for many valid cases.
|mediawiki/extensions/WikibaseQualityConstraints : master||Inject+use SparqlHelper in TypeCheckerHelper|
|Resolved||Lucas_Werkmeister_WMDE||T169326 Type constraints still report false violations|
|Resolved||Lucas_Werkmeister_WMDE||T166379 Value type constraints show errors even when they shouldn't|
- Mentioned Here
- T169326: Type constraints still report false violations
rEBQC0408a71186e3: Fix type/subclass check limit
rEBQC6f24ded76ea4: Add test for DoS vector on TypeCheckerHelper
rEBQC7ac876abcd96: Make MAX_ENTITIES configurable and raise default
T164948: DoS attack vector in the WikibaseQualityConstraints extension
P279 404 and 500 error pages
We replaced the depth limit with a limit on total entities visited a few weeks ago (0408a71186e3); the ticket for it, T164948, isn’t public yet, but the test commit 6f24ded76ea4 has a longer explanation for the issue.
We noticed pretty soon that the new limit was too low and increased it in 7ac876abcd96; that change should be deployed soon, which will hopefully resolve this issue.
We’re also working on an alternative type checker that uses SPARQL, which will have no depth limit (it will be subject to the WDQS timeout, but that should rarely be an issue).
I think the new limit has been deployed (I see type checking spikes of 4 s and 7½ s in the last 12 h on Grafana), can you try it out? (Perhaps unswap the classes if you don’t have any other item where the test previously failed?)
@abian that would be possible, but I don’t want to add too many optimizations like that because eventually we would just be building a shittier, slower BlazeGraph :) the SPARQL change has already been merged, so for now I’d rather focus on integrating that into the regular “type” and “value type” checkers so that false negatives like these should no longer occur.
@Lucas_Werkmeister_WMDE I think it is working now, but I think there may be a problem still in how the value type constraint is checked (or with my understanding of what a type is). It still displays an error when the entity instead of being an instance of a given entity is a subclass of a given entity, anyway, in this case I don't know if the problem is with the constraint itself or with my understanding, but for me if a cars are a subclass of vehicles they are still vehicles (as well as all the instances of the car class).
One example of this behavior is Q1616075, which from my understanding it shouldn't be triggering an error for the Type constraint due to being a subclass of a subclass of organization.
Anyway if it's just a problem of my understanding of what a type is feel free to close the task as resolved.
@Agabi10 I think that’s working as intended – “instance of” and “subclass of” are different things and, in general, shouldn’t be used interchangeably. If some people think that “instance of” should be used and add a constraint to reflect that, and other people think that “subclass of” should be used and add statements with that property, then those people should get in touch and have a discussion about the discrepancy. If the constraint violations make such disagreements more obvious, and lead to more such discussions, that’s a good thing in my opinion :)
In the case of Q1616075, the constraint violation is certainly technically correct, and looking at P452 – “industry of company or organization”, “Wikidata property for items about organizations” – I would say that it’s also appropriate: the property seems intended only for individual organizations, not classes of them.