
Constraint warnings not being shown on large Lexemes
Open, Needs Triage · Public

Description

This query lists lexemes with lexeme-level pronunciation audio statements. These should all trigger a constraint violation, because the property has a constraint saying that it's only allowed on items and forms.

The warnings are not being shown on the majority of the Basque lexemes, e.g.

They are being shown on lexemes in other languages:

There's one Basque lexeme where it does work:

The constraint is being triggered even when it's not shown:

The only common factor I can see is that the lexemes where it doesn't work all have a lot of forms. It seems there are multiple wbcheckconstraints requests when there are more than 50 entities to check (since it can only do 50 at a time) - perhaps the second one is overwriting the first?
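To illustrate why a large lexeme leads to more than one request, here is a minimal sketch of the batching, assuming the lexeme itself plus each form and each sense counts as one entity. `chunkIds` and the example IDs are hypothetical and are not taken from gadget.js; only the limit of 50 comes from the description above.

```javascript
// Hypothetical helper, not taken from gadget.js: split entity IDs into
// batches no larger than the per-request limit described above.
function chunkIds(ids, batchSize = 50) {
  const batches = [];
  for (let i = 0; i < ids.length; i += batchSize) {
    batches.push(ids.slice(i, i + batchSize));
  }
  return batches;
}

// Made-up lexeme with 60 forms and 5 senses: 1 + 60 + 5 = 66 entity IDs.
const ids = ['L1'];
for (let n = 1; n <= 60; n++) ids.push(`L1-F${n}`);
for (let n = 1; n <= 5; n++) ids.push(`L1-S${n}`);

// Two batches, so two wbcheckconstraints requests would be needed.
console.log(chunkIds(ids).map((batch) => batch.length)); // [ 50, 16 ]
```

Under this assumption, any lexeme with more than 50 entities ends up in the multi-request code path.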

Event Timeline

For what it is worth, I have definitely noticed that there is some limit on lexeme size beyond which constraint violations no longer appear. I typically do not expect them to appear on Hindustani or Punjabi verbs. Adding a gloss quote to the first sense of a lexeme usually results in a constraint violation, but if I add one now to this lexeme, which has over 50 senses, no constraint violation is shown (the lexeme is https://www.wikidata.org/wiki/Lexeme:L33485, but see the screenshot, since I do not want to actually leave this statement unreferenced).

image.png (358×1 px, 29 KB)

(To be honest, it did not occur to me to report this because if these violations did keep coming up it would significantly slow down the load time of the page.)

It seems like you are exactly right about the 50 entity check limit being related to this issue - this lexeme is just shy of 50 entities total with 6 senses and 38 forms, and still shows constraint violations. https://www.wikidata.org/wiki/Lexeme:L1092922

image.png (559×1 px, 43 KB)

I think there’s some weird interaction between $.when and jQuery deferreds going on in gadget.js’s _fullCheckAllIds / _aggregateMultipleWbcheckconstraintsResponses. When there’s only one API request (≤50 entities on the page), the arguments to the aggregate function look like:

{
  "0": { "wbcheckconstraints": {...}, "success": 1 },
  "1": { "readyState": 4, "getResponseHeader": ... }
}

where the first argument is apparently an API response (JSON-decoded) and the second argument is some kind of additional response data, which happens to get ignored due to how the method is implemented. But when there’s more than one API request, it instead looks like:

{
  "0": [ { "wbcheckconstraints": {...}, "success": 1 }, { "readyState": 4, "getResponseHeader": ... } ],
  "1": [ { "wbcheckconstraints": {...}, "success": 1 }, { "readyState": 4, "getResponseHeader": ... } ]
}

where we suddenly have an array of these responseData/responseMeta pairs. But the method expects each argument to be one response data, so in this case it doesn’t find any constraint violations (neither argument has a wbcheckconstraints member).
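One possible way to make the aggregation tolerate both argument shapes is sketched below. This is not the actual gadget.js code: `normalizeWhenArguments` and `mergeResults` are hypothetical names, the sample data is made up, and the sketch assumes that response data is always a plain object (never an array), so that an array argument can be taken to mean a data/meta pair.

```javascript
// Hypothetical sketch, not the real gadget.js implementation.
// Turn either argument shape into a flat list of response data objects.
function normalizeWhenArguments(args) {
  if (args.length === 0) {
    return [];
  }
  // Several requests: every argument is a [responseData, responseMeta] pair.
  if (Array.isArray(args[0])) {
    return args.map((pair) => pair[0]);
  }
  // One request: the arguments are (responseData, responseMeta).
  return [args[0]];
}

// Merge the per-entity results of all responses into one object.
function mergeResults(args) {
  const merged = {};
  for (const response of normalizeWhenArguments(args)) {
    Object.assign(merged, response.wbcheckconstraints);
  }
  return merged;
}

// Made-up data mimicking the two shapes shown above.
const meta = { readyState: 4 };
const single = [{ wbcheckconstraints: { L1: 'a' }, success: 1 }, meta];
const multiple = [
  [{ wbcheckconstraints: { L1: 'a' }, success: 1 }, meta],
  [{ wbcheckconstraints: { 'L1-F51': 'b' }, success: 1 }, meta],
];

console.log(Object.keys(mergeResults(single)));   // [ 'L1' ]
console.log(Object.keys(mergeResults(multiple))); // [ 'L1', 'L1-F51' ]
```

An alternative with the same effect would be to reduce each request's promise to its response data before combining the promises, so that the aggregate function only ever sees one value per request.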

Also, it’s kind of evil that _fullCheckAllIds() makes all the constraint check requests in parallel in the first place. I think we should rewrite this to do the requests sequentially (chain the promises after one another) and then check that this also resolves the buggy merging.
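The sequential approach could look roughly like the following sketch, which uses native promises rather than jQuery deferreds. `checkAllSequentially` and `fakeCheckBatch` are hypothetical names; `fakeCheckBatch` merely stands in for one wbcheckconstraints request and returns made-up data after a short timer.

```javascript
// Hypothetical sketch: run one request per batch, each starting only after
// the previous one has finished, and merge the results along the way.
function checkAllSequentially(batches, checkBatch) {
  return batches.reduce(
    (chain, batch) =>
      chain.then((merged) =>
        checkBatch(batch).then((response) =>
          Object.assign(merged, response.wbcheckconstraints)
        )
      ),
    Promise.resolve({})
  );
}

// Stand-in for a real API request; also records how many run at once.
let inFlight = 0;
let maxInFlight = 0;
function fakeCheckBatch(batch) {
  inFlight += 1;
  maxInFlight = Math.max(maxInFlight, inFlight);
  return new Promise((resolve) => {
    setTimeout(() => {
      inFlight -= 1;
      const results = {};
      for (const id of batch) results[id] = { checked: true };
      resolve({ wbcheckconstraints: results, success: 1 });
    }, 5);
  });
}

const done = checkAllSequentially([['L1', 'L1-F1'], ['L1-F2']], fakeCheckBatch);
done.then((merged) => {
  // All three entities are present, and at most one request ran at a time.
  console.log(Object.keys(merged), maxInFlight);
});
```

Because every step resolves to a single merged object, the aggregate step no longer depends on how many requests were made.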

Lydia_Pintscher subscribed.

Is this only happening on Basque Lexemes? That doesn't seem to make sense to me based on what Lucas wrote. Or am I missing something? Is the problem just that Basque Lexemes tend to be more complete and larger and therefore run into the issue?

I think it’s just due to the number of forms. E.g. eraso and tentetu, two Basque verbs, have fewer forms and show constraint violations normally. (eraso is currently affected by T344362, but that’s unrelated.)

Lydia_Pintscher renamed this task from Constraint warnings not being shown on many Basque lexemes to Constraint warnings not being shown on large Lexemes. Sep 8 2023, 1:52 PM
Arian_Bozorg subscribed.

It isn't very clear where the constraint violations aren't appearing, I can still see them on Hebrew lexemes:

I am going to remove it from the dev board for now

Mahir256 subscribed.

"It isn't very clear where the constraint violations aren't appearing"

Hebrew was mentioned nowhere in this ticket, and the comments from September 2023 seem to support the assertion that lexemes with more than 50 entities on them surface the problem (which none of the three links you provided has).

To provide some more recent examples, using P9970 as a property of inquiry (since it should appear only on senses and not on the top level of lexemes), https://www.wikidata.org/wiki/Lexeme:L1378410 is small enough that the allowed-entity-type constraint of that property is shown, whereas https://www.wikidata.org/wiki/Lexeme:L1379742 is large enough that the same constraint does not appear.