Page MenuHomePhabricator

property_threshold is behaving in unexpected way
Closed, ResolvedPublic

Description

As reported by @99of9 :

it seems not to correctly handle |property_threshold=1, instead it returns 0% for anything with under 20 instances)

Here's a demonstration of the threshold issue though: https://www.wikidata.org/w/index.php... . When I require the group sizes to be over 2, even the rows about those which are size 3 change. If they're Property item count is under 3, that cell also gets set to 0.

Revisions and Commits

Event Timeline

JeanFred created this task.May 23 2019, 2:17 PM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptMay 23 2019, 2:17 PM
JeanFred moved this task from Backlog to Bugs on the Tool-inteGraality board.May 23 2019, 2:17 PM

It seems to using grouping_threshold .. I noticed that when I started using a fairly high one.

Susannaanas added a subscriber: Susannaanas.EditedJun 1 2019, 6:25 AM

If I set grouping threshold to 0 the update fails. It also complains about the language label. The property threshold was set to 1.

Something catastrophic happened when procesing page User:Susannaanas/GLAM_Stats_in_Finland.

Please report this on Phabricator.

<class 'KeyError'>

'en'
JeanFred triaged this task as High priority.Oct 30 2019, 7:42 PM

Mentioned in SAL (#wikimedia-cloud) [2019-10-30T19:42:36Z] <wm-bot> <jeanfred> Deploy latest from Git master: 289a41b, 02a1a4f, e8d4363 (T224226)

It seems to using grouping_threshold .. I noticed that when I started using a fairly high one.

Indeed, that was exactly what was happening − see e8d4363

To be honest, I don’t really like property_threshold : as far as I can tell it’s not going to improve performance, and with the output of « 0 (0%) » there is no way of telling whether this is an actual zero, or just below the threshold (if I could I would put an N/A when it’s "ignore" but I’m not sure I can.

I don’t really like property_threshold ... with the output of « 0 (0%) » there is no way of telling whether this is an actual zero

I agree - I cannot imagine a situation where I would need this.

Mentioned in SAL (#wikimedia-cloud) [2019-10-31T09:06:07Z] <wm-bot> <jeanfred> Deploy latest from Git master: 46484f5 (T224226)

I don’t really like property_threshold ... with the output of « 0 (0%) » there is no way of telling whether this is an actual zero

I agree - I cannot imagine a situation where I would need this.

I went ahead and default this value to zero − if people have overridden it then it will continue working. The previous default of 10 is fine with a grouping of 20 (which is the default), but it seems to me that users often will set grouping_threshold very low (as in zero), in which case the result is surprising and even misleading.

@Multichill Could you remind me of the rationale for property_threshold?

Hi Jean-Fred, Integraality is an awesome tool and I'm enjoying using it. I also can't see a need for property_threshold, so I set it to 1, meaning I want numbers to be shown if there is at least one item with that property- that's what a threshold of 1 would mean to me. Instead, it shows cells with at least 2 and the cells with 1 show as 0.0%. This is easy to fix for my case by setting the value to 0, but I just wanted to highlight that, when it is used, the current behaviour is off by one from what I think is the natural expectation.

JeanFred renamed this task from grouping_threshold is behaving in unexpected way to property_threshold is behaving in unexpected way.Nov 2 2019, 2:31 PM

Mentioned in SAL (#wikimedia-cloud) [2019-11-10T15:31:52Z] <wm-bot> <jeanfred> Deploy latest from Git master: ab714de (T224226)

JeanFred added a comment.EditedNov 10 2019, 3:33 PM

Hi Jean-Fred, Integraality is an awesome tool and I'm enjoying using it. I also can't see a need for property_threshold, so I set it to 1, meaning I want numbers to be shown if there is at least one item with that property- that's what a threshold of 1 would mean to me. Instead, it shows cells with at least 2 and the cells with 1 show as 0.0%. This is easy to fix for my case by setting the value to 0, but I just wanted to highlight that, when it is used, the current behaviour is off by one from what I think is the natural expectation.

Thanks for the report @MartinPoulter − I agree that an inclusive comparison is the natural expectation here. Went ahead and made the change.

JeanFred closed this task as Resolved.Dec 2 2019, 3:34 PM
JeanFred claimed this task.

Closing as resolved − please file more tickets if there are other issues :)