Good faith test does not work properly
Closed, Resolved · Public

Description

  1. Do a search with only the Human filter on.
  2. Highlight "Very likely good" green
  3. Highlight "May be bad" yellow
  4. Highlight "Likely bad" blue

Expected result: every result highlighted blue (Likely bad) should also be yellow (May be bad), because Likely is a subset of May. There should probably be some green-yellow blends, since those two definitions overlap. Green and blue should never blend, because those two definitions do not overlap.
Actual result: there are many blue-green blends, which should be impossible. One result has all three colors, which should also be impossible; apart from that one, there are no yellow-green blends.
(see screenshot below)
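
The expected behaviour above can be sketched as a check on score ranges. This is only an illustration: the `ScoreRange` class, the `highlight_colors` helper, and all numeric cut-offs are hypothetical and are not the thresholds ORES actually reports. It assumes each filter matches a closed range of good-faith probabilities.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ScoreRange:
    """Closed range of good-faith probabilities matched by one filter."""
    low: float
    high: float

    def contains(self, score: float) -> bool:
        return self.low <= score <= self.high

    def overlaps(self, other: "ScoreRange") -> bool:
        return self.low <= other.high and other.low <= self.high

    def is_subset_of(self, other: "ScoreRange") -> bool:
        return other.low <= self.low and self.high <= other.high


# Hypothetical cut-offs, chosen only to illustrate the expected relationships.
FILTERS = {
    "green": ScoreRange(0.35, 1.0),   # "Very likely good"
    "yellow": ScoreRange(0.0, 0.65),  # "May be bad"
    "blue": ScoreRange(0.0, 0.15),    # "Likely bad"
}


def highlight_colors(score: float) -> List[str]:
    """Return the highlight colors that apply to one good-faith score."""
    return [color for color, rng in FILTERS.items() if rng.contains(score)]
```

With ranges shaped like these, blue always implies yellow, green and yellow can blend, and green and blue never can, which is the behaviour the report expects.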

Event Timeline

Restricted Application added a subscriber: Aklapper. · Mar 23 2017, 8:57 PM

This is because we're extracting the thresholds incorrectly from the ORES API: the thresholds for "false" qualities need to be inverted.
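
A minimal sketch of what such an inversion could look like, assuming the filters compare against the probability of the "true" outcome and that a threshold reported for the "false" outcome is expressed as a probability of "false". The function name and the numbers are hypothetical, and this is not the code from the patch.

```python
from typing import Tuple


def to_true_probability_range(outcome: str, threshold: float) -> Tuple[float, float]:
    """Convert a threshold on P(outcome) into a range on P(true).

    Assumption: P(true) + P(false) == 1, so P(false) >= t is the same
    condition as P(true) <= 1 - t.
    """
    if outcome == "true":
        return (threshold, 1.0)
    if outcome == "false":
        # The inversion: a lower bound on P(false) becomes an upper bound on P(true).
        return (0.0, 1.0 - threshold)
    raise ValueError(f"unknown outcome: {outcome!r}")


def matches(score_true: float, outcome: str, threshold: float) -> bool:
    """Check whether an edit's P(true) score falls inside the converted range."""
    low, high = to_true_probability_range(outcome, threshold)
    return low <= score_true <= high
```

Under these assumptions, an edit with P(true) = 0.9 does not match a "false" threshold of 0.75, because the converted range is 0.0 to 0.25. Comparing 0.9 directly against 0.75 without inverting would wrongly match, which is one way a "very likely good" edit could also end up highlighted as "likely bad".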

Change 344553 had a related patch set uploaded (by Catrope):
[mediawiki/extensions/ORES@master] Stats: Invert "false" thresholds so they're correct

https://gerrit.wikimedia.org/r/344553

Change 344553 merged by jenkins-bot:
[mediawiki/extensions/ORES@master] Stats: Invert "false" thresholds so they're correct

https://gerrit.wikimedia.org/r/344553

Change 344555 had a related patch set uploaded (by Catrope):
[mediawiki/extensions/ORES@wmf/1.29.0-wmf.17] Stats: Invert "false" thresholds so they're correct

https://gerrit.wikimedia.org/r/344555

Change 344555 merged by jenkins-bot:
[mediawiki/extensions/ORES@wmf/1.29.0-wmf.17] Stats: Invert "false" thresholds so they're correct

https://gerrit.wikimedia.org/r/344555

Mentioned in SAL (#wikimedia-operations) [2017-03-23T23:12:46Z] <thcipriani@tin> Synchronized php-1.29.0-wmf.17/extensions/ORES: SWAT: [[gerrit:344555|Stats: Invert "false" thresholds so they are correct]] T161250 (duration: 00m 52s)

Etonkovidova added a comment. · Edited · Mar 24 2017, 10:09 PM

Checked on enwiki (wmf.17).
The screenshot below shows the same selection of filters as reported:

  • no records with three dots
  • no records with the combination "Very likely good faith" and "Likely bad faith"

Note

  • the two records at the top of the list do not display highlighting. There is a delay (~1 min) before highlighting is applied to scored results, so if you refresh the RC page, the most recent records won't be highlighted. Should we optimize this, or somehow alert users that highlighting takes some time to appear?
  • there are two overlaps: between "Very likely good faith" and "May be bad faith", and between "May be bad faith" and "Likely bad faith". Per our conversation with @Catrope, I filed T161275: ORES thresholds and overlaps optimization.

QA Recommendation: Resolve

jmatazzoni closed this task as Resolved. · Mar 27 2017, 3:13 PM