Article Feedback Page - Relevance Score Math Questions
Closed, ResolvedPublic

Description

The math used to update the relevance score seems strange, and in some cases it doesn't add up. It sometimes appears that instead of incrementing the relevance score by each individual action score, we are replacing the current score with that individual score.

This needs more research to confirm specific use cases that can be reproduced consistently, but I wanted to start this ticket so that we test this issue more rigorously.


Version: unspecified
Severity: normal

Details

Reference
bz36768

Event Timeline

bzimport raised the priority of this task to High. (Nov 22 2014, 12:29 AM)
bzimport added a project: ArticleFeedbackv5.
bzimport set Reference to bz36768.

reha wrote:

This issue occurred because the score was being replaced rather than incremented. Fixed and submitted to gerrit:

https://gerrit.wikimedia.org/r/7728
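
As a rough illustration of the difference between replacing and incrementing, here is a minimal Python sketch. The function names are hypothetical and are not taken from the ArticleFeedbackv5 code or from the gerrit change; the point values are ones quoted elsewhere on this ticket (-5 for Flag as Abuse, +50 for Featured).

```python
def apply_action_replacing(current_score, action_points):
    """Suspected faulty behavior: the action's points overwrite the score."""
    return action_points


def apply_action_incrementing(current_score, action_points):
    """Intended behavior: the action's points are added to the score."""
    return current_score + action_points


# A post currently at -5 (after a Flag as Abuse) is then Featured (+50).
print(apply_action_replacing(-5, 50))     # prints 50
print(apply_action_incrementing(-5, 50))  # prints 45
```

This only models the simplest case (a pure overwrite); it is not claimed to reproduce every discrepancy reported on this ticket.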

reha wrote:

Reopened so Fabrice can find it.

reha wrote:

*** Bug 36865 has been marked as a duplicate of this bug. ***

I tested these issues today on both prototype and en-wiki, and continue to find problems with the way we are calculating the relevance score.

I documented the errors I found in three separate tests, which are detailed in the attached spreadsheet and screenshots, as well as summarized below:

  Test 1: Existing post on Prototype (#545):
    Action 1: Mark as Helpful (+1 point) > Actual score = 1 | Desired = 1
    Action 2: Mark as Unhelpful (-1 point) > Actual score = 0 | Desired = 0
    Action 3: Flag as Abuse (-5 points) > Actual score = -6 | Desired = -5 (ERROR)

  Test 2: New post on Prototype (#546):
    Action 1: Mark as Helpful (+1 point) > Actual score = 1 | Desired = 1
    Action 2: Mark as Unhelpful (-1 point) > Actual score = 2 | Desired = 0 (ERROR)

  Test 3: New post on EN-Wiki (#100,023):
    Action 1: Mark as Helpful (+1 point) > Actual score = 1 | Desired = 1
    Action 2: Mark as Unhelpful (-1 point) > Actual score = 0 | Desired = 0
    Action 3: Flag as Abuse (-5 points) > Actual score = -5 | Desired = -5
    Action 4: Featured (+50 points) > Actual score = 50 | Desired = 45 (ERROR)
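
The desired values follow from plain addition: each action's points are added to the running score. A minimal Python sketch of that arithmetic, assuming every post starts at 0 and using made-up dictionary keys for the actions (the point values are the ones listed in the tests above):

```python
# Point values quoted in the tests above; the keys are hypothetical labels.
ACTION_POINTS = {
    "helpful": 1,
    "unhelpful": -1,
    "flag_abuse": -5,
    "feature": 50,
}


def desired_scores(actions, start=0):
    """Return the running relevance score after each action, by plain addition."""
    scores = []
    score = start  # assumption: a post with no prior actions starts at 0
    for action in actions:
        score += ACTION_POINTS[action]
        scores.append(score)
    return scores


# The four actions from Test 3, in order.
print(desired_scores(["helpful", "unhelpful", "flag_abuse", "feature"]))
# prints [1, 0, -5, 45]
```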

Reha, please confirm whether or not the version I tested on prototype was the latest -- and whether or not you can reproduce these errors on your end.

In any case, I invite you to use the attached spreadsheet when debugging and testing this relevance score. It has formulas that encode the desired math and make it easy to detect errors in the actual score. Each test has its own set of columns, with each possible action on a separate row. To enter a test action, add a '1' in that test's 'action' column, which automatically updates the expected score in the 'desired' column; then refresh the page and enter the actual relevance score from the permalink page. Bold any discrepancies and highlight them in red.
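
For anyone who prefers a script to the spreadsheet, the same check can be sketched in Python. This is only a guess at how the formulas work (points multiplied by the '1' marks and summed), not a copy of them; the action keys and function names are hypothetical.

```python
# Point values as reported on this ticket; one entry per spreadsheet row.
ACTION_POINTS = {
    "helpful": 1,
    "unhelpful": -1,
    "flag_abuse": -5,
    "feature": 50,
}


def desired_score(action_marks):
    """Sum the points of every action marked with a 1 in the 'action' column."""
    return sum(ACTION_POINTS[name] * mark for name, mark in action_marks.items())


def check_test(action_marks, actual_score):
    """Compare the actual score against the desired one and flag a mismatch."""
    desired = desired_score(action_marks)
    return {
        "desired": desired,
        "actual": actual_score,
        "error": actual_score != desired,
    }


# Test 2 from this ticket: Helpful and Unhelpful marked, actual score read as 2.
print(check_test({"helpful": 1, "unhelpful": 1}, actual_score=2))
# prints {'desired': 0, 'actual': 2, 'error': True}
```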

Created attachment 10605
AFT Relevance Test Matrix

This spreadsheet consists of a test matrix for tracking and comparing relevance score data, all in one place.

It includes automated formulas for calculating the desired score for each action, so we can compare it to the actual relevance score. Feel free to upload it to Google Docs, if you have time, so we can share this spreadsheet for all future tests.

See attached screenshots for each bug / test number.

Created attachment 10606
Test 1: Article-Feedback-Relevance-Bug1-0545.png

Created attachment 10607
Test 2: Article-Feedback-Relevance-Bug2-0546.png

Created attachment 10608
Test 3: Article-Feedback-Relevance-Score-Bug3-Feature.png