
[Dashboard] Paulscore calculation sums duplicated clicks on the same position
Closed, Resolved · Public

Description

Back on April 18th, @mpopov pinged me about a weird paulscore number on the dashboard.

Relative paulscore shouldn't be greater than 1, so I checked paulscore_approximations_fulltext_langproj_breakdown.tsv and saw:

| date | language | project | serch_sessions | pow_1 | pow_2 | pow_3 | pow_4 | pow_5 | pow_6 | pow_7 | pow_8 | pow_9 |
| 2017-03-26 | German | Wikibooks | 2 | 1.5 | 1.5 | 1.5 | 1.5 | 1.5 | 1.5 | 1.5 | 1.5 | 1.5 |

When F=0.1, the maximum possible score should be 1/(1-0.1) = 1.11111.
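
The bound comes from a geometric series. As a minimal sketch, assume paulscore adds F^k for a click on the result at 0-based position k. That definition is an assumption: it is consistent with the numbers in this task but is not taken from the golden code. Under it, a query whose every position is clicked exactly once approaches 1/(1-F). The helper name and the 20-result page size are illustrative only.

```python
def max_paulscore(f, n_results=20):
    """Largest score one query can reach when each position is clicked at most once.

    Assumes a click at 0-based position k contributes f**k (hypothetical helper,
    not the function used in wikimedia/discovery/golden).
    """
    return sum(f ** k for k in range(n_results))


bound = max_paulscore(0.1)
limit = 1 / (1 - 0.1)  # closed form of the infinite geometric series

print(round(bound, 5), round(limit, 5))
```

Any per-query score above this limit means at least one position was counted more than once.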

Then I forgot about this problem, but it somehow came back to mind today. I checked the EL table and saw that one of the two sessions had two clicks on the first result while the other had one click on the first result, so the paulscores of these two sessions are 2 and 1, which average to the 1.5 shown above. This means we should remove duplicated clicks on the same position for each query.
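
The effect of the proposed fix can be sketched with made-up events that mirror the two sessions described above. This is an illustration under assumptions, not the code in the golden patch: the session and query IDs are invented, positions are taken to be 0-based, and each click is assumed to contribute F^position to its session's score.

```python
from collections import defaultdict

F = 0.1

# Illustrative click events as (session_id, query_id, 0-based position).
# The IDs are hypothetical; only the click pattern follows the task description.
clicks = [
    ("s1", "q1", 0),
    ("s1", "q1", 0),  # repeated click on the same first result
    ("s2", "q1", 0),
]


def session_scores(events, f, dedupe):
    """Sum f**position per session, optionally counting each
    (session, query, position) combination only once."""
    if dedupe:
        events = set(events)
    scores = defaultdict(float)
    for session_id, _query_id, position in events:
        scores[session_id] += f ** position
    return dict(scores)


def mean(values):
    values = list(values)
    return sum(values) / len(values)


before = session_scores(clicks, F, dedupe=False)
after = session_scores(clicks, F, dedupe=True)

print(before, mean(before.values()))
print(after, mean(after.values()))
```

With these sample events the mean without deduplication is 1.5, the value in the table row, and after deduplication both sessions score 1.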

Query:

SELECT *
FROM TestSearchSatisfaction2_16270835_15423246
WHERE LEFT(timestamp, 8) = "20170326"
  AND wiki = 'dewikibooks'
  AND event_source = 'fulltext'
  AND event_action IN ('searchResultPage', 'click')

Event Timeline

Restricted Application added a subscriber: Aklapper. · Aug 9 2017, 10:53 PM
chelsyx updated the task description. · Aug 9 2017, 11:06 PM

Change 370977 had a related patch set uploaded (by Chelsyx; owner: Chelsyx):
[wikimedia/discovery/golden@master] Remove duplicated clicks on the same position for each query when computing paulscore

https://gerrit.wikimedia.org/r/370977

chelsyx updated the task description. · Aug 10 2017, 5:52 PM

Change 370977 merged by Bearloga:
[wikimedia/discovery/golden@master] Remove duplicated clicks on the same position for each query when computing paulscore

https://gerrit.wikimedia.org/r/370977

Thanks, good job!

debt closed this task as Resolved. · Aug 17 2017, 6:34 PM
debt triaged this task as Normal priority.