
Correct uniques computation to not exclude countries that don't have either underestimates or offset
Closed, Resolved · Public · 3 Story Points

Description

Currently we join underestimates and offsets using an INNER JOIN, so we lose data whenever either the underestimate or the offset is missing.

This has little effect on the data, as it only affects projects with a very small number of uniques (we know the data quality is lower for uniques <1000, so we recommend not using the data in those cases). See: https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Unique_Devices/Last_access_solution#Data_Quality_Analysis
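
To illustrate the join problem described above, here is a minimal sketch in plain Python rather than the actual Hive queries from analytics/refinery. The project and country keys, the numbers, and both function names are hypothetical. It also assumes that the final estimate is the sum of the underestimate and the offset, and that a missing part can be treated as 0; the merged patches may handle this differently.

```python
# Hypothetical per-(project, country) counts; keys and numbers are illustrative only.
underestimates = {("en.wikipedia", "US"): 1000, ("fr.wikipedia", "FR"): 400}
offsets = {("en.wikipedia", "US"): 50, ("xx.wikipedia", "IS"): 7}


def inner_join_total(under, offset):
    """Sum both parts only for keys present on BOTH sides (INNER JOIN behaviour)."""
    return {k: under[k] + offset[k] for k in under.keys() & offset.keys()}


def full_outer_join_total(under, offset):
    """Sum both parts for keys present on EITHER side; a missing part counts as 0.

    Assumption: treating a missing part as 0 is acceptable for this sketch.
    """
    return {k: under.get(k, 0) + offset.get(k, 0) for k in under.keys() | offset.keys()}


inner = inner_join_total(underestimates, offsets)
outer = full_outer_join_total(underestimates, offsets)

# Keys silently lost by the inner join because one side has no row for them.
dropped = sorted(set(outer) - set(inner))

print("inner join:", inner)
print("full outer join:", outer)
print("dropped by inner join:", dropped)
```

In this sketch the inner join keeps only the key that has both an underestimate and an offset, while the outer variant keeps all three keys. The dropped keys are the ones with small counts on a single side, which matches the note above that the effect on the data is small.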

Event Timeline

Restricted Application added a subscriber: Aklapper. · May 18 2017, 11:13 AM

Change 354214 had a related patch set uploaded (by Joal; owner: Joal):
[analytics/refinery@master] Correct last uniques oozie jobs (wrong join)

https://gerrit.wikimedia.org/r/354214

Nuria updated the task description. · May 18 2017, 12:49 PM
Nuria edited projects, added Analytics-Kanban; removed Analytics. · May 18 2017, 12:53 PM
JAllemandou set the point value for this task to 3.
JAllemandou moved this task from Next Up to Ready to Deploy on the Analytics-Kanban board.

Change 354214 merged by Joal:
[analytics/refinery@master] Correct last uniques oozie jobs (wrong join)

https://gerrit.wikimedia.org/r/354214

Change 355387 had a related patch set uploaded (by Joal; owner: Joal):
[analytics/refinery@master] Correct last_access_uniques daily/monthly bug

https://gerrit.wikimedia.org/r/355387

Change 355387 merged by Joal:
[analytics/refinery@master] Correct last_access_uniques daily/monthly bug

https://gerrit.wikimedia.org/r/355387

Nuria closed this task as Resolved. · May 30 2017, 10:43 PM