Correct uniques computation to not exclude countries that don't have either underestimates or offset
Closed, Resolved | Public | 3 Estimated Story Points

Description

Currently we join underestimates and offsets using an INNER JOIN, meaning we lose data for any country that has no underestimate or no offset.

This has little effect on the data, as it only affects projects with a very small number of uniques (we know the data quality is lower for uniques < 1000, so we recommend the data not be used for those). See: https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Unique_Devices/Last_access_solution#Data_Quality_Analysis
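
To illustrate the data loss described above, here is a minimal sketch in plain Python. It is not the refinery query: the country codes, the counts, and both helper functions are hypothetical, and treating a missing side as 0 in a full-outer-join style merge is an assumption about one reasonable fix, not a description of the merged patch.

```python
# Hypothetical per-country counts; names and numbers are illustrative only.
underestimates = {"FR": 120, "DE": 300, "IS": 15}  # "IS" has no offset
offsets = {"FR": 10, "DE": 25, "MT": 4}            # "MT" has no underestimate


def inner_join_total(under, offset):
    """Mimic an INNER JOIN: keep only countries present in both inputs."""
    return {c: under[c] + offset[c] for c in under.keys() & offset.keys()}


def full_outer_join_total(under, offset):
    """Mimic a FULL OUTER JOIN: keep countries present in either input.

    Assumption: a missing underestimate or offset counts as 0.
    """
    return {
        c: under.get(c, 0) + offset.get(c, 0)
        for c in under.keys() | offset.keys()
    }


inner = inner_join_total(underestimates, offsets)
outer = full_outer_join_total(underestimates, offsets)

# Countries dropped by the inner join but kept by the outer join.
lost = sorted(outer.keys() - inner.keys())
print("inner:", dict(sorted(inner.items())))
print("outer:", dict(sorted(outer.items())))
print("lost by inner join:", lost)
```

In this toy data the inner join silently drops the two countries that appear on only one side, which is the kind of loss the task describes. The dropped rows carry small counts, consistent with the note that the effect on overall numbers is minor.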

Event Timeline

Change 354214 had a related patch set uploaded (by Joal; owner: Joal):
[analytics/refinery@master] Correct last uniques oozie jobs (wrong join)

https://gerrit.wikimedia.org/r/354214

JAllemandou set the point value for this task to 3.
JAllemandou moved this task from Next Up to Ready to Deploy on the Analytics-Kanban board.

Change 354214 merged by Joal:
[analytics/refinery@master] Correct last uniques oozie jobs (wrong join)

https://gerrit.wikimedia.org/r/354214

Change 355387 had a related patch set uploaded (by Joal; owner: Joal):
[analytics/refinery@master] Correct last_access_uniques daily/monthly bug

https://gerrit.wikimedia.org/r/355387

Change 355387 merged by Joal:
[analytics/refinery@master] Correct last_access_uniques daily/monthly bug

https://gerrit.wikimedia.org/r/355387