
Portal Dashboard: Add de-duplication step to data collection
Closed, Invalid | Public | 2 Estimated Story Points

Description

@JGirault noticed a discrepancy between the clickthrough rate on the Portal dashboard and the rate reported in the most recent Portal A/B test analysis.

The only difference is that the data used in the A/B test report underwent an additional cleaning step in which duplicated events were removed. The clickthrough rate surfaced on the Portal dashboard is therefore calculated from faulty data that contains a large number of duplicate events (any session should have at most 1 landing event and 1 clickthrough event).
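A minimal sketch of what such a de-duplication step could look like, assuming each event is a record with hypothetical `session_id`, `action`, and `timestamp` fields (these names are illustrative and not taken from the actual EventLogging schema). It keeps only the earliest event per session and action, so each session contributes at most 1 landing and 1 clickthrough:

```python
def deduplicate_events(events):
    """Keep the earliest event for each (session_id, action) pair.

    `events` is a list of dicts with the hypothetical keys
    "session_id", "action", and "timestamp".
    """
    seen = set()
    kept = []
    # Process events in time order so the first occurrence is the one kept.
    for event in sorted(events, key=lambda e: e["timestamp"]):
        key = (event["session_id"], event["action"])
        if key not in seen:
            seen.add(key)
            kept.append(event)
    return kept


def clickthrough_rate(events):
    """Clickthrough events divided by landing events (0.0 if no landings)."""
    landings = sum(1 for e in events if e["action"] == "landing")
    clicks = sum(1 for e in events if e["action"] == "clickthrough")
    return clicks / landings if landings else 0.0
```

Comparing `clickthrough_rate(events)` with `clickthrough_rate(deduplicate_events(events))` on the same raw data would show how much the duplicates move the reported rate.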

While work is being done on T133730 to correct this on the event logging side, we should fix how the action/clickthrough data is collected and processed for the Portal dashboards.

We will be able to backfill about a month of "corrected data" (~57% new vs ~34% old) and this change should be annotated on the dashboard.

Event Timeline

mpopov set the point value for this task to 2. Apr 26 2016, 7:35 PM
debt renamed this task from Add de-duplication step to Portal dashboard data collection to Portal Dashboard: Add de-duplication step to data collection. Apr 26 2016, 7:47 PM
debt triaged this task as High priority.
debt added a project: Discovery-ARCHIVED.
debt updated the task description. (Show Details)
debt added subscribers: JGirault, Jdrewniak.

I don't think this task is valid anymore since there's actually no bug in the EL and it's working as intended. The only problem is that we're currently not distinguishing between first visits and subsequent visits. The CTR on the dashboard is the overall CTR. If someone goes to the Portal 50 times but only clicks through 40 times, those 50 and 40 events will be used in the CTR calculation.

We might want to, I think, cancel this task or repurpose it to instead set up a new metric to track: first impression CTR (which is what I've been using to assess our A/B tests). Just a thought.
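To illustrate the difference between the two metrics, here is a rough sketch. The definition of first impression CTR used here is an assumption (the share of sessions whose first landing is immediately followed by a clickthrough), and the `session_id`, `action`, and `timestamp` field names are hypothetical rather than the real schema:

```python
from collections import defaultdict


def overall_ctr(events):
    """All clickthrough events divided by all landing events."""
    landings = sum(1 for e in events if e["action"] == "landing")
    clicks = sum(1 for e in events if e["action"] == "clickthrough")
    return clicks / landings if landings else 0.0


def first_impression_ctr(events):
    """Share of sessions whose first landing led directly to a clickthrough.

    Assumed definition: only the first landing in each session counts,
    and it is a success if the very next event in that session is a
    clickthrough.
    """
    actions_by_session = defaultdict(list)
    for event in sorted(events, key=lambda e: e["timestamp"]):
        actions_by_session[event["session_id"]].append(event["action"])

    sessions = 0
    clicked = 0
    for actions in actions_by_session.values():
        if "landing" not in actions:
            continue  # sessions without a landing are ignored
        sessions += 1
        first = actions.index("landing")
        # The single event right after the first landing, if there is one.
        if actions[first + 1:first + 2] == ["clickthrough"]:
            clicked += 1
    return clicked / sessions if sessions else 0.0
```

For the example above (one session with 50 landings and 40 clickthroughs), `overall_ctr` gives 0.8 no matter which visits had the clicks, while `first_impression_ctr` is either 1.0 or 0.0 depending only on whether the first visit ended in a clickthrough.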

Deskana subscribed.

I don't think this task is valid anymore since there's actually no bug in the EL and it's working as intended. The only problem is that we're currently not distinguishing between first visits and subsequent visits. The CTR on the dashboard is the overall CTR. If someone goes to the Portal 50 times but only clicks through 40 times, those 50 and 40 events will be used in the CTR calculation.

Closing as invalid based on this.

We might want to, I think, cancel this task or repurpose it to instead set up a new metric to track: first impression CTR (which is what I've been using to assess our A/B tests). Just a thought.

Can you file a new task for that one? We can prioritise it in our meeting tomorrow.

This comment was removed by mpopov.