Additional features in Pageview Loss calculation
Closed, Declined · Public

Description

In T304876 we created an R function to calculate Pageview data loss.
Currently, we can filter and calculate data loss by dimensions such as country_code, access_method, and referer_class.

We want to use this function for adding data loss estimates in the Readers metrics charts

De-scoped
We want to be able to get data loss estimates by new dimensions, such as project.

Event Timeline

Mayakp.wiki renamed this task from Additional dimensions in Pageview Loss calculation to Additional features in Pageview Loss calculation. Apr 19 2022, 5:15 PM
Mayakp.wiki updated the task description.
mpopov triaged this task as High priority. Apr 19 2022, 5:15 PM
mpopov edited projects, added Product-Analytics (Kanban); removed Product-Analytics.

This is going to be pretty easy: it only requires changing the queries and re-querying the data. The function itself is flexible enough to accommodate the additional dimension without problems.

Added scope to this task, since @mpopov is already working on it.

  1. We want to use this function for adding data loss estimates in the Readers metrics charts

Update: this needs to be re-scoped, because adding the project dimension makes the data much, much bigger.

Should it just be the en6 countries, since that's the intended application? Should it be some projects (perhaps just the Wikipedias?) or just enwiki?

We want to use this function for adding data loss estimates in the Readers metrics charts

Regarding increasing the scope to package it all up for use with metrics charts… let's not? There's currently the one chart and we already have a working solution. Until there are more charts and a clear need for a different solution, let's dial back the scope here.

mpopov removed mpopov as the assignee of this task. May 9 2022, 7:23 PM
mpopov edited projects, added Product-Analytics; removed Product-Analytics (Kanban).
mpopov moved this task from Triage to Current Quarter on the Product-Analytics board.

@Mayakp.wiki: What do you want to do here?

@mpopov :

  1. OK to table the addition of the 'project' dimension until we have a better way to handle the size of the data. I would prioritize adding enwiki if we get any other (urgent) requests from Fundraising (which shouldn't be any time soon).
  2. Using the function for visuals

It would be useful to keep this in the Current Quarter column and discuss further. @kzimmerman and I have been talking about what it would mean to 'close out pageview data loss', and I feel this package could be a potential deliverable, because we will have to keep referring to the data loss in any future pageview analyses. I also see a need to include the data loss on other charts, such as referrer class and access method.

mpopov removed mpopov as the assignee of this task. Dec 6 2022, 5:30 PM
mpopov edited projects, added Product-Analytics; removed Product-Analytics (Kanban).

Declining this task, as our emphasis has shifted to unique devices, and the amount of work needed would be high.