Perform a test-retest reliability assessment on US data, similar to the one done on the global data, to decide whether we have confidence in our method for calculating pageview data loss estimates.
Description
Status | Subtype | Assigned | Task
---|---|---|---
Resolved | | Mayakp.wiki | T311560 Pageview Data loss
Resolved | | Mayakp.wiki | T306657 Cross-validate estimates for pageview data loss derived from February data with new March data
Resolved | | Mayakp.wiki | T310016 Test-retest reliability assessment on US data
Event Timeline
Calculated the correlation on US-enwiki-user data: correlation coefficient = 0.997.
Recalculated the correlation for US only and found consistent results: correlation coefficient = 0.971.
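For readers unfamiliar with the check, here is a minimal sketch of a test-retest correlation, assuming a Pearson correlation between two sets of estimates from different time periods. The task does not state which coefficient was used, and the function name `pearson_r` and the sample values are hypothetical, made up purely for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical values (NOT from the actual analysis): the same quantity
# estimated in a "test" period and again in a "retest" period.
test_estimates = [0.10, 0.20, 0.30, 0.40]
retest_estimates = [0.12, 0.19, 0.33, 0.41]

r = pearson_r(test_estimates, retest_estimates)
print(f"test-retest correlation: {r:.3f}")
```

A coefficient close to 1 suggests the method gives stable estimates across the two periods, which is the sense in which the values reported above support confidence in the method.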
Thank you @Mayakp.wiki ! I know we plan to review when we're both back online, but I wanted to leave a note since I'll be out next week. Those numbers are consistent with the correlations on the global dataset. Based on what we're seeing so far, I think we should propose estimating the data loss using the average loss on affected nodes in February.
@kzimmerman and I met today and discussed next steps. All future tasks/work can be tracked on the epic T311560.