High Level Metrics : Future Improvements
Closed, Resolved (Public)

Description

Some ideas for future improvements, including those that came up during our knowledge transfer of the high-level metrics calculation with Connie. We can break these out into separate tasks if required.

Q1 2021-Q4 2022

Instrument

  • ETL readers notebooks
  • ETL editors notebooks

Backfill

  • T287420 - active_editors by May 27th, 2022

Tables

Calculations

  • Set up the reporting code so that the report notebook (.ipynb) in the Editors GitHub repo compares against last year and also 3 years back
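
A minimal sketch of the comparison described above, assuming the metric is available as one value per month. The function name, the data layout, and the figures are hypothetical and are not taken from the Editors repo.

```python
from datetime import date


def compare_to_prior_years(monthly, month, years_back=(1, 3)):
    """Relative change of `month` against the same month N years earlier.

    `monthly` maps the first day of each month to a metric value.
    A year with no data, or with a zero baseline, maps to None.
    """
    current = monthly[month]
    changes = {}
    for n in years_back:
        # Day is always 1, so replacing the year never hits a leap-day problem.
        baseline = monthly.get(month.replace(year=month.year - n))
        changes[n] = None if not baseline else (current - baseline) / baseline
    return changes


# Made-up illustrative values, not real metrics.
active_editors = {
    date(2019, 3, 1): 80_000,
    date(2021, 3, 1): 100_000,
    date(2022, 3, 1): 90_000,
}
result = compare_to_prior_years(active_editors, date(2022, 3, 1))
```

Keeping `years_back` as a parameter means that further lookback windows could be added later without changing the report code that calls the function.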

Q1 2022

Refactoring

  • Update Editors calculation repo and post to GitHub
  • Update Platform queries and data pull repo and post to GitHub

Calculations

  • Add functions in the editing-movement notebook (03.report) to calculate net new content (non-Wikidata)
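
One possible shape for such a function, using the definition noted elsewhere in this task (net new non-Wikidata content = net new content minus new Wikidata content). This is a sketch only: the function names and dictionary keys are hypothetical placeholders, not the notebook's actual column names, and the figures are made up.

```python
def net_new_non_wikidata(net_new_content, new_wikidata_content):
    """Net new non-Wikidata content = net new content minus new Wikidata content."""
    return net_new_content - new_wikidata_content


def with_non_wikidata(rows):
    """Return copies of the monthly rows with the derived figure added.

    The keys are hypothetical placeholders, not the notebook's column names.
    """
    return [
        {
            **row,
            "net_new_non_wikidata_content": net_new_non_wikidata(
                row["net_new_content"], row["new_wikidata_content"]
            ),
        }
        for row in rows
    ]


# Made-up illustrative values, not real metrics.
rows = [
    {"month": "2022-01", "net_new_content": 1_000, "new_wikidata_content": 400},
    {"month": "2022-02", "net_new_content": 1_200, "new_wikidata_content": 700},
]
derived = with_non_wikidata(rows)
```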

Event Timeline

ldelench_wmf moved this task from Triage to Upcoming Quarter on the Product-Analytics board.
Iflorez updated the task description.
Iflorez updated the task description.
Mayakp.wiki edited subscribers, added: kzimmerman; removed: cchen.
Iflorez updated the task description.
Iflorez updated the task description.
Iflorez updated the task description.
Iflorez updated the task description.

These ideas, including those that came up during our knowledge transfer of the high-level metrics calculation with Connie, may be useful to work on in the future:

Instrument

  • ETL remaining readers - by end of June 2022

Tables

  • Also need to look at Connie’s tables, cchen.repo_active_editors and cchen.new_editors. When these are moved, Superset (T287284) and the wiki comparison (T294653) also need to be updated.

refactoring

  • Refactor: combine the different inserts for active_editors

These are no longer needed:
Tuning Session reporting TM sheet in the Tuning Session notebook
[] Tuning Session: automate the calculations and output calculated for readers
~~[] Tuning Session: automate the calculations and output calculated for platform

Calculations
[] Make the readers-editors Google Sheet more reader friendly. Remove the mobile-heavy metrics tab and add new columns such as YoY, FY average, quarterly average, FY YTD average, etc. (reference document)
Diversity calculations
- Adding diversity (see T295332) in repo_active_editors
[] Add diversity for new and returning active editors
Diversity sheet
[] net new non-Wikidata content = net new content MINUS new Wikidata content
[] YoY - editors sheet (Jupyter notebook has all data) - sum rows for the previous year (net new content MINUS new Wikidata content)
Platform Evolution sheet calculations
[] % from baseline = status column
[] % of Wikidata items ---> see Wikidata items being reused (status)
[] Update the rpt repos to calculate data for the Movement metrics tables preparation sheets file~~
[] Work on consolidating the MMPT (Movement Metrics Preparation Table) sheets and making them more reader friendly (tables stay in MMPT only...keep YoY)

Viz
[] Fix: for some months now, some of the editor Global North and Global South charts haven't been showing up. For example, the Global North charts aren't showing in this March 2020 version of the repo: https://github.com/wikimedia-research/Editing-movement-metrics/blob/b64d7ffee70a4482a84787ae78343e0136d7ae90/03-report.ipynb
Platform Evolution R Viz
[] Start the y-axis at 0
[] Ticks: major breaks every 50 million
[] Ticks: minor breaks every 10 million
[] Remove x lines
[] Geom point: plot only the latest point and any others that should be especially highlighted
[] Colors: Switch to gray for previous year and only use blue for current year
Discuss
[] Use a Google Sheets macro to update metrics in the correct format (instead of manual updates)~~
We are no longer going to update the board deck with detailed analysis. Only summary slides will be added every month.
[] Add notes and analysis to the board staging deck

Iflorez updated the task description.