We have successfully begun a cross-team project examining Incubator and language representation across Wikimedia projects.
The project includes the following goals:
- Develop metrics for the state of languages at Wikimedia
- Develop metrics for better understanding Incubator
- Develop knowledge gaps metrics for measuring language gaps
This task addresses the goal of developing knowledge gaps metrics for measuring language gaps.
For Q2/Q3, I will:
- Finish integrating primary data via wrangling scripts in GitLab repo
- Acquire the needed secondary data
- Integrate secondary data via wrangling scripts in GitLab repo
- Determine a home for the integrated dataset(s) (Hive? keep in GitLab?)
- Begin calculations for metrics
- QA
This task depends on task T348249, which must be resolved before the needed secondary data can be acquired.
January 2024 update: Because the dependency is still unresolved, this task is expected to span multiple quarters and remains blocked until the blocker is resolved.

