Help the campaigns team plan its move to [[ https://wikitech.wikimedia.org/wiki/Data_Engineering/Systems/Airflow | Airflow ]]. See the [[ https://docs.google.com/presentation/d/1c1EUdYXpcepwuibSbwJMi6li3HSB1boZICvsJzFWf00/edit#slide=id.p | slide deck ]]. Once the data pipelines run on Airflow, the team can rely on [[ https://wikitech.wikimedia.org/wiki/Superset | Superset ]] for data dashboards.
[x] Identify data categories available (data assets)
[x] Identify data to potentially collect within the data assets
[x] Test items on the potential collection list that have not previously been used in reporting or queries
- Device data (T367840)
- Mobile vs desktop (T359112)
- Answers visualization (out of scope due to complexity)
- Ambassador action output
[x] Share the plan and work in progress
[x] Cross-team collaboration and planning
- Rebecca Maung & Arina
- Community team(s)
[x] Review planning for future metrics (T365292)
[x] Request adding special event pages to the pageview whitelist (T368303)
[] Consider
- test data from T365292
- test data from T365407
[] Finalize metric list
[] Submit LS3C request as needed
[] With KC, review methods for connecting MariaDB directly to Airflow (per T362612)
- [[ https://docs.google.com/document/d/1FtQ6dfGVJhdsTOXizrOKAqrxsg7eyUJyiJoGziL2hVo/edit | KC links compilation ]], [[ https://docs.google.com/document/d/1jp_JUTV1BB3teg4uOTIRPq6RAjFE9-V892ZgeVEaNMc/edit | Airflow notes ]]
[] Technical planning
- Resources: Assess the resources required, such as data sources, computational requirements, and external systems that will interact with the pipeline.
- Workflow: Determine the tasks that need to be automated, their dependencies, and the order in which they should be executed.
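As a starting point for the workflow item, the sketch below shows one way to write down tasks and their dependencies and derive a valid execution order before committing anything to an Airflow DAG. It uses only the Python standard library (`graphlib`). All task names are hypothetical placeholders, not the team's actual pipeline steps.

```python
from graphlib import TopologicalSorter

# Hypothetical task names for illustration only.
# Each key maps to the set of tasks that must finish before it can run.
PIPELINE = {
    "extract_event_pageviews": set(),
    "extract_device_data": set(),
    "compute_campaign_metrics": {"extract_event_pageviews", "extract_device_data"},
    "publish_for_superset": {"compute_campaign_metrics"},
}


def execution_order(dependencies):
    """Return one valid run order for the given dependency map.

    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for step, task in enumerate(execution_order(PIPELINE), start=1):
        print(step, task)
```

The same dependency map should translate fairly directly into task ordering in an Airflow DAG once the real tasks are agreed on. Tasks with no dependency on each other (the two extract steps here) may appear in either order.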
[[ https://docs.google.com/document/d/1Hf-T8FO1vHq5ENdBXYyDyPwLlLHfetvDBE_kyaszhUY/edit | WIP Planning Notes Document ]]
[[ https://wikitech.wikimedia.org/wiki/Analytics/Systems/Superset | Analytics/Systems/Superset ]]
[[ https://wikitech.wikimedia.org/wiki/Superset | Superset ]]