Help the campaigns team plan the move to Airflow (see the slide deck). Once the data pipelines run on Airflow, the team can rely on Superset for data dashboards.
- Identify data categories available (data assets)
- Identify data to potentially collect within the data assets
- Test items in the potential collection list that have not been previously utilized in reporting/queries
- Device data (T367840)
- Mobile vs. desktop (T359112)
- Answers visualization (out of scope due to complexity)
- Ambassador action output
- Share the plan and work in progress
- Cross team collaboration & planning
- Rebecca Maung & Arina
- Community team(s)
- Review planning on future metrics (T365292)
- Request adding special event pages to pageview whitelist (T368303)
- Consider
- Test data from T365292 --> may be something to consider in upcoming quarters
- Test data from T365407 --> data now available on demand at Pageviews Analysis
- Finalize metric list
- Submit LS3C request as needed
- Review methods to connect MariaDB directly with Airflow, together with KC, per T362612, T362615, etc.
- KC links compilation, Airflow notes
- Technical planning
- Resources: Assess the resources required, such as data sources, computational requirements, and external systems that will interact with the pipeline.
- Workflow: Determine the tasks that need to be automated, their dependencies, and the order in which they should be executed.
- https://wikitech.wikimedia.org/wiki/Data_Platform/Dataset_creation
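As a planning aid for the Workflow item above, the sketch below shows one way to write down tasks and their dependencies and derive a valid execution order before anything is built in Airflow. It uses only the Python standard library (`graphlib`), not Airflow itself. The task names are hypothetical placeholders, not the team's actual pipeline steps.

```python
from graphlib import TopologicalSorter

# Hypothetical task names (assumptions for illustration only).
# Each task maps to the set of tasks that must finish before it can start.
dependencies = {
    "extract_pageviews": set(),
    "extract_device_data": set(),
    "join_campaign_metrics": {"extract_pageviews", "extract_device_data"},
    "publish_to_dashboard_table": {"join_campaign_metrics"},
}


def execution_order(deps):
    """Return the tasks in an order that respects every dependency.

    Raises graphlib.CycleError if the dependencies contain a cycle.
    """
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

Listing dependencies this way makes circular requirements visible early, since `TopologicalSorter` raises `CycleError` when tasks depend on each other in a loop. The same mapping could later be translated into task dependencies in an Airflow DAG.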
WIP Planning Notes Document
Analytics/Systems/Superset
Superset