
[Epic] Commons Impact Metrics Implementation
Open, Needs Triage; Public

Description

This epic covers all the work needed to develop and productionize the data pipeline for Commons Impact Metrics:
queries and Spark code, Airflow jobs, dumps, the public API, allow-list management, documentation, and applying insights from community feedback to the data model.

Step 0:
T358688: [Commons Impact Metrics] Understand feedback from Community and decide what changes to apply
T358695: [Commons Impact Metrics] Establish how we represent the allow-list

Step 1:
T358681: [Commons Impact Metrics] Productionize SparkSQL and Spark-Scala -> T358699: [Commons Impact Metrics] Create Airflow job that generates the datasets in Iceberg
T358679: [Commons Impact Metrics] Design API endpoints and Cassandra/Druid datasources

Step 2:
T358718: [Commons Impact Metrics] Create a new AQS service with all the endpoints
T358707: [Commons Impact Metrics] Create Airflow job that formats and loads the data to Cassandra for AQS

Step 3:
Continue T358718: [Commons Impact Metrics] Create a new AQS service with all the endpoints -> T358715: [Commons Impact Metrics] Add test data in AQS's test environments to back up new AQS service
T358719: [Commons Impact Metrics] Backfill datasets in Iceberg and Cassandra/Druid
T358722: [Commons Impact Metrics] Create API documentation

Step 4:
T358701: [Commons Impact Metrics] Create Airflow job that generates the public dumps -> T358710: [Commons Impact Metrics] Make dumps accessible from analytics.wikimedia.org
T358720: [Commons Impact Metrics] Create documentation of the main pipeline
T358712: [Commons Impact Metrics] Implement necessary tools and process to maintain the allow-list
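
The "->" arrows in the steps above can be read as ordering constraints between tasks. The following is a minimal sketch of that reading, assuming each arrow means "the left task should finish before the right one starts"; the `DEPENDS_ON` mapping and the `task_order` helper are hypothetical names used only for illustration, and tasks without an arrow are left out.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each key depends on the tasks in its set. Edges are taken only from the
# "->" arrows in the step list; reading them as hard dependencies is an
# assumption, not something the epic states explicitly.
DEPENDS_ON = {
    "T358699": {"T358681"},  # Iceberg Airflow job after Spark productionization
    "T358715": {"T358718"},  # AQS test data after the new AQS service
    "T358710": {"T358701"},  # dumps on analytics.wikimedia.org after the dumps job
}


def task_order(depends_on):
    """Return one valid order in which the tasks could be worked on."""
    return list(TopologicalSorter(depends_on).static_order())


order = task_order(DEPENDS_ON)
print(order)
```

Any order returned this way places T358681 before T358699, T358718 before T358715, and T358701 before T358710; the relative order of unrelated tasks is not fixed.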

Related Objects

Status | Assigned
Open | None
Resolved | mforns
Open | xcollazo
Resolved | VirginiaPoundstone
Resolved | xcollazo
Open | mforns
Open | SGupta-WMF
Open | xcollazo
Open | None
Open | None
Open | None
Open | None
Open | Milimetric
In Progress | SGupta-WMF
In Progress | SGupta-WMF
Open | SGupta-WMF
Open | SGupta-WMF
Open | Milimetric
Open | None
Open | None
Open | None
Resolved | BTullis
Open | None
