Last month (August snapshot) we had to tweak the Commons Impact Metrics job to be able to read from the newly modified *links tables (T404735).
This month, the job finished successfully, but the datasets are empty.
This task is to troubleshoot the empty output, find the root cause, and fix it.
Description
Details
Related Changes in GitLab:
| Title | Reference | Author | Source Branch | Dest Branch |
|---|---|---|---|---|
| Add sensor for linktarget table in Commons Impact Metrics DAG | repos/data-engineering/airflow-dags!1748 | mforns | add-linktarget-to-cim-sensors | main |
Related Objects
Event Timeline
mforns updated https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/1748
Add sensor for linktarget table in Commons Impact Metrics DAG
After some troubleshooting, I saw that when we added the linktarget table as a data source for Commons Impact Metrics, we forgot to add the corresponding sensor.
As a result, the September DAG run started before the linktarget data had been fully loaded into the data lake, so the CIM job produced empty results.
The MR above adds the proper sensor to the DAG.
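To illustrate the failure mode, here is a minimal plain-Python sketch of how a missing sensor lets a run start too early. It is not the actual DAG code from the MR: `DataLake`, `missing_inputs`, `can_start`, the `links_table_placeholder` table name, and the `"september"` snapshot label are all hypothetical names made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class DataLake:
    """Toy stand-in for the data lake: table name -> set of loaded snapshots."""
    loaded: dict[str, set[str]] = field(default_factory=dict)

    def mark_loaded(self, table: str, snapshot: str) -> None:
        self.loaded.setdefault(table, set()).add(snapshot)

    def is_loaded(self, table: str, snapshot: str) -> bool:
        return snapshot in self.loaded.get(table, set())


def missing_inputs(lake: DataLake, sensed_tables: list[str], snapshot: str) -> list[str]:
    """Tables that the (hypothetical) sensors are still waiting on."""
    return [t for t in sensed_tables if not lake.is_loaded(t, snapshot)]


def can_start(lake: DataLake, sensed_tables: list[str], snapshot: str) -> bool:
    """The job may start only when every sensed table has its snapshot loaded."""
    return not missing_inputs(lake, sensed_tables, snapshot)


SNAPSHOT = "september"  # placeholder label, not a real partition value

lake = DataLake()
lake.mark_loaded("links_table_placeholder", SNAPSHOT)  # linktarget not loaded yet

# Before the fix: the job reads linktarget, but no sensor waits for it.
sensors_before = ["links_table_placeholder"]
# After the fix: every table the job reads also has a sensor.
sensors_after = ["links_table_placeholder", "linktarget"]

started_too_early = can_start(lake, sensors_before, SNAPSHOT)
blocked_after_fix = not can_start(lake, sensors_after, SNAPSHOT)
still_waiting_on = missing_inputs(lake, sensors_after, SNAPSHOT)

lake.mark_loaded("linktarget", SNAPSHOT)
starts_once_loaded = can_start(lake, sensors_after, SNAPSHOT)
```

The point of the sketch is that the list of sensed tables has to match the list of tables the job actually reads. When a new input is added without a sensor, the run can still succeed, but against data that is not there yet.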
mforns merged https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/1748
Add sensor for linktarget table in Commons Impact Metrics DAG