Blocked by T368745: MediaWiki reconciliation API and event enrichment pipeline, and T368753: Implement production mechanism that emits (wiki_db, revision_id) pairs for missing or inaccurate rows.
Consuming these new events will require a new MERGE INTO job, very similar to the existing events_merge_into.py. Since the schema should be the same, changing the source table in that pipeline may be all that is needed.
In this task we should:
- Implement the PySpark MERGE INTO job
- Incorporate this job into the Airflow DAG created in T368753
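
As a starting point for discussion, here is a minimal sketch of what the MERGE INTO step could look like. It is not the logic of the existing events_merge_into.py, which may handle deletes, ordering, or column mapping differently. The function names `build_merge_sql` and `run_merge`, the table names in the usage comment, and the choice of (wiki_db, revision_id) as the merge key are all assumptions made for illustration. It also assumes the target table format accepts `UPDATE SET *` and `INSERT *` in Spark SQL.

```python
def build_merge_sql(target_table, source_table, key_columns=("wiki_db", "revision_id")):
    """Build a MERGE INTO statement that upserts source rows into the target.

    Assumption: source and target share the same schema, so the wildcard
    forms UPDATE SET * and INSERT * can be used.
    """
    # Join condition on the (assumed) merge key columns.
    on_clause = " AND ".join(f"t.{col} = s.{col}" for col in key_columns)
    return (
        f"MERGE INTO {target_table} AS t\n"
        f"USING {source_table} AS s\n"
        f"ON {on_clause}\n"
        "WHEN MATCHED THEN UPDATE SET *\n"
        "WHEN NOT MATCHED THEN INSERT *"
    )


def run_merge(spark, target_table, source_table):
    """Run the merge with an existing SparkSession passed in by the caller."""
    spark.sql(build_merge_sql(target_table, source_table))


# Hypothetical usage; both table names are placeholders, not real tables:
# run_merge(spark, "some_db.target_table", "some_db.reconcile_events")
```

Keeping the SQL construction separate from the `spark.sql` call means the statement can be unit tested without a Spark session, and the Airflow task only needs to pass in the source and target table names.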