User Story
As a data engineer, I need to build an Airflow job to generate XML files from the data generated in this ticket, so that I can check whether the output of the new process matches that of the existing dump process.
Done is:
- Job runs on a daily schedule on Airflow (a minimal DAG sketch is included at the end of this description)
- Scope can be limited to a single smaller wiki to make testing easier
- Output of the new process matches the output of the existing dump process for that wiki
Out of scope:
- Publishing to dumps.wikimedia.org (once we have tested further and increased the number of wikis this runs for, we can publish)
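
A rough sketch of what the DAG could look like, assuming Airflow 2.x. The DAG id, the `generate_xml_dump` command, the output/comparison paths, and the choice of `simplewiki` as the test wiki are all placeholders to illustrate the shape of the job, not decisions made in this ticket:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# Placeholder: one smaller wiki to keep the test run cheap (per the Done criteria).
WIKI = "simplewiki"

with DAG(
    dag_id="xml_dump_parity_test",   # hypothetical DAG id
    schedule="@daily",               # Done criterion: daily schedule (Airflow >= 2.4; use schedule_interval on older 2.x)
    start_date=datetime(2024, 1, 1),
    catchup=False,
    tags=["dumps", "testing"],
) as dag:
    # Generate XML from the new process for the single test wiki.
    generate_xml = BashOperator(
        task_id="generate_xml",
        bash_command=(
            "generate_xml_dump --wiki {{ params.wiki }} "   # hypothetical CLI entry point
            "--output /tmp/new_dumps/{{ ds }}/"
        ),
        params={"wiki": WIKI},
    )

    # Compare against the output of the existing dump process for the same wiki.
    compare_outputs = BashOperator(
        task_id="compare_outputs",
        bash_command=(
            "diff -q /tmp/new_dumps/{{ ds }}/{{ params.wiki }}.xml "
            "/mnt/dumps/{{ params.wiki }}-latest-pages-articles.xml"   # placeholder path for the existing dump
        ),
        params={"wiki": WIKI},
    )

    generate_xml >> compare_outputs
```

The comparison step here is just a byte-level `diff` for illustration; if the two processes order pages or revisions differently, we may need a smarter comparison (e.g. normalizing or parsing the XML before diffing).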