In T335862 (Implement job to generate Dump XML files), we developed proof-of-concept (PoC) code that dumps wiki content as XML from the data lake table wmf_content.mediawiki_content_history.
We now want to get this code to production level with a set of tasks aimed at hardening, testing, and integrating this mechanism.
This work supports FY2025 Q1 [[ https://app.asana.com/1/3758245663860/project/1210776716741007/overview/1210776805319899 | SDS 1.2.1 ]]:
If we migrate the XML Dumps process from the current 'Dumps 1' infrastructure to a data pipeline that leverages the MediaWiki Content Pipelines, we will be able to guarantee SLOs and turn off the 'Dumps 1'-based XML export.