cx_corpora table: https://www.mediawiki.org/wiki/Extension:ContentTranslation/cx_corpora_table
To be loaded into the Data Lake at wmf_product.cx_corpora
- HQL/SQL queries (creation & load)
  - https://gitlab.wikimedia.org/repos/product-analytics/data-pipelines/-/blob/main/content_translation/cx_extension/create_cx_corpora_table.hql?ref_type=heads
  - https://gitlab.wikimedia.org/repos/product-analytics/data-pipelines/-/blob/main/content_translation/cx_extension/load_cx_corpora_table.sql?ref_type=heads
- Job repo script
  - https://gitlab.wikimedia.org/repos/product-analytics/content-translation-airflow-jobs/-/blob/main/content_translation_airflow_jobs/import_cx_corpora.py?ref_type=heads
- Airflow DAG
  - https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/blob/main/analytics_product/dags/content_translation/cx_corpora_monthly_dag.py