Migrate all mediawiki_history_load jobs to Airflow.
- All jobs live under refinery/oozie/mediawiki/history/load.
- They are responsible for repairing each of the MediaWiki tables (adding the missing partition metadata) after they have been imported from MariaDB by sqoop. In practice, they execute MSCK REPAIR TABLE <tablename> for each table.
- The jobs also create a success file named _PARTITIONED inside each table's partition directory, so that subsequent jobs know when the Hive partition metadata is complete.
- It should be possible to create a single Airflow DAG that iterates over a list of datasets and creates the corresponding tasks. For each dataset, that would likely be: 1) a sensor waiting for the sqooped data, 2) the repair-table step (SparkSQLOperator?), and 3) a URLTouchOperator that writes the _PARTITIONED flag. See the sketch below for the intended structure.
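
A minimal sketch of what such a DAG could look like. The URLSensor/URLTouchOperator import paths, the wmf_raw.mediawiki_* table names, the HDFS base path, the snapshot partition layout, and the _SUCCESS flag written by sqoop are all illustrative assumptions, not confirmed names.

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.apache.spark.operators.spark_sql import SparkSqlOperator

# Assumed WMF custom sensor/operator import paths (hypothetical).
from wmf_airflow_common.sensors.url import URLSensor
from wmf_airflow_common.operators.url import URLTouchOperator

# Partial, illustrative list of sqooped MediaWiki tables.
TABLES = ["archive", "logging", "page", "revision", "user"]
# Placeholder HDFS location of the raw sqooped data.
BASE_PATH = "hdfs:///wmf/data/raw/mediawiki/tables"

with DAG(
    dag_id="mediawiki_history_load",
    start_date=datetime(2022, 1, 1),
    schedule_interval="@monthly",
    catchup=False,
) as dag:
    for table in TABLES:
        # Partition layout (snapshot=...) is a placeholder assumption.
        partition_path = f"{BASE_PATH}/{table}/snapshot={{{{ ds }}}}"

        # 1) Wait for the sqoop import of this table to be complete.
        wait_for_import = URLSensor(
            task_id=f"wait_for_{table}_import",
            url=f"{partition_path}/_SUCCESS",  # flag name is an assumption
        )

        # 2) Add the missing Hive partition metadata for the new snapshot.
        repair_table = SparkSqlOperator(
            task_id=f"repair_{table}_table",
            sql=f"MSCK REPAIR TABLE wmf_raw.mediawiki_{table}",  # table name assumed
        )

        # 3) Write the _PARTITIONED flag so downstream jobs know the
        #    Hive partition metadata is complete.
        write_flag = URLTouchOperator(
            task_id=f"write_{table}_partitioned_flag",
            url=f"{partition_path}/_PARTITIONED",
        )

        wait_for_import >> repair_table >> write_flag
```

Whether the repair step uses SparkSqlOperator or a plain Hive/SQL operator is still an open question, as noted above; the loop structure would stay the same either way.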