Open question (probably for @mforns): should we use a custom Airflow operator (probably subclassing the Spark or SparkSQL operator), or a DAG factory?
There are 16 jobs to implement. Some parameters can be defaulted:
- HQL year, month, day, hour
- Cassandra host, port, user, password
Others need to be defined per job:
- Cassandra keyspace and table
- other HQL parameters
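
To make the split between defaulted and per-job parameters concrete, here is a rough sketch in plain Python, independent of whether we end up with an operator or a DAG factory. Everything in it is an assumption for illustration: the names (`CassandraLoadJob`, `build_all`, `job_a`, `job_b`), the host and credential placeholders, and the parameter keys. The time defaults are written as Airflow-style Jinja templates; the exact macro to use depends on the Airflow version.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable

# Hypothetical defaults shared by all jobs; values are placeholders only.
DEFAULT_CASSANDRA: Dict[str, object] = {
    "cassandra_host": "cassandra.example.org",
    "cassandra_port": 9042,
    "cassandra_user": "loader",
    "cassandra_password": "changeme",
}

# Assumed Airflow-style templates for the HQL time parameters.
DEFAULT_HQL_TIME: Dict[str, object] = {
    "year": "{{ execution_date.year }}",
    "month": "{{ execution_date.month }}",
    "day": "{{ execution_date.day }}",
    "hour": "{{ execution_date.hour }}",
}


@dataclass(frozen=True)
class CassandraLoadJob:
    """One of the jobs: only the non-defaultable parameters are required."""

    name: str
    keyspace: str
    table: str
    # Other HQL parameters; may also override a default such as "year".
    hql_params: Dict[str, object] = field(default_factory=dict)

    def parameters(self) -> Dict[str, object]:
        # Start from the shared defaults, then layer job-specific values on top.
        params: Dict[str, object] = {**DEFAULT_HQL_TIME, **DEFAULT_CASSANDRA}
        params.update(self.hql_params)
        params["cassandra_keyspace"] = self.keyspace
        params["cassandra_table"] = self.table
        return params


# Two made-up job definitions; the real list would have 16 entries.
JOBS = [
    CassandraLoadJob(
        name="job_a",
        keyspace="keyspace_a",
        table="data",
        hql_params={"source_table": "db.table_a"},
    ),
    CassandraLoadJob(
        name="job_b",
        keyspace="keyspace_b",
        table="data",
        hql_params={"source_table": "db.table_b", "year": "2024"},
    ),
]


def build_all(jobs: Iterable[CassandraLoadJob]) -> Dict[str, Dict[str, object]]:
    """Return the full parameter set for each job, keyed by job name."""
    return {job.name: job.parameters() for job in jobs}
```

With a DAG factory, something like `build_all(JOBS)` would feed one DAG (or task) per entry. With a custom operator, the defaults would instead live in the operator's constructor and each DAG would pass only the keyspace, table, and extra HQL parameters.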