Currently, eventutilities-python is configured through various environment variables and still has some hardcoded config values (e.g. Kafka brokers).
We should refactor this so that config parameters can be passed in. Some environment variables might be necessary, but the fewer we have the better.
This is relevant to the larger question of how we'd like to parameterize (py)Flink jobs in general. It would be nice to have a solution that lets us combine config files and CLI opts, with CLI opts overriding values from the file. In refinery-source, we have a Scala ConfigHelper that does just this. E.g.:
  # config.yaml
  my_opt_enabled: true
  stream_config_uri: https://meta.wikimedia.org/w/api.php

  # Load config from config.yaml, but override my_opt_enabled via CLI.
  my_flink_job.py --config_file=./config.yaml --my_opt_enabled=false
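For discussion, here is a rough Python sketch of the same file-plus-CLI-override behaviour. It is not how ConfigHelper or eventutilities-python works today. `load_config`, `parse_flat_yaml` and `str_to_bool` are hypothetical names, and the tiny `key: value` parser is only a stand-in for a real YAML library so the example stays stdlib-only.

```python
import argparse


def parse_flat_yaml(text):
    """Parse flat `key: value` lines. Stand-in for a real YAML parser."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Split on the first colon only, so URLs in values survive.
        key, _, value = line.partition(":")
        config[key.strip()] = value.strip()
    return config


def str_to_bool(value):
    """Interpret common truthy strings; everything else is False."""
    return str(value).strip().lower() in ("true", "1", "yes")


def load_config(argv):
    """Load options from --config_file, then apply CLI overrides."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--config_file")
    # Options default to None so we can tell "not given" from "given".
    parser.add_argument("--my_opt_enabled")
    parser.add_argument("--stream_config_uri")
    args = parser.parse_args(argv)

    config = {}
    if args.config_file:
        with open(args.config_file) as f:
            config.update(parse_flat_yaml(f.read()))

    # CLI values take precedence over values from the config file.
    for key in ("my_opt_enabled", "stream_config_uri"):
        value = getattr(args, key)
        if value is not None:
            config[key] = value

    if "my_opt_enabled" in config:
        config["my_opt_enabled"] = str_to_bool(config["my_opt_enabled"])
    return config


if __name__ == "__main__":
    import sys

    print(load_config(sys.argv[1:]))
```

With the config.yaml above, passing `--config_file=./config.yaml --my_opt_enabled=false` would keep `stream_config_uri` from the file and set `my_opt_enabled` to `False`.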
Done is:
- All external configs to stream_manager are parameterized. No hardcoded values.
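One possible shape for "no hardcoded values", sketched in Python purely as an illustration: collect the external settings in a config object whose fields have no defaults, so the caller must supply every value. `StreamManagerConfig` and its field names are hypothetical and not the current stream_manager interface. The broker address in the usage note is a placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamManagerConfig:
    """Hypothetical container for externally supplied settings."""

    # No defaults: omitting a field raises TypeError at construction.
    kafka_bootstrap_servers: str
    stream_config_uri: str

    def __post_init__(self):
        # Reject empty values so a blank setting fails fast.
        for name, value in vars(self).items():
            if not value:
                raise ValueError(f"missing required config: {name}")
```

For example, `StreamManagerConfig(kafka_bootstrap_servers="localhost:9092", stream_config_uri="https://meta.wikimedia.org/w/api.php")` succeeds, while leaving out either argument fails immediately instead of silently falling back to a built-in value.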