As seen in wmfdata.mariadb, the connection code relies on the analytics-mysql utility (source), which prints the host and port information that the code then parses.
It depends on that utility being available at the system level. One idea is to factor that functionality out of refinery into a separate small package that both refinery and wmfdata-python can use, but changing the deployment strategy for refinery in that case is not trivial.
Either way, wmfdata-python needs some way to get that information when it runs on the cluster as part of an Airflow data pipeline.
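
For reference, the lookup described above could be sketched roughly as follows. This is an illustration, not the actual wmfdata.mariadb code: the function names `parse_target` and `get_target` are hypothetical, and the `--print-target` flag and the `host:port` output format are assumptions about analytics-mysql that should be checked against its source. The availability check makes the system-level dependency explicit, so a pipeline fails with a clear message when the utility is missing.

```python
import shutil
import subprocess


def parse_target(output):
    """Split text of the form 'host:port' into a (host, port) tuple."""
    host, sep, port = output.strip().rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"unexpected target format: {output!r}")
    return host, int(port)


def get_target(db_name, command="analytics-mysql"):
    """Ask the system-level utility which host and port serve db_name."""
    # Fail early and clearly if the utility is not installed on this host.
    if shutil.which(command) is None:
        raise RuntimeError(f"{command} is not available on this host")
    # Assumption: the '--print-target' flag and its 'host:port' output.
    result = subprocess.run(
        [command, db_name, "--print-target"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_target(result.stdout)
```

Keeping the parsing separate from the subprocess call means the parsing can be tested on hosts where the utility is not installed, and the same parser could be reused if the host and port were later supplied by a different source.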