
[Airflow] Add DAG subfolder name to error email's subject
Closed, Resolved (Public)

Description

The names of the data-quality jobs don't contain data-quality, so error emails are not very explicit, e.g. TaskInstance: mobile_os_distribution.
It'd be great if the names mentioned data-quality.

Event Timeline

@JAllemandou I changed the name from data quality to anomaly detection, because this job family (covered by the AnomalyDetection DAG factory) includes things like the traffic anomaly checks, which are not data quality per se.
Now, I understand your point: in general, the name of the folder that groups a set of DAGs is not included in the error email subject, which only includes the DAG_ID. I will update the title of this task.
What I have done so far is add that folder name as a tag, which can be used on Airflow's home page to filter the DAGs. You can also filter by DAG_ID (even partial strings match), but I thought tags would be clearer and would keep the DAG_IDs from getting very long.

We can maybe use a custom email modifier to add tags to the email subject, see:
https://stackoverflow.com/questions/51726248/airflow-dag-customized-email-on-any-of-the-task-failure
We could use the same one for all our Airflow tasks.
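As a rough illustration of the tags-in-subject idea, here is a minimal sketch of a subject builder. It is plain Python with no Airflow dependency; the function name `build_failure_subject`, the subject layout, and the task ID `detect_anomalies` are all assumptions made up for this example, not something that exists in our codebase. In practice it would presumably be called from a task's `on_failure_callback` with values taken from the callback context.

```python
from typing import Iterable


def build_failure_subject(dag_id: str, task_id: str, tags: Iterable[str] = ()) -> str:
    """Build an error email subject that carries the DAG's tags.

    Hypothetical helper: the name and subject layout are illustrative only.
    """
    # Sort the tags so the subject is stable regardless of declaration order.
    tag_part = "".join(f"[{tag}]" for tag in sorted(tags))
    return f"[Airflow]{tag_part} Failed: {dag_id}.{task_id}"


# Hypothetical task ID, used only to show the resulting subject.
subject = build_failure_subject(
    "mobile_os_distribution", "detect_anomalies", ["anomaly_detection"]
)
print(subject)
```

Every tag adds a bracketed segment, so the subject grows with the number of tags on the DAG.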

Or else, we could just have the DAG_IDs all have the folder prefix, as you suggest.

mforns renamed this task from Add data-quality to airflow DAGs' name to [Airflow] Add DAG subfolder name to error email's subject. Feb 11 2022, 2:59 PM

Actually, having all tags in the error email subject might not be a good idea...
It would tie us to using just 1 tag per DAG; otherwise the email subject would get too long.
And we probably want the freedom to add more than 1 tag per DAG.
Sooo, I think prefixing is the right thing!
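
For what it's worth, a minimal sketch of what prefixing could look like, assuming the prefix is simply the name of the first subfolder under the DAGs root. The helper name `prefixed_dag_id` and the `/srv/dags` paths are hypothetical and only serve the example.

```python
from pathlib import PurePosixPath


def prefixed_dag_id(dag_file: str, dags_root: str, dag_id: str) -> str:
    """Prefix dag_id with the name of the first subfolder under dags_root.

    Hypothetical helper: the name and the path layout are assumptions.
    """
    relative = PurePosixPath(dag_file).relative_to(dags_root)
    # A file directly under dags_root has no subfolder, so keep the ID as is.
    if len(relative.parts) < 2:
        return dag_id
    prefix = relative.parts[0]
    # Avoid double-prefixing IDs that already carry the folder name.
    if dag_id.startswith(f"{prefix}_"):
        return dag_id
    return f"{prefix}_{dag_id}"


print(
    prefixed_dag_id(
        "/srv/dags/anomaly_detection/mobile_os_distribution_dag.py",
        "/srv/dags",
        "mobile_os_distribution",
    )
)
```

Since the error email already includes the DAG_ID, the folder name would then show up in the subject without any custom email code.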

Thanks a lot for looking into this @mforns :)
Prefixing is not "pretty", but it's a low-tech win :)

JArguello-WMF claimed this task.