Airflow scheduler and webserver logs should be readable by airflow instance admins
Open, Low, Public

Description

Right now, airflow process logs are only available through journalctl. We should:

  1. make these readable on the servers by admins. Either:

    1a. Put these in rotated files using airflow logging settings or rsyslog

    1b. add airflow admins to 'systemd-journal' posix group to allow them to use journalctl
  2. send airflow process logs to logstash

We should definitely do item 1. Item 2 would be nice to have, but may not be necessary.

Note that this is about airflow process logs, not DAG/job logs, which are already available on disk and in the Airflow webserver UI.
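For option 1a, a minimal sketch of what a rotated-file logging configuration could look like, using only the Python standard library. The function name `build_rotating_log_config`, the file name `airflow-process.log`, and the size and backup-count values are all hypothetical. Airflow's `logging_config_class` setting can point at a custom logging dict, but whether that alone captures everything the scheduler and webserver currently write to the journal is not confirmed here.

```python
import logging
import logging.config
import os
import tempfile


def build_rotating_log_config(log_dir, max_bytes=10 * 1024 * 1024, backup_count=5):
    """Return a logging dictConfig that writes to a size-rotated file.

    All names and defaults here are illustrative assumptions, not
    the values used on the airflow instances.
    """
    return {
        "version": 1,
        "disable_existing_loggers": False,
        "formatters": {
            "plain": {"format": "%(asctime)s %(levelname)s %(name)s %(message)s"},
        },
        "handlers": {
            "rotating_file": {
                # Rolls the file over once it reaches max_bytes,
                # keeping at most backup_count old files.
                "class": "logging.handlers.RotatingFileHandler",
                "filename": os.path.join(log_dir, "airflow-process.log"),
                "maxBytes": max_bytes,
                "backupCount": backup_count,
                "formatter": "plain",
            },
        },
        "root": {"handlers": ["rotating_file"], "level": "INFO"},
    }


if __name__ == "__main__":
    # Demo against a throwaway directory instead of a real log path.
    demo_dir = tempfile.mkdtemp()
    logging.config.dictConfig(build_rotating_log_config(demo_dir))
    logging.getLogger("scheduler").info("scheduler heartbeat")
    logging.shutdown()
    print(os.listdir(demo_dir))
```

With files like this on disk, read access could be granted through ordinary file ownership and group permissions rather than through journal access. Rotation via rsyslog or logrotate would be an alternative that needs no change to the airflow configuration.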

Event Timeline

Ottomata added a project: Data-Engineering.
Ottomata edited subscribers, added: EChetty; removed: emil.
Gehel triaged this task as Low priority. Oct 18 2023, 8:49 AM
Gehel moved this task from Incoming to Ready for Work on the Data-Platform-SRE board.
Antoine_Quhen edited subscribers, added: Antoine_Quhen; removed: aquaerti.

Is this ticket still needed? I see that admin groups such as analytics-platform-eng-admins can run sudo journalctl -u airflow-* (https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/production/modules/admin/data/data.yaml#1034)

Same goes for the following groups:

  • analytics-research-admins
  • airflow-search-admins
  • airflow-analytics-product-admins
  • airflow-wmde-admins
Gehel moved this task from Ready for Work to In Progress on the Data-Platform-SRE board.
Gehel subscribed.

@brouberol things might have changed since this ticket was created, and this might already have been implemented. There is also the open point about sending the logs to logstash that you could check.

brouberol updated Other Assignee, removed: brouberol.