This has failed twice so far, on May 21 and 22, 2019.
Per the email alerts, the command run is:
export PYTHONPATH=${PYTHONPATH}:/srv/deployment/analytics/refinery/python && /srv/deployment/analytics/refinery/bin/refinery-drop-hive-partitions -d 90 -D discovery -t query_clicks_hourly,query_clicks_daily >> /var/log/refinery/drop-query-clicks.log
The error message is:
/bin/sh: 1: cannot create /var/log/refinery/drop-query-clicks.log: Permission denied
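The message comes from /bin/sh failing to open the redirect target, so refinery-drop-hive-partitions is most likely never started. A minimal sketch for checking this on the host, run as the user that owns the cron job, follows. The can_append helper is hypothetical and not part of refinery; it only inspects permissions and does not create or modify the log file.

```shell
# Hypothetical helper (not part of refinery): succeed if the current user
# could append to the given path, without creating or touching the file.
can_append() {
    log_path=$1
    if [ -e "$log_path" ]; then
        # The file exists: it must be a regular, writable file.
        [ -f "$log_path" ] && [ -w "$log_path" ]
    else
        # The file does not exist: the shell would have to create it,
        # which needs write and search permission on the parent directory.
        log_dir=$(dirname -- "$log_path")
        [ -d "$log_dir" ] && [ -w "$log_dir" ] && [ -x "$log_dir" ]
    fi
}

if can_append /var/log/refinery/drop-query-clicks.log; then
    echo "log path is writable by $(id -un)"
else
    echo "log path is NOT writable by $(id -un)"
fi
```

If the check fails, comparing the ownership of /var/log/refinery with the user the cron job runs as would be the next thing to look at.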
The cron job comes from puppet, in modules/profile/manifests/analytics/refinery/job/data_purge.pp
Looking at that file, I see there is one other user of refinery-drop-hive-partitions (and one call that uses ensure => absent). The structure is a bit different, though: cirrus uses cron, whereas the other uses profile::analytics::systemd_timer. Additionally, the other passes a -f argument with a log file, rather than redirecting stdout.
Does the cirrus job that ensures we delete data in a timely manner need to be updated to some new standard for running recurring tasks in the analytics cluster, and perhaps be aligned with the other job?
The related changes to the other user (mediawiki-raw-cu-changes-drop-month) are:
I97446c4f702a1ffa0db8ee07e4e0fb80cf3fe2ec : Use standard logging approach similar to sqoop job
I8528b190774922165971aee9a1fbc0a3e0fdc953 : profile::analytics::refinery::job::data_purge: move crons to timers
It is also possible that the changes above are unrelated to this exact failure.