
Apt-staging: fix logging
Closed, Resolved, Public

Description

Follow-up from T409253: Continuous breakages of apt-staging

The logging is far too noisy: the timer fires every five minutes, and a clean run logs three lines that carry no meaningful information. I think it would be best if a run that didn't import anything simply logged nothing at all. Whether the timer ran is tracked by systemd anyway.
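One way the "log nothing on a clean run" behaviour could look is sketched below. This is not the actual script: the logger name and the `run_summary` and `report_run` helpers are hypothetical, and the sketch assumes the script is Python and already knows which packages it imported.

```python
import logging
from typing import Optional

# Hypothetical logger name, chosen for illustration only.
logger = logging.getLogger("gitlab_package_puller")


def run_summary(imported: list[str]) -> Optional[str]:
    """Return a one-line summary, or None when the run imported nothing."""
    if not imported:
        return None
    return f"imported {len(imported)} package(s): {', '.join(imported)}"


def report_run(imported: list[str]) -> None:
    """Log at info level only when the run actually did something."""
    summary = run_summary(imported)
    if summary is None:
        # Clean run: only visible when debug logging is enabled,
        # since systemd already records that the timer fired.
        logger.debug("nothing to import")
    else:
        logger.info(summary)
```

With the logger at info level, a run that imports nothing leaves the journal empty, while the debug line stays available for troubleshooting.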

Event Timeline

Change #1205162 had a related patch set uploaded (by Arnaudb; author: Arnaudb):

[operations/puppet@production] apt-staging: logging and metrics

https://gerrit.wikimedia.org/r/1205162

ABran-WMF changed the task status from Open to In Progress. Nov 18 2025, 7:47 AM

The updated version of the log output is a bit more verbose at info level:
{P85360}

I've also added a --dry-run flag to simplify debugging and testing.
With this flag, metrics are dumped to stdout, as visible in the example output above ↑
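For readers who have not opened the patch, here is a minimal sketch of how such a flag might be wired up. Only the `--dry-run` name comes from this task; `parse_args`, `emit_metrics` and their parameters are hypothetical, and the real script may be structured differently.

```python
import argparse
import sys


def parse_args(argv=None):
    """Parse the command line; only --dry-run is taken from the task."""
    parser = argparse.ArgumentParser(description="illustrative sketch")
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="print metrics to stdout instead of writing the metrics file",
    )
    return parser.parse_args(argv)


def emit_metrics(text, dry_run, path, out=sys.stdout):
    """Send rendered metrics to stdout on a dry run, else to `path`."""
    if dry_run:
        out.write(text)
        return
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(text)
```

Keeping the destination choice in one function means the rest of the run can stay identical between a dry run and a real one.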

Change #1205162 merged by Arnaudb:

[operations/puppet@production] apt-staging: logging and metrics

https://gerrit.wikimedia.org/r/1205162

We now have a bit more observability via metrics and logging:

root@apt-staging2001:node.d $ cat gitlab_package_puller.prom 
# HELP gitlab_package_puller_jobs_considered Number of CI jobs inspected in the last run
# TYPE gitlab_package_puller_jobs_considered gauge
gitlab_package_puller_jobs_considered 2635
# HELP gitlab_package_puller_jobs_downloaded Number of CI jobs whose artifacts were downloaded in the last run
# TYPE gitlab_package_puller_jobs_downloaded gauge
gitlab_package_puller_jobs_downloaded 8
# HELP gitlab_package_puller_run_success Whether the last run completed without unhandled exceptions (1=success, 0=failure)
# TYPE gitlab_package_puller_run_success gauge
gitlab_package_puller_run_success 1
# HELP gitlab_package_puller_last_run_timestamp_seconds Unix timestamp of the end of the last run
# TYPE gitlab_package_puller_last_run_timestamp_seconds gauge
gitlab_package_puller_last_run_timestamp_seconds 1764055480
# HELP gitlab_package_puller_jobs_download_failed Number of CI jobs whose artifacts failed to download in the last run
# TYPE gitlab_package_puller_jobs_download_failed gauge
gitlab_package_puller_jobs_download_failed 0
# HELP gitlab_package_puller_jobs_extract_failed Number of CI jobs whose artifacts failed to extract in the last run
# TYPE gitlab_package_puller_jobs_extract_failed gauge
gitlab_package_puller_jobs_extract_failed 0
# HELP gitlab_package_puller_jobs_move_failed CI jobs whose artifacts failed to move to destination dir in the last run
# TYPE gitlab_package_puller_jobs_move_failed gauge
gitlab_package_puller_jobs_move_failed 0
# HELP gitlab_package_puller_jobs_import_failed CI jobs whose packages failed to import into apt-staging in the last run
# TYPE gitlab_package_puller_jobs_import_failed gauge
gitlab_package_puller_jobs_import_failed 0
# HELP gitlab_package_puller_reprepro_notify_failed Number of times a reprepro failure notification was (or would have been) triggered in the last run
# TYPE gitlab_package_puller_reprepro_notify_failed gauge
gitlab_package_puller_reprepro_notify_failed 0
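The file above follows the Prometheus text exposition format and is presumably picked up by the node_exporter textfile collector. As a rough sketch of how a script could produce such a file (the helper names are hypothetical, and the 0644 mode is an assumption about what the collector needs to read it), the gauges can be rendered to text and then moved into place atomically so that a scrape never sees a half-written file:

```python
import os
import tempfile


def render_gauges(gauges: dict[str, tuple[str, float]]) -> str:
    """Render {name: (help text, value)} as Prometheus gauge lines."""
    lines = []
    for name, (help_text, value) in gauges.items():
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


def write_atomically(path: str, text: str) -> None:
    """Write to a temp file in the same directory, then rename over `path`."""
    directory = os.path.dirname(path) or "."
    # The .tmp suffix keeps the partial file out of the collector's *.prom glob.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            fh.write(text)
        os.chmod(tmp_path, 0o644)  # assumption: the exporter needs read access
        os.replace(tmp_path, path)  # atomic rename within one filesystem
    except Exception:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

A call such as `write_atomically(path, render_gauges({"gitlab_package_puller_run_success": ("Whether the last run completed without unhandled exceptions (1=success, 0=failure)", 1)}))` would then replace the previous file in a single step.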

I've bumped the script's log level to info, so the journal gives a bit more information by default. To follow it: journalctl -fln50 -u gitlab-package-puller.service --since "20 min ago"