toolforge-jobs emails not working
Closed, Resolved | Public | BUG REPORT

Description

On 2022-09-16 between 06:30 and 14:30, the jjmc89-bot tool had multiple pod failures according to Grafana; however, no emails were received. All jobs have emails: onfailure set.

Event Timeline

There were more on 2022-09-28 between 11:39 and 12:20 that I did not receive emails for. (Grafana)

JJMC89 renamed this task from "No emails on pod failure" to "toolforge-jobs emails not working". Oct 7 2022, 7:02 PM
JJMC89 added a subscriber: PeterBowman.

None of the emails are working per a report on IRC.

<PeterBowman> hello, are email notifications in toolforge-jobs actually working? I can successfully send messages to myself via tools.tool-name@tools.wmflabs.org, which I think is the address being used by the framework under the hood, but the --emails option has no effect for me

Here are my Grafana logs. I didn't receive any messages for either successful or failing jobs, regardless of the selected value for --emails (onfailure/onfinish/all).

There is something wrong with the emailer daemon:

[..]
2022-08-20 05:00:33 INFO: 1 new pending emails in the queue, new total queue size: 1
2022-08-20 05:00:40 INFO: Sending email FROM: noreply@toolforge.org TO: tools.arkivbot@tools.wmflabs.org via mail.tools.wmflabs.org:25
2022-08-20 05:07:17 INFO: 1 new pending emails in the queue, new total queue size: 1
2022-08-20 05:07:17 INFO: Sending email FROM: noreply@toolforge.org TO: tools.arkivbot@tools.wmflabs.org via mail.tools.wmflabs.org:25
2022-08-20 06:43:01 INFO: 1 new pending emails in the queue, new total queue size: 1
2022-08-20 06:43:01 INFO: Sending email FROM: noreply@toolforge.org TO: tools.earwigbot@tools.wmflabs.org via mail.tools.wmflabs.org:25

Judging by the last entries in its log (2022-08-20), it has apparently not been sending emails for a couple of months.
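For anyone who wants to repeat this check, here is a minimal sketch of how the newest "Sending email" entry could be pulled out of such a log. It assumes the line format matches the excerpt above exactly; the names `LOG_LINE`, `last_send` and `SAMPLE` are made up for illustration and are not part of the emailer itself.

```python
import re
from datetime import datetime

# Pattern for the "Sending email" lines as they appear in the excerpt above.
# Assumption: the real log uses exactly this layout on every such line.
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) INFO: Sending email "
    r"FROM: (?P<sender>\S+) TO: (?P<recipient>\S+) via (?P<relay>\S+)$"
)


def last_send(log_text):
    """Return (timestamp, recipient) of the newest send line, or None."""
    latest = None
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # queue-size lines, elisions, blank lines
        ts = datetime.strptime(match["ts"], "%Y-%m-%d %H:%M:%S")
        if latest is None or ts > latest[0]:
            latest = (ts, match["recipient"])
    return latest


# Two lines copied from the log excerpt in this comment.
SAMPLE = """\
2022-08-20 06:43:01 INFO: 1 new pending emails in the queue, new total queue size: 1
2022-08-20 06:43:01 INFO: Sending email FROM: noreply@toolforge.org TO: tools.earwigbot@tools.wmflabs.org via mail.tools.wmflabs.org:25
"""

if __name__ == "__main__":
    print(last_send(SAMPLE))
```

Comparing the returned timestamp with the current date gives a rough idea of how long the daemon has been silent. It does not say why it stopped.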

Mentioned in SAL (#wikimedia-cloud) [2022-10-10T11:35:36Z] <arturo> aborrero@tools-k8s-control-1:~$ sudo -i kubectl -n jobs-emailer rollout restart deployment/jobs-emailer (T317998)

aborrero claimed this task.
aborrero triaged this task as Low priority.

The emailer component seems to be happy again. Please reopen if required. We will track improvements on subtask T320405: toolforge jobs-framework-emailer: increase reliability.