
Mail to root@lists1001.wikimedia.org from noreply@lists1001.wikimedia.org doesn't work
Closed, Resolved (Public)

Description

2021-04-21 00:31:36 H=localhost (lists1001.wikimedia.org) [::1]:50190 I=[::1]:25 F=<noreply@lists1001.wikimedia.org> temporarily rejected RCPT <root@lists1001.wikimedia.org>: failed to open /etc/aliases for linear search: Permission denied (euid=114 egid=121)
legoktm@lists1001$ ls -l /etc/aliases
-rw------- 1 root root 198 Feb 20  2020 /etc/aliases

Why is it only readable by root? On lists1002 it's world readable, but on e.g. mx1001 it's only readable by root again.
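A minimal sketch for comparing the permission bits across hosts, assuming the exim process (euid=114 in the log above) is neither root nor a member of the file's group, so it depends on the "other" read bit. The helper names are made up for illustration.

```python
import os
import stat


def mode_is_world_readable(mode):
    """Return True if the 'other' read bit is set in a st_mode value."""
    return bool(mode & stat.S_IROTH)


def describe(path):
    """Return (ls-style mode string, world-readable flag) for a path."""
    mode = os.stat(path).st_mode
    return stat.filemode(mode), mode_is_world_readable(mode)


if __name__ == "__main__":
    # /etc/aliases is the file from the log line; any path works here.
    print(describe("/etc/aliases"))
```

With mode 600 the check is False, which matches the "Permission denied" in the exim log; after a chmod 644 it would be True.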

Event Timeline

Legoktm created this task.

Mentioned in SAL (#wikimedia-operations) [2021-04-21T15:54:21Z] <legoktm> T280744: legoktm@lists1001:~$ sudo chmod 644 /etc/aliases

Now we're at:

2021-04-21 15:56:55 H=localhost (lists1001.wikimedia.org) [::1]:57120 I=[::1]:25 sender verify fail for <noreply@lists1001.wikimedia.org>: Mailing list noreply does not exist.
2021-04-21 15:56:55 H=localhost (lists1001.wikimedia.org) [::1]:57120 I=[::1]:25 F=<noreply@lists1001.wikimedia.org> rejected RCPT <root@lists1001.wikimedia.org>: Sender verify failed

From what I can see, this means the router rule was not matched(!). The router rule is exactly the same as in the exim4 smarthost config, so I'm checking further.

My test mails went through. Maybe exim4 needed a restart?

Also, /etc/aliases on lists1002 has many more entries than on lists1001:

# HEADER: This file was autogenerated at 2021-03-30 21:51:17 +0000
# HEADER: by puppet.  While it can still be managed manually, it
# HEADER: is definitely not recommended.
# /etc/aliases
mailer-daemon: postmaster
postmaster: root
nobody: root
hostmaster: root
usenet: root
news: root
webmaster: root
www: root
ftp: root
abuse: root
noc: root
security: root
root: root@wikimedia.org

vs.

# HEADER: This file was autogenerated at 2020-02-20 17:22:36 +0000
# HEADER: by puppet.  While it can still be managed manually, it
# HEADER: is definitely not recommended.
root: root@wikimedia.org
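To make the difference between the two files explicit, here is a rough sketch that parses alias text and lists the names present on one host but missing on the other. It assumes the simple one-line "name: target" form shown above and ignores continuation lines and other syntax a real aliases file may use; the function names are hypothetical.

```python
def parse_aliases(text):
    """Parse simple 'name: target' lines, skipping blanks and # comments."""
    aliases = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, target = line.partition(":")
        if sep:  # ignore lines without a colon
            aliases[name.strip()] = target.strip()
    return aliases


def missing_aliases(reference, other):
    """Return alias names defined in reference but not in other, sorted."""
    return sorted(set(reference) - set(other))


if __name__ == "__main__":
    lists1002 = parse_aliases(
        "# /etc/aliases\npostmaster: root\nabuse: root\nroot: root@wikimedia.org\n"
    )
    lists1001 = parse_aliases("root: root@wikimedia.org\n")
    print(missing_aliases(lists1002, lists1001))
```

Both files map root to root@wikimedia.org, so the extra entries on lists1002 should not matter for mail addressed to root.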

I don't have enough rights to debug this further :'(

Two moving parts here in the exim config, comparing lists1001's config to other hosts where root mail does work.

#1 is that lists1001's config (via exim4.conf.mailman.erb) says "require verify = sender" and other hosts don't.

We could delete that and skip the sender verification step, but we may not want to skip it for list traffic, only for these local mails.

And #2 is that other hosts (via exim4.minimal.production.erb) say this, and lists1001 doesn't:

begin rewrite

# Rewrite the envelope From for mails from internal servers in *.wmnet,
# as they are usually rejected by sender domain address verification.

*@$primary_hostname     root@wikimedia.org      F

As @Legoktm pointed out, we don't care about the *.wmnet part, because the lists host's FQDN (and thus $primary_hostname here) is under wikimedia.org. I'm not sure, though, whether we'll still need the full rewrite to root@wikimedia.org in order to get around the "Mailing list noreply does not exist." error.

My best guess is to try addressing #1 first, ideally with an extra statement that skips out on sender verification for local traffic. Not touching it right now because it's last thing on a Friday and I've got the post-vaccine brain fog, but @Legoktm or @Ladsgroup feel free to send a patch if you get there before I do.

Legoktm renamed this task from "Mail to root@lists1001.wikimedia.org doesn't work because of /etc/aliases file permissions" to "Mail to root@lists1001.wikimedia.org from noreply@lists1001.wikimedia.org doesn't work". May 7 2021, 12:44 AM

I copied the code out of systemd-timer-mail-wrapper and ran it interactively to see if I could get it to work.

>>> from email.message import EmailMessage
>>> from socket import getfqdn
>>> getfqdn()
'lists1001.wikimedia.org'
>>> import smtplib
>>> msg=EmailMessage()
>>> msg['From'] = 'TEST <noreply@{}>'.format(getfqdn())
>>> msg['To'] = 'root@lists1001.wikimedia.org'
>>> msg['Subject'] = 'test please ignore'
>>> msg.set_content('please ignore')
>>> print(msg.as_bytes())
b'From: TEST <noreply@lists1001.wikimedia.org>\nTo: root@lists1001.wikimedia.org\nSubject: test please ignore\nContent-Type: text/plain; charset="utf-8"\nContent-Transfer-Encoding: 7bit\nMIME-Version: 1.0\n\nplease ignore\n'
>>> smtp = smtplib.SMTP('localhost')
>>> smtp.send_message(msg)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.7/smtplib.py", line 967, in send_message
    rcpt_options)
  File "/usr/lib/python3.7/smtplib.py", line 881, in sendmail
    raise SMTPRecipientsRefused(senderrs)
smtplib.SMTPRecipientsRefused: {'root@lists1001.wikimedia.org': (550, b'Verification failed for <noreply@lists1001.wikimedia.org>\nMailing list noreply does not exist.\nSender verify failed')}

There is no such list as noreply, duh, but the error message mentioned "Sender", so I set a valid "Sender" header:

>>> msg['Sender'] = 'root@lists1001.wikimedia.org'
>>> smtp.send_message(msg)
{}

and then it worked!

The mailman exim config is stricter than the rest of the cluster, but refusing to deliver emails on behalf of non-existent lists seems pretty sensible. I'll submit a patch adding the Sender header.
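A likely explanation for why the header made a difference: when send_message() is called without an explicit from_addr, smtplib takes the envelope sender (MAIL FROM) from the Sender header if one is present and falls back to From otherwise. Exim's "verify = sender" checks that envelope address, so adding Sender probably changed what was being verified from noreply@ to root@. The sketch below mirrors that selection logic for illustration; envelope_sender is a made-up helper, not part of smtplib.

```python
from email.message import EmailMessage
from email.utils import getaddresses


def envelope_sender(msg):
    """Pick the address smtplib.send_message() would use as MAIL FROM
    when from_addr is not given: Sender if present, else From."""
    header = msg["Sender"] if "Sender" in msg else msg["From"]
    return getaddresses([header])[0][1]


if __name__ == "__main__":
    msg = EmailMessage()
    msg["From"] = "TEST <noreply@lists1001.wikimedia.org>"
    msg["To"] = "root@lists1001.wikimedia.org"
    print(envelope_sender(msg))  # address taken from From

    msg["Sender"] = "root@lists1001.wikimedia.org"
    print(envelope_sender(msg))  # address taken from Sender
```

If that reading is right, passing from_addr explicitly to send_message() would be another way to control the envelope sender without adding a header.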

Change 685980 had a related patch set uploaded (by Legoktm; author: Legoktm):

[operations/puppet@production] systemd-timer-mail-wrapper: Set "Sender" header on emails

https://gerrit.wikimedia.org/r/685980

Change 685980 merged by Jbond:

[operations/puppet@production] systemd-timer-mail-wrapper: Set "Sender" header on emails

https://gerrit.wikimedia.org/r/685980

Legoktm claimed this task.

Triggered the unit and it successfully sent the email, yay!