
OTRS receiving flood of emails
Closed, Resolved · Public

Description

Noticed in info-en-c; other queues are reporting it as well.

Event Timeline

Krenair triaged this task as Unbreak Now! priority. Jan 24 2019, 5:10 PM

Began around 16:30; @akosiaris is looking into it.

Mentioned in SAL (#wikimedia-operations) [2019-01-24T17:35:03Z] <akosiaris> freeze all current info@wikipedia.org emails on mx1001, mx2001 T214604

Mentioned in SAL (#wikimedia-operations) [2019-01-24T17:44:44Z] <akosiaris> block specific IPv4, IPv6 address on mx1001, mx2001 T214604

Mentioned in SAL (#wikimedia-operations) [2019-01-24T17:50:12Z] <akosiaris> restart exim on mendelevium T214604

A quick look at the emails shows it's a pretty standard "spoofing our email as a From address and Reply-To address" spam run. This happens about once every six months to a year, but this is by far the biggest I've seen.

The OTRS "probably spam" queue is totally flooded by SMTP failed-delivery messages.
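To illustrate what "SMTP failed delivery messages" look like to a filter, here is a minimal sketch in Python that flags a raw message as a likely bounce. It is not the check used on the mail hosts or in OTRS. The function name `looks_like_bounce` is hypothetical, and the three heuristics (null return path, RFC 3464 `multipart/report`, `MAILER-DAEMON` local part) are general conventions assumed for illustration.

```python
from email import message_from_string
from email.utils import parseaddr


def looks_like_bounce(raw_message: str) -> bool:
    """Heuristically decide whether a raw message is a bounce (hypothetical helper)."""
    msg = message_from_string(raw_message)

    # Bounces are normally sent with a null envelope sender, recorded as "<>".
    if msg.get("Return-Path", "").strip() == "<>":
        return True

    # RFC 3464 delivery status notifications use multipart/report.
    report_type = str(msg.get_param("report-type", "")).lower()
    if msg.get_content_type() == "multipart/report" and report_type == "delivery-status":
        return True

    # Fall back to the conventional MAILER-DAEMON local part in the From header.
    _, address = parseaddr(msg.get("From", ""))
    local_part = address.partition("@")[0].lower()
    return local_part == "mailer-daemon"
```

Because the spam spoofs our address as the sender, the bounces themselves come from legitimate mail servers, so a check like this identifies the backscatter but says nothing about where the original spam came from.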

Aklapper renamed this task from OTRS recieving flood of emails into info-en-c to OTRS receiving flood of emails into info-en-c. Jan 24 2019, 6:46 PM

It's not only info-en-c; other queues are receiving this as well. Many messages land immediately in the "probably spam" queue, and others are apparently being moved there manually from other queues (no idea why; people should be moving them directly to Junk instead of contributing to the flood).

Mentioned in SAL (#wikimedia-operations) [2019-01-24T19:23:29Z] <akosiaris> delete 5076 tickets from OTRS with customerID MAILER-DAEMON@ubuntu.member.linode.com T214604

The email storm can be seen at https://grafana.wikimedia.org/d/000000451/mail?orgId=1&from=1548346803405&to=1548357438520&var-datasource=codfw%20prometheus%2Fops (this is the secondary DC) and https://grafana.wikimedia.org/d/000000451/mail?orgId=1&from=1548346740714&to=1548358041101&var-datasource=eqiad%20prometheus%2Fops (for the primary DC). The two distinct phases appear because I froze a large number of emails, which were later thawed and eventually delivered.

Mentioned in SAL (#wikimedia-operations) [2019-01-24T19:32:48Z] <akosiaris> delete 5076 tickets from OTRS with customerID Mailer-Daemon@wizengo.ds.planet-work.net T214604

Mentioned in SAL (#wikimedia-operations) [2019-01-24T19:33:09Z] <akosiaris> delete 8505 tickets from OTRS with customerID Mailer-Daemon@wizengo.ds.planet-work.net T214604 - correction

akosiaris lowered the priority of this task from Unbreak Now! to Low. Jan 24 2019, 7:38 PM

info-en-c seems to be down to 167 messages now, and the hosts participating in the storm remain blocked. I'll lower the priority for now.

On the OTRS interface I set up the filter `00 spam-20190124 smtp-sortant2.phpnet.org`.
The remaining tickets have been moved to Junk.

Again, lots of messages are coming in (noticed in the "probably spam" OTRS queue).

It seems under control to me. There are spam emails from different sources in "Probably-spam", which is normal, and they have to be moved to "Junk" if they are really spam mails.

There are a few emails from MAILER-DAEMON@smtp-sortant10.phpnet.org, and I am setting up a new filter for them, just in case. Thanks.
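Since a second numbered phpnet.org relay has now appeared, one option is a pattern that covers the whole relay family rather than one filter per host. The sketch below is plain Python, not OTRS filter syntax. The names `RELAY_PATTERN`, `matches_relay_filter` and `top_bounce_hosts` are hypothetical, and the assumption that every such relay is named `smtp-sortant<N>.phpnet.org` comes only from the two hostnames seen in this task.

```python
import re
from collections import Counter

# Hypothetical pattern: assumes relays are named smtp-sortant<N>.phpnet.org.
RELAY_PATTERN = re.compile(
    r"^mailer-daemon@smtp-sortant\d+\.phpnet\.org$", re.IGNORECASE
)


def matches_relay_filter(sender: str) -> bool:
    """Return True if the sender is a bounce address on one of the numbered relays."""
    return RELAY_PATTERN.match(sender.strip()) is not None


def top_bounce_hosts(senders, limit=5):
    """Count MAILER-DAEMON senders per host and return the most frequent hosts."""
    hosts = Counter()
    for sender in senders:
        local, _, host = sender.strip().lower().rpartition("@")
        if local == "mailer-daemon" and host:
            hosts[host] += 1
    return hosts.most_common(limit)
```

A broad pattern like this would also catch legitimate bounces from the same provider, so it suits a temporary spam filter during the storm better than a permanent rule. Counting senders per host, as `top_bounce_hosts` does, is one way to spot the next host worth filtering.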

Krenair renamed this task from OTRS receiving flood of emails into info-en-c to OTRS receiving flood of emails. Jan 25 2019, 2:13 PM
Krenair updated the task description.

Junk is up to 19021 and rising fast. At least they're not going into proper queues now.

Cleaned up some 10k emails from two more hosts with the same pattern as yesterday and blocked them as well.

Graphs for codfw mail[1] and eqiad mail[2] show that this behavior has not recurred since Jan 25, so I'll tentatively close this as resolved. Feel free to reopen.

[1] https://grafana.wikimedia.org/d/000000451/mail?orgId=1&from=1548401559794&to=now&var-datasource=codfw%20prometheus%2Fops
[2] https://grafana.wikimedia.org/d/000000451/mail?orgId=1&from=1548401599018&to=now&var-datasource=eqiad%20prometheus%2Fops