[02:20:56] <MatmaRex> i stopped receiving any email from gerrit and phabricator a couple hours ago. is it just me?
[02:21:16] <MatmaRex> e.g. i have no emails for any of the comments on https://phabricator.wikimedia.org/T196588
[02:22:01] <MatmaRex> the last one i received was at 19:12 UTC
https://grafana.wikimedia.org/dashboard/db/mail?refresh=5m&orgId=1&from=now-2d&to=now suggests all mail is down, but I'm still getting mailman emails...
https://tools.wmflabs.org/sal/log/AWPWdA2jwY2u4JUTWWH1 is probably related.
Mail seems to be generally flowing through codfw though... https://grafana.wikimedia.org/dashboard/db/mail?refresh=5m&orgId=1&from=now-2d&to=now&var-datasource=codfw%20prometheus%2Fops
Mentioned in SAL (#wikimedia-operations) [2018-06-07T02:00:29Z] <paravoid> starting exim4 and reenabling puppet on mx1001, due to T196598
The cause was the prep for T175361, combined with a couple of unexpected misconfigurations/SPOFs; it has been years since the switchover from mx1001->mx2001 was last tested.
I restarted exim (see the SAL entry above), which fixed this. Phabricator queued its emails and is sending them now, but unfortunately I don't think that will be the case for the others :(
I'll resolve this and we can follow up on the other task for the rest of the investigation.