
set up DMARC aggregate report collection into a database for research and reporting
Closed, Resolved, Public

Description

We collect DMARC aggregate (rua) reports for our common mail domains, but the reports are XML and not human-readable. Organizations typically parse the reports, store them in a database, and query that database.

I took a look at open source scripts for doing this and did not find anything great, so I wrote something in Python (see https://gerrit.wikimedia.org/r/#/c/163881/ ), which can process an mbox or accept input on a pipe.
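As a rough illustration of the "mbox or pipe" input step, the sketch below pulls report XML out of one raw mail message. It is not the script from the Gerrit change; the function name `extract_report_xml` is hypothetical, and it assumes reports arrive as gzip or zip attachments encoded as UTF-8, which is common but not guaranteed.

```python
import email
import gzip
import io
import zipfile


def extract_report_xml(raw_message):
    """Return the XML text of every gzip/zip attachment in one raw message.

    Hypothetical helper: attachments are recognised by their magic bytes
    rather than by the declared MIME type, which senders set inconsistently.
    """
    msg = email.message_from_bytes(raw_message)
    reports = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no payload of their own
        payload = part.get_payload(decode=True)  # undoes base64/quoted-printable
        if not payload:
            continue
        if payload[:2] == b"\x1f\x8b":  # gzip magic number
            reports.append(gzip.decompress(payload).decode("utf-8"))
        elif payload[:4] == b"PK\x03\x04":  # zip local file header
            with zipfile.ZipFile(io.BytesIO(payload)) as archive:
                for name in archive.namelist():
                    reports.append(archive.read(name).decode("utf-8"))
    return reports
```

For pipe input this could be called as `extract_report_xml(sys.stdin.buffer.read())`; for an mbox, one option is to iterate `mailbox.mbox(path)` and pass `message.as_bytes()` for each message. The sketch does not limit decompressed size, so a real deployment would want a cap to guard against oversized archives.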

Whether or not we use this script, the architecture would be similar: the parser should run on a server that is not otherwise sensitive, since there is some risk in processing XML (at least with Python; see https://docs.python.org/2/library/xml.html). The database itself would be fairly small, within a few GB.
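A minimal sketch of the parse-and-store step, for illustration only. It is not the script linked above: the table name `dmarc_rows`, the function names, and the use of SQLite are assumptions, and the element paths follow the aggregate report format in RFC 7489 without namespace handling. Rejecting any input that declares a DTD or entity is a crude mitigation for the XML risks mentioned above, not a complete defence.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical one-table schema: one row per <record> in a report.
SCHEMA = """
CREATE TABLE IF NOT EXISTS dmarc_rows (
    org_name TEXT, report_id TEXT, domain TEXT,
    source_ip TEXT, msg_count INTEGER,
    disposition TEXT, dkim TEXT, spf TEXT
)
"""


def parse_report(xml_text):
    """Flatten one aggregate report into a list of tuples."""
    # Entity-expansion attacks need a DTD, so refuse input that declares one.
    if "<!DOCTYPE" in xml_text or "<!ENTITY" in xml_text:
        raise ValueError("DTD or entity declarations are not accepted")
    root = ET.fromstring(xml_text)
    org = root.findtext("report_metadata/org_name")
    report_id = root.findtext("report_metadata/report_id")
    domain = root.findtext("policy_published/domain")
    rows = []
    for record in root.findall("record"):
        rows.append((
            org, report_id, domain,
            record.findtext("row/source_ip"),
            int(record.findtext("row/count", default="0")),
            record.findtext("row/policy_evaluated/disposition"),
            record.findtext("row/policy_evaluated/dkim"),
            record.findtext("row/policy_evaluated/spf"),
        ))
    return rows


def store_rows(conn, rows):
    """Create the table if needed and insert the parsed rows."""
    conn.execute(SCHEMA)
    conn.executemany("INSERT INTO dmarc_rows VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)
    conn.commit()


# Made-up sample report using reserved example domains and addresses.
SAMPLE = """<feedback>
  <report_metadata><org_name>example.org</org_name><report_id>r-1</report_id></report_metadata>
  <policy_published><domain>example.net</domain></policy_published>
  <record><row><source_ip>192.0.2.1</source_ip><count>2</count>
    <policy_evaluated><disposition>none</disposition><dkim>pass</dkim><spf>fail</spf></policy_evaluated>
  </row></record>
</feedback>"""

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    store_rows(conn, parse_report(SAMPLE))
    print(conn.execute(
        "SELECT source_ip, SUM(msg_count) FROM dmarc_rows GROUP BY source_ip"
    ).fetchall())
```

Keeping the parser and the database on a dedicated host, as described above, means a malformed or hostile report affects only that host.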

Event Timeline

Jgreen created this task. Jan 8 2015, 5:50 PM
Jgreen raised the priority of this task from to Medium.
Jgreen updated the task description. (Show Details)
Jgreen added a project: Mail.
Jgreen added a subscriber: Jgreen.
Dzahn set Security to None.
Dzahn added a subscriber: Dzahn.

The patch has been abandoned, but it says there will be a new PEP 8 compliant version. Added the Operations tag.

Jgreen changed the task status from Open to Stalled. Apr 27 2015, 9:20 PM
Jgreen lowered the priority of this task from Medium to Low.
herron added a subscriber: herron. Jun 14 2017, 3:06 PM

It looks like this task has been stalled for some time.

Today DMARC aggregate reports are being sent to an active dmarcian account, which provides some search and reporting features. dmarcian seems to be a reasonable solution, although in some ways it would be nice to host this data ourselves.

As an aside, there are a couple of related projects that might be interesting, rddmarc and ng-rddmarc, which perform a similar DMARC-to-MySQL import. Some frontends have been written for the schema as well.

Is this ok to resolve at this point?

herron moved this task from Backlog to In Progress on the Mail board. Jul 7 2017, 3:20 PM

Scratch that... We need to move from dmarcian to hosting reports ourselves after all. A pair of virtual machines has been provisioned (T169566) to isolate final delivery to the rua address and parsing of reports from the mail routing infrastructure.

What's the status of this?

herron closed this task as Resolved. Dec 4 2017, 2:23 PM
herron claimed this task.

Stalled. Going to close this as the dmarcian account is working. If that changes down the road, we can re-evaluate. I'll tear down the systems from T169566 as well.


What's the status of the VM removal? I noticed they're still around, so should we open a separate task to track this properly?

herron added a comment. Feb 1 2018, 8:50 PM

@MoritzMuehlenhoff The VMs have been removed.