
SRE query: Is it possible to measure how many e-mails are sent to "black hole" e-mail addresses?
Open, Needs Triage, Public

Description

For the parent task, we'd like to know just how many e-mails are getting sent to the black hole e-mail addresses we have as the Reply-To on various system e-mails users get from MediaWiki. Is there a sane way of doing this with our current e-mail set-up, or is the system so efficient it doesn't log e-mails being sent to /dev/null? :-)

Event Timeline

faidon assigned this task to herron. Aug 21 2018, 10:46 PM
faidon added a project: Mail.

Yes, messages using the blackhole transport are indeed logged. What timeframe are you looking to measure? Today we can pull counts from the past 60 days' worth of logs, and I think going forward it would make sense to add per-transport counts to the Grafana mail dashboard.
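For illustration, pulling per-day counts for one transport out of log lines could look roughly like the sketch below. The log line layout, the regular expression, and the transport label `blackhole` are all assumptions made up for this example; the real MX log format would need its own pattern.

```python
import re
from collections import Counter

# Hypothetical log line layout (an assumption, not the real MX log format):
#   2018-08-21 22:46:01 <message-id> => <recipient> T=<transport>
LINE_RE = re.compile(
    r"^(?P<day>\d{4}-\d{2}-\d{2}) \S+ \S+ => (?P<rcpt>\S+) T=(?P<transport>\S+)"
)


def count_by_day(lines, transport="blackhole"):
    """Count delivery lines per day for a single transport name."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group("transport") == transport:
            counts[match.group("day")] += 1
    return counts


# Made-up sample lines in the hypothetical layout above.
sample = [
    "2018-08-20 09:15:02 msg1 => wiki@wikimedia.org T=blackhole",
    "2018-08-20 11:40:37 msg2 => user@example.org T=remote_smtp",
    "2018-08-21 08:01:55 msg3 => wiki@wikimedia.org T=blackhole",
    "2018-08-21 08:02:10 msg4 => wiki@wikimedia.org T=blackhole",
]
daily = count_by_day(sample)
```

Grouping by the date prefix keeps the output at the per-day granularity asked for, and the same function could feed a per-transport dashboard counter by calling it once per transport name.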

@Jdforrester-WMF -- to put some specifics around this, here's what I would like to find out. I leave it to you to translate this into the technical language that can help actually execute the work. I realize that some of this may not be data that we have at our fingertips. If not, it is fine to tailor this more narrowly to what we have readily available.

  • How many times per day do people reply to emails they get from the wikis (the numerator)? Out of how many sent (the denominator)? To just group by day would be great.
  • How does this break down by the type of email they receive? Like there is a difference between when you get pinged, and when a page you're watching changes?
  • How does this break down by wiki?
  • How does this break down by how long the user has been registered?

Looking at this just over the last 60 days is totally fine.

Adding Morten, our team's data analyst, as this involves data.

@Jdforrester-WMF -- to put some specifics around this, here's what I would like to find out. I leave it to you to translate this into the technical language that can help actually execute the work. I realize that some of this may not be data that we have at our fingertips. If not, it is fine to tailor this more narrowly to what we have readily available.

Translations ahoy:

  • How many times per day do people reply to emails they get from the wikis (the numerator)? Out of how many sent (the denominator)? To just group by day would be great.

From the way @herron responded, it sounds like we have aggregate counts on a per-time-unit basis, rather than detailed specifics. If so, that means the numerator by day would be pretty easy to get (though see below).

I don't know if the denominator (number of e-mails sent from the job runners) is measured – @herron, don't suppose you know? Also, are e-mails only despatched from there or are there other sources (beyond FR-tech) of such e-mails?

  • How does this break down by the type of email they receive? Like there is a difference between when you get pinged, and when a page you're watching changes?
  • How does this break down by wiki?

I'm pretty sure this will be hard to measure.

Top-level types I think are:

  • Notifications ("Echo") e-mails (mostly opt-in on a per-sub-type level)
  • ENOTIF ("Watchlist") e-mails (opt-in feature at the type level only)

There are sub-types to each of these (Talk page message, Thanks, Mentions, Failed log-ins, etc.; Page edited, created, deleted, moved)

Right now all these e-mails have both their fields for from and reply-to set to wiki@wikimedia.org, regardless of type of e-mail or wiki.

If we wanted to measure this, we could vary the addresses by type and/or wiki, but this would potentially be pretty disruptive for our users (I for one filter e-mails from wiki@wikimedia.org so they don't go to my inbox). If we didn't do that, I'm not sure of alternative paths to getting the data; we could vary just the reply-to (which presumably people don't filter on), but (a) we'd need to set that up in the mail system to create N (number of types to measure) x M (number of wikis to measure for) black hole addresses, and (b) that only gets us the numerator, unless there's something I'm missing?
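To make the N x M idea concrete, here is a minimal sketch of generating one black hole address per (type, wiki) pair and mapping an address back to its pair. The `noreply+type.wiki@domain` scheme, the type names, and the `example.org` domain are hypothetical; nothing like this exists in the current set-up.

```python
from itertools import product


def blackhole_addresses(types, wikis, domain="example.org"):
    """Map each generated address to the (type, wiki) pair it encodes.

    The address scheme is hypothetical and only illustrates the N x M growth.
    """
    return {
        f"noreply+{email_type}.{wiki}@{domain}": (email_type, wiki)
        for email_type, wiki in product(types, wikis)
    }


# Hypothetical type and wiki labels, for illustration only.
types = ["echo-mention", "enotif-page-edited"]
wikis = ["enwiki", "dewiki"]
addresses = blackhole_addresses(types, wikis)
```

With 2 types and 2 wikis this yields 4 addresses; the count grows as N times M, which is the set-up cost being weighed here.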

  • How does this break down by how long the user has been registered?

I don't think we can reasonably measure that at all, sorry. To do that we'd have to tie each e-mail sent and received to each specific user and then measure against them. That's an awful lot of infrastructure to build.

Looking at this just over the last 60 days is totally fine.

I don't know if the denominator (number of e-mails sent from the job runners) is measured – @herron, don't suppose you know? Also, are e-mails only despatched from there or are there other sources (beyond FR-tech) of such e-mails?

I am not sure about that offhand. But if we can identify a unique set of envelope attributes for these messages (sender/recipient/sending host), we could pull a count from the MX logs based on that.
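As a rough sketch of that approach, assuming the log lines have already been parsed into envelope records, the denominator and numerator could be counted by matching on envelope fields and then divided per day. The `Envelope` record, the `daily_counts` and `reply_rate` helpers, and the host name `jobrunner.example` are hypothetical names for this example only.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    """One parsed log entry; the field names are assumptions for this sketch."""
    day: str
    sender: str
    recipient: str
    host: str


def daily_counts(records, **attrs):
    """Count records per day whose envelope fields equal every given attribute."""
    counts = Counter()
    for record in records:
        if all(getattr(record, key) == value for key, value in attrs.items()):
            counts[record.day] += 1
    return counts


def reply_rate(replies, sent):
    """Replies divided by messages sent, per day, skipping days with nothing sent."""
    return {day: replies.get(day, 0) / total for day, total in sent.items() if total}


# Made-up records: four outgoing messages, one reply, one unrelated message.
records = [
    Envelope("2018-08-20", "wiki@wikimedia.org", f"user{i}@example.org", "jobrunner.example")
    for i in range(4)
]
records.append(Envelope("2018-08-20", "user1@example.org", "wiki@wikimedia.org", "mail.example.org"))
records.append(Envelope("2018-08-20", "other@example.org", "user2@example.org", "mail.example.org"))

sent = daily_counts(records, sender="wiki@wikimedia.org", host="jobrunner.example")
replies = daily_counts(records, recipient="wiki@wikimedia.org")
rates = reply_rate(replies, sent)
```

This only works at the level of detail the envelope carries, so it gives overall daily figures rather than a per-type or per-wiki breakdown.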

Right now all these e-mails have both their fields for from and reply-to set to wiki@wikimedia.org, regardless of type of e-mail or wiki.

This is going to be the big constraint in pulling counts from mail logs. We can generate high-level counts based on envelope data, but to get more granular we would have to inspect email message contents on the MX, which is not feasible.

If we wanted to measure this, we could vary the addresses by type and/or wiki, but this would potentially be pretty disruptive for our users (I for one filter e-mails from wiki@wikimedia.org so they don't go to my inbox). If we didn't do that, I'm not sure of alternative paths to getting the data; we could vary just the reply-to (which presumably people don't filter on), but (a) we'd need to set that up in the mail system to create N (number of types to measure) x M (number of wikis to measure for) black hole addresses, and (b) that only gets us the numerator, unless there's something I'm missing?

In addition, the varied field would need to be within the message envelope in order to be reflected in the MX logs. Sadly reply-to would not be sufficient.

I'm inclined to suggest that these types of detailed metrics would be more straightforward to implement within the system generating the messages. Counters could be incremented as messages are sent.
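A minimal sketch of what such send-time counters might look like, written in Python purely for illustration. The `SendCounters` class, its method names, and the wiki and type labels are hypothetical and not part of any existing code.

```python
from collections import Counter


class SendCounters:
    """In-memory counters keyed by (wiki, email type); a hypothetical sketch."""

    def __init__(self):
        self._counts = Counter()

    def record_sent(self, wiki, email_type):
        """Increment the counter at the moment a message is handed off for sending."""
        self._counts[(wiki, email_type)] += 1

    def by_wiki(self):
        """Total messages sent per wiki, summed over all types."""
        totals = Counter()
        for (wiki, _email_type), n in self._counts.items():
            totals[wiki] += n
        return totals

    def by_type(self):
        """Total messages sent per email type, summed over all wikis."""
        totals = Counter()
        for (_wiki, email_type), n in self._counts.items():
            totals[email_type] += n
        return totals


# Hypothetical labels, for illustration only.
counters = SendCounters()
counters.record_sent("enwiki", "echo-mention")
counters.record_sent("enwiki", "enotif-page-edited")
counters.record_sent("dewiki", "echo-mention")
per_wiki = counters.by_wiki()
per_type = counters.by_type()
```

Because the sending system already knows the wiki and the message type, the breakdowns that the MX logs cannot provide come out of the counter keys directly.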

MMiller_WMF added a subscriber: herron.

Assigning this to @nettrom_WMF so he can drive it from here. This is a medium-priority need for the Growth team. We definitely want to know, but are not currently blocked by not knowing the answer. However, if the answer here is a high number, it will materially change our roadmap.

herron moved this task from Backlog to Radar on the User-herron board. Sep 20 2018, 3:57 PM
kostajh added a subscriber: kostajh.

Growth-Team discussed this in triage this week; we would like to get to this in Q3.

herron moved this task from Backlog to Radar on the Mail board. Nov 9 2018, 9:50 PM
JTannerWMF moved this task from Q2 2019-20 to Q1 2019-20 on the Growth-Team board. Jun 17 2019, 6:52 PM
JTannerWMF moved this task from Q1 2019-20 to FY 2019-20 on the Growth-Team board. Jul 5 2019, 8:41 PM
JTannerWMF added a subscriber: JTannerWMF.

We will return to this with a future project, Engagement Emails.