Page MenuHomePhabricator

New Donors Donating On Email
Open, HighPublicBUG REPORT

Description

So I have encountered something that I think is odd, but am hoping to discuss on this task here.

The below query gets donors first donations for new email donors since 2023-07-01, returning 166,789 rows:

WITH all_funds_first_donation AS (
    SELECT
        entity_id AS contact_id
        ,LEAST(COALESCE(d.first_donation_date, d.endowment_first_donation_date),
            COALESCE(d.endowment_first_donation_date,d.first_donation_date)) AS all_funds_first_donation_date
    FROM civicrm.wmf_donor AS d )
    
SELECT
	t.utm_medium,
    t.utm_campaign,
    LEFT(t.utm_source,10) AS mailing_identifier,
    c.contact_id,
    c.id,
    c.receive_date,
    aff.all_funds_first_donation_date,
    c.total_amount
FROM civicrm.civicrm_contribution AS c
LEFT JOIN civicrm.civicrm_contribution_tracking AS t ON c.id = t.contribution_id
LEFT JOIN all_funds_first_donation as aff on c.contact_id = aff.contact_id
WHERE
    c.receive_date >= '2023-07-01'
    AND aff.all_funds_first_donation_date >= '2023-07-01'
    AND c.receive_date = aff.all_funds_first_donation_date
	AND c.contribution_status_id = 1 /* only pull completed transations */
	AND c.financial_type_id <> 14 /* remove refunds */
	AND c.contact_id <> 42572457 /* remove WMF fund-to-fund internal transfers */
    AND utm_medium = 'email';

Now what is confusing me is that should there be new donors from email?

For example, Contact ID: 63232551 donated on sp73698998, however there is no record of them receiving that email (or any email).

I think of the above query, a large part of it can be explained by RML? A new donor coming from an RML email I think makes sense if I am understanding that program correctly (we get their email on a banner, but not donation, then we email them a donation ask later).

This query shows donors first donations for new email donors with RML in utm_source the since 2023-07-01, returning 39,997 rows:

WITH all_funds_first_donation AS (
    SELECT
        entity_id AS contact_id
        ,LEAST(COALESCE(d.first_donation_date, d.endowment_first_donation_date),
            COALESCE(d.endowment_first_donation_date,d.first_donation_date)) AS all_funds_first_donation_date
    FROM civicrm.wmf_donor AS d )
    
SELECT
	t.utm_medium,
    t.utm_campaign,
    LEFT(t.utm_source,10) AS mailing_identifier,
    c.contact_id,
    c.id,
    c.receive_date,
    aff.all_funds_first_donation_date,
    c.total_amount
FROM civicrm.civicrm_contribution AS c
LEFT JOIN civicrm.civicrm_contribution_tracking AS t ON c.id = t.contribution_id
LEFT JOIN all_funds_first_donation as aff on c.contact_id = aff.contact_id
WHERE
    c.receive_date >= '2023-07-01'
    AND aff.all_funds_first_donation_date >= '2023-07-01'
    AND c.receive_date = aff.all_funds_first_donation_date
	AND c.contribution_status_id = 1 /* only pull completed transations */
	AND c.financial_type_id <> 14 /* remove refunds */
	AND c.contact_id <> 42572457 /* remove WMF fund-to-fund internal transfers */
    AND utm_medium = 'email'
    AND LEFT(t.utm_source,10) like '%RML%';

Assuming I am understanding RML correctly, then perhaps those being those donors first donation makes sense.

But that would still leave the remainder in this query showing donors first donations for new email donors without RML in utm_source the since 2023-07-01, returning 126,792 rows:

WITH all_funds_first_donation AS (
    SELECT
        entity_id AS contact_id
        ,LEAST(COALESCE(d.first_donation_date, d.endowment_first_donation_date),
            COALESCE(d.endowment_first_donation_date,d.first_donation_date)) AS all_funds_first_donation_date
    FROM civicrm.wmf_donor AS d )
    
SELECT
	t.utm_medium,
    t.utm_campaign,
    LEFT(t.utm_source,10) AS mailing_identifier,
    c.contact_id,
    c.id,
    c.receive_date,
    aff.all_funds_first_donation_date,
    c.total_amount
FROM civicrm.civicrm_contribution AS c
LEFT JOIN civicrm.civicrm_contribution_tracking AS t ON c.id = t.contribution_id
LEFT JOIN all_funds_first_donation as aff on c.contact_id = aff.contact_id
WHERE
    c.receive_date >= '2023-07-01'
    AND aff.all_funds_first_donation_date >= '2023-07-01'
    AND c.receive_date = aff.all_funds_first_donation_date
	AND c.contribution_status_id = 1 /* only pull completed transations */
	AND c.financial_type_id <> 14 /* remove refunds */
	AND c.contact_id <> 42572457 /* remove WMF fund-to-fund internal transfers */
    AND utm_medium = 'email'
    AND LEFT(t.utm_source,10) not like '%RML%';

Another example, Contact ID 59803786 donated on sp73084724, however there is no record of them receiving that email (or any email).

Could these 126,792 rows have been missed by dedupe?

Is there something I am looking at the wrong way/querying wrong here?