Page MenuHomePhabricator

[Spike] investigate contribution tracking data (was Engage import failing to import certain significant fields)
Open, MediumPublic2 Estimated Story Points

Description

Sam and I were pulling a bunch of data together for our big metrics presentation next week, and we're digging into why there are over $1m worth of donations with NULL values in both civicrm.civicrm_contribution.trxn_id and drupal.contribution_tracking.utm_medium (see count using the following query):

SELECT SUBSTRING_INDEX(cc.trxn_id, ' ', 1) trxn_id_gateway, count(cc.id), sum(cc.total_amount)
FROM civicrm.civicrm_contribution cc
LEFT JOIN drupal.contribution_tracking ct ON cc.id=ct.contribution_id
WHERE cc.receive_date >= '2015-07-01'
AND cc.receive_date < '2016-07-01'
AND cc.total_amount < 1000
AND ct.utm_medium IS NULL
GROUP BY trxn_id_gateway;

I looked into the 23578 donations in this NULL trxn_id category, and 22558 (95%) have a financial_type_id = 21, which is Engage.

SELECT count(cc.id)
FROM civicrm.civicrm_contribution cc
LEFT JOIN drupal.contribution_tracking ct
ON cc.id=ct.contribution_id
WHERE cc.receive_date >= '2015-07-01'
AND cc.receive_date < '2016-07-01'
AND cc.total_amount < 1000
AND ct.utm_medium IS NULL
AND cc.trxn_id IS NULL
AND cc.financial_type_id ='21';

Based on query #1 above, there are more Engage donations without utm data or contribution IDs than there are Engage donations with a complete contribution record. I can only assume the import is not matching a bunch of this data. Is that right? Can we fix this so it's easier to reconcile our data in future years?

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptSep 21 2016, 5:10 PM

Please associate at least one project with this task to allow others to find this task when searching in the corresponding project(s). Thanks! :)

ggellerman triaged this task as Medium priority.Sep 27 2016, 9:16 PM
ggellerman set the point value for this task to 2.

@LeanneS can you provide any additional information here? The examples I looked at seemed to be EFT deposits like https://civicrm.wikimedia.org/civicrm/contact/view?reset=1&cid=2231213 (where it looks like there ?might? be duplication of the import?

It was imported on August 23rd, 2016 By Ellen. I'm wondering if there are any clues in the file - but probably that is long gone? Would they have used 'Engage' or 'Foreign Checks'

There was a commit on Aug 2 that caused Campaigns to be created in CiviCRM if they did not exist - I'm checking to see WHEN these occurred now

commit d75c927c60e41335a939d9bed3d5e0ad150ce192
Author: eileen <emcnaughton@wikimedia.org>
Date: Tue Aug 2 16:32:23 2016 +1200

Create Utm_campaign option if it does not exist.

This is an optional extra on top of the last patch that causes a missing
option_value to be created for the campaign if it does not exist. I added it as
Caitlin indicated she thought it was a good idea (on the ticket).

Bug: T139969
Change-Id: Id8f2c79237b4a5420c13b619f15d02eee56338cf

Hmm - There are still being created - e.g

https://civicrm.wikimedia.org/civicrm/contact/view?reset=1&cid=505501

But what would we want to look different about it?

SElect YEAR(receive_date), MONTH(receive_date), count(*), contact_id FROM civicrm.civicrm_contribution cc LEFT JOIN drupal.contribution_tracking ct ON cc.id=ct.contribution_id WHERE cc.receive_date >= '2014-07-01' AND cc.total_amount < 1000 AND ct.utm_medium IS NULL AND cc.trxn_id IS NULL AND cc.financial_type_id ='21' GROUP BY YEAR(receive_date), MONTH(receive_date) ;
+--------------------+---------------------+----------+------------+

YEAR(receive_date)MONTH(receive_date)count(*)contact_id

+--------------------+---------------------+----------+------------+

2015385512513
20154707508221
2015547310928236
2015682441482
2015768711010609
20158261505488
2015955510994232
20151011806457118
20151116644242513
201512223412790227
20161626010622642
20162318010618217
20163186212513
2016420073676714
201658814773173
2016617876579118
2016715894244217
20168133015012463
20169160014897317
20161015587262053
20161175505501

+--------------------+---------------------+----------+------------+

awight added a subscriber: awight.Nov 21 2016, 10:36 PM

Those are matching gifts accounts...

If we wanted, we could synthesize a trxn_id using a hash of soft credit donor name and amount, something like we do already to prevent duplicate imports of a check number.

@CCogdill_WMF
Would it be helpful to assign utm_* fields to the matching gifts contribution tracking records?

Hey @awight... sure, adding utm fields to matching gifts would be good, but the problem is bigger than that. As far as I can tell, any manual import into Civi is missing utm fields. The donations from the ActBlue emails we sent, for example, don't have a utm_source attached because I used the regular import template, so email doesn't get credit for those donations :(

I think we need to change the Civi import template so it accounts for these fields, and then teach Major Gifts best practices for assigning utm tracking.

@CCogdill_WMF
Yes! Thanks for the general fix, we should take that suggestion and add the fields to the import templates.

Matching gifts might even be a step beyond that--they're currently missing a lot of details and AIUI are automatically created by Civi.

@LeanneS how do you guys enter matching gifts? can talk on chat if it is hard to explain

Hi @CCogdill_WMF I did some digging into this - I'm not clear if there is a problem that needs solving here, and if so how to define it.

The contribution_tracking record only exists for entries that come in through online means. Almost all of the contributions without trxn_id were manually entered - per @Ejegg "77k entries missing trxn_id are also missing ct rows. only 7 contributions without trxn_id have ct rows". This is inline with spot checks that they appear to be manually entered by engage users.

The ones I spot checked seemed to be EFTs and it seemed valid to not have a trxn_id for EFTS - ie
civicrm/contact/view/contribution?action=view&reset=1&id=19742868&cid=498677&context=home
civicrm/contact/view/contribution?action=view&reset=1&id=17472457&cid=72&context=home

So, I guess the question is how would you like the data for these Engage-entered contributions to be different. We could add a hook to set some calculated values for them if we can define those. Alternatively we can make the trxn_id field compulsory for them - but I don't know what they would enter into it.

Hm, so I see your point about utms only being for online donations, @Eileen. We use utm tracking codes as our primary way to get donation breakdowns when we do our end of year summaries, so from an ease of use perspective, they are helpful in that way.

In the case of processors like Coinbase or ActBlue, we sometimes manually import data for online donations. These are what Pats likes to call *light* tech integrations, so we just import the data manually rather than making tech set up a new integration. In a situation like that, utm tracking seems more important.

OK - so what do you need me to do? Or are you pondering at the moment

From my perspective, I would like all donation sources to have utm sourcing
attached so it's easier for us to run reports. At the least, I'd like it to
be an optional field on the import template so online donation sources that
get manually imported can be attached to online donations... But I don't
know if that goes against our data policies. Just what would make my life
easier.

Eileenmcnaughton renamed this task from Engage import failing to import certain significant fields to [Spike] investigate contribution tracking data (was Engage import failing to import certain significant fields).Dec 20 2016, 3:50 AM

Thanks for the notes. If you need my use case, here's the query we usually run to get the medium breakdown each campaign:

SELECT utm_medium, count(cc.id), sum(cc.total_amount)

FROM civicrm.civicrm_contribution cc 

LEFT JOIN drupal.contribution_tracking ct
ON cc.id=ct.contribution_id 

WHERE cc.receive_date >= '2014-07-01' 

AND cc.receive_date < '2015-07-01'

AND cc.total_amount < 1000

GROUP BY ct.utm_medium;

I don't believe Civi distinguishes between banner, email, sidebar, etc, which is why we prefer to use contribution tracking. We have some imported methods into Civi that probably should have a utm_medium, like Coinbase or ActBlue, so it would be preferable to me to have utms in Civi, or these fields added for imported donations.

Maybe I could just write a fancier query...

This is the base query:

SELECT utm_medium, count(cc.id), sum(cc.total_amount)
FROM civicrm.civicrm_contribution cc
LEFT JOIN drupal.contribution_tracking ct ON cc.id=ct.contribution_id
WHERE cc.receive_date >= '2015-07-01'

AND cc.receive_date < '2016-07-01'
AND cc.total_amount < 1000

GROUP BY ct.utm_medium;

This query

SELECT COALESCE(utm_medium, ft.name) as `medium`, count(cc.id), sum(cc.total_amount)   FROM civicrm.civicrm_contribution cc   LEFT JOIN drupal.contribution_tracking ct ON cc.id=ct.contribution_id   LEFT JOIN civicrm.civicrm_financial_type ft ON cc.financial_type_id = ft.id WHERE cc.receive_date > '2017-01-01' AND cc.total_amount < 1000  GROUP BY `medium`;

Resolves the NULLs to reflect the financial type names. I'm not sure if it is a problem that there are so many 'Cash' -

Here is a slight adjustment which splits the 'cash' into 'Recurring' & 'Cash' (implied non-recurring-cash)

SELECT COALESCE(utm_medium, IF(cc.contribution_recur_id IS NULL, ft.name, 'Recurring')) as `medium`, count(cc.id), sum(cc.total_amount)  , 

cc.id as sample_id
FROM civicrm.civicrm_contribution cc  
LEFT JOIN drupal.contribution_tracking ct ON cc.id=ct.contribution_id  
LEFT JOIN civicrm.civicrm_financial_type ft ON cc.financial_type_id = ft.id
WHERE cc.receive_date > '2017-02-01' AND cc.total_amount < 1000 
GROUP BY `medium`;

@CCogdill_WMF let me know if those help. There seem to be a small number (e.g contribution_id = 22309234) where they have a contribution_tracking record & appear to be online but have no medium field

@Eileenmcnaughton
What kind of review can we provide? Are you just waiting for @CCogdill_WMF to try the latest queries?

@CCogdill_WMF have you had a chance to try the sql above?

Just a correction here. We were waiting for @LeanneS to do the import again. I think that has happened multiple times in the last month.

If so, @LeanneS and @CCogdill_WMF might need to sync up and check that the recent imports are doing what they need for @CCogdill_WMF

mmodell removed a subscriber: awight.Jun 22 2017, 9:33 PM
Aklapper removed Eileenmcnaughton as the assignee of this task.Jun 19 2020, 4:22 PM

This task has been assigned to the same task owner for more than two years. Resetting task assignee due to inactivity, to decrease task cookie-licking and to get a slightly more realistic overview of plans. Please feel free to assign this task to yourself again if you still realistically work or plan to work on this task - it would be welcome!

For tips how to manage individual work in Phabricator (noisy notifications, lists of task, etc.), see https://phabricator.wikimedia.org/T228575#6237124 for available options.
(For the records, two emails were sent to assignee addresses before resetting assignees. See T228575 for more info and for potential feedback. Thanks!)