Target Smart data is being lost in the current merge process. There are ~7000 records containing this type of data (I searched on Net Worth) that have been deleted during the merge process. Not everyone had a Net Worth in the Target Smart update so this could be a higher number. I also don't know overall which ones were merged which way (donor services, MG&E, or through the script). Is there a way to restore this data to donor records? Or reverse the merging done on these records?
Description
Status | Subtype | Assigned | Task | ||
---|---|---|---|---|---|
Resolved | Eileenmcnaughton | T232636 Restore target smart data lost on legacy merge screen | |||
Resolved | Eileenmcnaughton | T235450 Prevent target smart data from being lost on legacy merge screen |
Event Timeline
So having checked a couple -
19540112 was merged prior to the import
28299802 has data input at 2019-08-29 07:19:33 = merged 2019-08-30 20:35:16 - by a user - so that was after
I then did some mysql queries and none of the affected rows were merged by the dedupe script - they were all merged by people. This would also mean they were merged on the old screen I believe, as the Deduper uses the script code.
I think what we can say is that the old merge interface is not keeping this data reliably - and this is not a recent regression - but also there are a bunch where they were merged between when we sent the data to targetsmart & when we got it back & it was added on the old ID.
select count(*) FROM civicrm_value_1_prospect_5 INNER JOIN civicrm_contact c ON entity_id = c.id WHERE c.is_deleted=1 AND (estimated_net_worth_144 <> '' OR family_composition_173 <> '' OR income_range <> '' );
8733
Details
Select query
SELECT * FROM civicrm_value_1_prospect_5 INNER JOIN civicrm_contact c ON entity_id = c.id LEFT JOIN civicrm_activity_contact ac ON ac.contact_id = c.id LEFT JOIN civicrm_activity a ON a.id = ac.activity_id WHERE c.is_deleted=1 AND (estimated_net_worth_144 <> '' OR family_composition_173 <> '' OR income_range <> '' ) AND activity_type_id = 87
Less detail, incl merger
SELECT c.id as deleted_contact_id, merger.contact_id as merger_contact_id, YEAR(activity_date_time), MONTH(activity_date_time)
FROM civicrm_value_1_prospect_5 INNER JOIN civicrm_contact c ON entity_id = c.id
LEFT JOIN civicrm_activity_contact ac ON ac.contact_id = c.id AND ac.record_type_id =3
LEFT JOIN civicrm_activity a ON a.id = ac.activity_id
LEFT JOIN civicrm_activity_contact merger ON merger.activity_id = a.id AND merger.record_type_id = 2
WHERE c.is_deleted=1
AND (estimated_net_worth_144 <> '' OR family_composition_173 <> '' OR income_range <> '' ) AND activity_type_id = 87;
SELECT YEAR(activity_date_time) as year, MONTH(activity_date_time) as month, merger.contact_id as merger_contact_id, count(*) FROM civicrm_value_1_prospect_5 INNER JOIN civicrm_contact c ON entity_id = c.id LEFT JOIN civicrm_activity_contact ac ON ac.contact_id = c.id AND ac.record_type_id =3 LEFT JOIN civicrm_activity a ON a.id = ac.activity_id LEFT JOIN civicrm_activity_contact merger ON merger.activity_id = a.id AND merger.record_type_id = 2 WHERE c.is_deleted=1 AND (estimated_net_worth_144 <> '' OR family_composition_173 <> '' OR income_range <> '' ) AND activity_type_id = 87 GROUP BY year, month, merger_contact_id;
Here is how it breaks down
+------+-------+-------------------+----------+
year | month | merger_contact_id | count(*) |
+------+-------+-------------------+----------+
2017 | 6 | 1860117 | 1 |
2017 | 6 | 3445022 | 1 |
2017 | 6 | 11027170 | 1 |
2017 | 6 | 17215141 | 2 |
2017 | 6 | 20645648 | 58 |
2017 | 7 | 6353743 | 1 |
2017 | 7 | 18720235 | 1 |
2017 | 7 | 20645648 | 156 |
2018 | 2 | 5039628 | 1 |
2018 | 2 | 15757664 | 1 |
2018 | 2 | 20645648 | 1 |
2018 | 3 | 5039628 | 1 |
2018 | 3 | 6353743 | 4 |
2018 | 3 | 20645648 | 1 |
2018 | 3 | 20690347 | 2 |
2018 | 3 | 21041858 | 18 |
2018 | 3 | 26034374 | 3 |
2018 | 4 | 4846156 | 1 |
2018 | 4 | 5039628 | 1 |
2018 | 4 | 6353743 | 5 |
2018 | 4 | 10617387 | 22 |
2018 | 4 | 15124575 | 1 |
2018 | 4 | 20645648 | 2 |
2018 | 4 | 21041858 | 15 |
2018 | 4 | 26034374 | 5 |
2018 | 5 | 4846156 | 1 |
2018 | 5 | 5039628 | 1 |
2018 | 5 | 10617387 | 2 |
2018 | 5 | 15124575 | 1 |
2018 | 5 | 15757664 | 1 |
2018 | 5 | 20690347 | 3 |
2018 | 5 | 21041858 | 2 |
2018 | 5 | 26034374 | 1 |
2018 | 6 | 5039628 | 2 |
2018 | 6 | 6353743 | 12 |
2018 | 6 | 10617387 | 29 |
2018 | 6 | 20645648 | 1 |
2018 | 6 | 20690347 | 4 |
2018 | 6 | 21041858 | 19 |
2018 | 6 | 26034374 | 6 |
2018 | 7 | 6353743 | 5 |
2018 | 7 | 10617387 | 16 |
2018 | 7 | 20690347 | 3 |
2018 | 7 | 21041858 | 13 |
2018 | 8 | 5039628 | 1 |
2018 | 8 | 10617387 | 2 |
2018 | 8 | 11027170 | 1 |
2018 | 8 | 15124575 | 1 |
2018 | 8 | 21041858 | 1 |
2018 | 9 | 3686198 | 1 |
2018 | 9 | 10617387 | 1 |
2018 | 9 | 11027170 | 1 |
2018 | 9 | 15124575 | 2 |
2018 | 9 | 20645648 | 1 |
2018 | 10 | 4846156 | 1 |
2018 | 10 | 5039628 | 3 |
2018 | 10 | 10617387 | 9 |
2018 | 10 | 15124575 | 1 |
2018 | 10 | 20645648 | 1 |
2018 | 10 | 20690347 | 1 |
2018 | 10 | 21041858 | 1 |
2018 | 10 | 21788182 | 1 |
2018 | 11 | 4846156 | 2 |
2018 | 11 | 5039628 | 2 |
2018 | 11 | 6353743 | 3 |
2018 | 11 | 10617387 | 3 |
2018 | 11 | 11027170 | 1 |
2018 | 11 | 15757664 | 6 |
2018 | 11 | 20645648 | 5 |
2018 | 11 | 21041858 | 1 |
2018 | 12 | 4846156 | 13 |
2018 | 12 | 6353743 | 2 |
2018 | 12 | 10617387 | 4 |
2018 | 12 | 11027170 | 1 |
2018 | 12 | 15757664 | 1 |
2018 | 12 | 20645648 | 1 |
2018 | 12 | 20690347 | 4 |
2018 | 12 | 26034374 | 1 |
2019 | 1 | 5039628 | 2 |
2019 | 1 | 10617387 | 4 |
2019 | 1 | 15757664 | 3 |
2019 | 1 | 21041858 | 2 |
2019 | 2 | 10617387 | 9 |
2019 | 3 | 5039628 | 1 |
2019 | 3 | 6353743 | 2 |
2019 | 3 | 10617387 | 2 |
2019 | 3 | 15757664 | 1 |
2019 | 3 | 20645648 | 1 |
2019 | 3 | 26034374 | 9 |
2019 | 4 | 3686198 | 7 |
2019 | 4 | 10617387 | 2 |
2019 | 4 | 11027170 | 1 |
2019 | 4 | 15757664 | 1 |
2019 | 4 | 21041858 | 2 |
2019 | 4 | 27804765 | 1 |
2019 | 4 | 29132487 | 1 |
2019 | 4 | 29146177 | 1 |
2019 | 4 | 30996756 | 1 |
2019 | 4 | 31779349 | 10 |
2019 | 5 | 3686198 | 189 |
2019 | 5 | 5039628 | 1 |
2019 | 5 | 10617387 | 10 |
2019 | 5 | 20645648 | 21 |
2019 | 5 | 21041858 | 1 |
2019 | 5 | 26034374 | 3 |
2019 | 5 | 31779349 | 31 |
2019 | 6 | 3686198 | 23 |
2019 | 6 | 5039628 | 3 |
2019 | 6 | 6353743 | 2 |
2019 | 6 | 10617387 | 39 |
2019 | 6 | 11027170 | 2 |
2019 | 6 | 15124575 | 1 |
2019 | 6 | 20645648 | 38 |
2019 | 6 | 21041858 | 4 |
2019 | 6 | 26034374 | 37 |
2019 | 6 | 31779349 | 327 |
2019 | 7 | 3686198 | 431 |
2019 | 7 | 5039628 | 132 |
2019 | 7 | 6353743 | 1 |
2019 | 7 | 10617387 | 70 |
2019 | 7 | 10827232 | 41 |
2019 | 7 | 11027170 | 1 |
2019 | 7 | 13166837 | 1 |
2019 | 7 | 15124575 | 2 |
2019 | 7 | 15757664 | 1 |
2019 | 7 | 20267747 | 1 |
2019 | 7 | 20645648 | 161 |
2019 | 7 | 21041858 | 317 |
2019 | 7 | 21788182 | 1 |
2019 | 7 | 22868850 | 1 |
2019 | 7 | 26034374 | 104 |
2019 | 7 | 28019671 | 2 |
2019 | 7 | 31779349 | 1504 |
2019 | 7 | 31923926 | 1 |
2019 | 7 | 31927941 | 1 |
2019 | 7 | 34488457 | 500 |
2019 | 7 | 34488458 | 124 |
2019 | 8 | 3686198 | 154 |
2019 | 8 | 5039628 | 579 |
2019 | 8 | 6353743 | 250 |
2019 | 8 | 10617387 | 298 |
2019 | 8 | 10827232 | 3 |
2019 | 8 | 13166837 | 1 |
2019 | 8 | 20645648 | 15 |
2019 | 8 | 21041858 | 247 |
2019 | 8 | 26034374 | 3 |
2019 | 8 | 28019671 | 6 |
2019 | 8 | 31779349 | 1659 |
2019 | 8 | 34488457 | 399 |
2019 | 8 | 34488458 | 50 |
2019 | 9 | 3686198 | 28 |
2019 | 9 | 6353743 | 6 |
2019 | 9 | 10617387 | 46 |
2019 | 9 | 20645648 | 1 |
2019 | 9 | 21041858 | 19 |
2019 | 9 | 31779349 | 208 |
2019 | 9 | 34488457 | 2 |
2019 | 9 | 34488458 | 2 |
@Eileenmcnaughton Do you have an idea of how many records with the Target Smart data were deleted?
@NNichols so 7k if you go right back but around $2k recently - but in many cases the data was also on the kept contact or has since been added toit
Any sense on how we restore this data? Michael has a phab task in asking that all prospect information be automatically merged which will save the data from this in the future. However, we are starting to put together lists of donors we (Major Gifts & Endowment) want to suppress from fundraising emails by end of next week. We're using the Net Worth data points as criteria for this list.
@NNichols I just ran an update which ensured that any prospecting data was moved to the merged-to contact where there was prospecting data on the deleted contact but not the merged to contact. I think this is probably the majority of them & most of those remaining have valid prospecting data on both but I'll do some more queries
Extended (but sadly slow) query for finding any others
SELECT v.entity_id as deleted_contact, v2.entity_id as kept_contact, v.estimated_net_worth_144, v.income_range, v.biography_66, v.charitable_contributions_decile, v.disc_income_decile, v.family_composition_173, v.voter_party, v.occupation_175 FROM civicrm_value_1_prospect_5 v INNER JOIN civicrm_contact deleted_contact ON v.entity_id = deleted_contact.id AND deleted_contact.is_deleted = 1 INNER JOIN civicrm_activity_contact ac ON ac.contact_id = deleted_contact.id AND ac.record_type_id = 3 INNER JOIN civicrm_activity a ON a.id = ac.activity_id AND activity_type_id = 87 INNER JOIN civicrm_activity_contact merger ON merger.activity_id = a.id AND merger.record_type_id = 2 AND activity_type_id = 87 LEFT JOIN civicrm_value_1_prospect_5 v2 ON v2.entity_id = merger.id WHERE (v.estimated_net_worth_144 IS NOT NULL AND v.estimated_net_worth_144 != v2.estimated_net_worth_144) OR (v.income_range IS NOT NULL AND v.income_range != v2.income_range) OR (v.biography_66 IS NOT NULL AND v.biography_66 != v2.biography_66) OR (v.charitable_contributions_decile IS NOT NULL AND v.charitable_contributions_decile != v2.charitable_contributions_decile) OR (v.disc_income_decile IS NOT NULL AND v.disc_income_decile != v2.disc_income_decile) OR (v.family_composition_173 IS NOT NULL AND v.family_composition_173 != v2.family_composition_173) OR (v.voter_party IS NOT NULL AND v.voter_party != v2.voter_party) OR (v.occupation_175 IS NOT NULL AND v.occupation_175 != v2.occupation_175) LIMIT 10;
@LeanneS @NNichols the scope of this task is to restore Targetsmart data where the checkboxes may have been missed during merge. My current thinking is to find all deleted contacts with Targetsmart data where the merged to contact does not have TS data & move the data across. I previously did a restore for the 'low-hanging fruit' but there are more for a second round.
However I have 2 questions
- if some fields have been moved across but not others (e.g income but not charitable decile) should I move across the not-moved field. The alternative is to assume that indicates a deliberate decision was made & to skip
- Which of the fields in the prospect tab should I recover. Only targetsmart or some others too. The fields are:
TargetSmart fields
Field | |
T | Income_Range |
T | Estimated_Net_Worth |
T | Charitable_Contributions_Decile |
T | Disc_Income_Decile |
T | Family_Composition |
T | Voter_Party |
T | Occupation |
Not TS fields
Reviewed |
Stage |
Capacity |
Steward |
Solicitor |
Affinity |
Philanthropic_Interests |
Philanthropic_History |
Biography |
Interests |
Origin |
On_Hold |
Subject_Area_Interest |
University_Affiliation |
next_step |
Notes |
Board_Affiliations |
Endowment_Stage |
PG_Stage |
Survey_Responses |
I checked again & there are
- 443 rows where there is targetsmart data in the deleted prospect record & no prospect record
- 498 rows with ts data in the deleted record & a prospect record on the kept contact
Query 1 - no record (note the v2.id condition is hacked onto the generic query hence some stuff could go but it does no harm
SELECT v.entity_id as deleted_contact, v2.entity_id as kept_contact_id, v.estimated_net_worth_144 as lost_net_worth, v2.estimated_net_worth_144, v.income_range as lost_income_range, v2.income_range, v.charitable_contributions_decile as lost_charitible_decile, v2.charitable_contributions_decile, v.disc_income_decile as lost_income_decile, v2.disc_income_decile, v.family_composition_173 as lost_family, v2.family_composition_173, v.voter_party as lost_party, v.voter_party, v.occupation_175 as lost_occupation, v.occupation_175 FROM civicrm_value_1_prospect_5 v INNER JOIN civicrm_contact c ON entity_id = c.id INNER JOIN civicrm_activity_contact ac ON ac.contact_id = c.id AND ac.record_type_id =3 INNER JOIN civicrm_activity a ON a.id = ac.activity_id AND activity_type_id = 87 INNER JOIN civicrm_activity_contact mergedto USE INDEX(index_record_type) ON mergedto.activity_id = a.id AND mergedto.record_type_id = 1 LEFT JOIN civicrm_value_1_prospect_5 v2 ON mergedto.contact_id = v2.entity_id WHERE c.is_deleted=1 AND v2.id IS NULL AND ( (v.estimated_net_worth_144 IS NOT NULL AND v.estimated_net_worth_144 != '') AND (v2.estimated_net_worth_144 = '' OR v2.estimated_net_worth_144 IS NULL) OR ((v.income_range IS NOT NULL AND v.income_range != '') AND (v2.income_range IS NULL OR v2.income_range = '')) OR ((v.charitable_contributions_decile IS NOT NULL AND v.charitable_contributions_decile != '') AND (v2.charitable_contributions_decile IS NULL OR v2.charitable_contributions_decile = '')) OR ((v.disc_income_decile IS NOT NULL AND v.disc_income_decile != '') AND (v2.disc_income_decile IS NULL OR v2.disc_income_decile = '')) OR ((v.family_composition_173 IS NOT NULL AND v.family_composition_173 != '') AND (v2.family_composition_173 IS NULL OR v2.family_composition_173 = '')) OR ((v.voter_party IS NOT NULL AND v.voter_party != '') AND (v2.voter_party IS NULL OR v2.voter_party = '')) OR ((v.occupation_175 IS NOT NULL AND v.occupation_175 != '') AND (v2.occupation_175 IS NULL OR v2.occupation_175 = '')) )
Query - with a different id
SELECT v.entity_id as deleted_contact, v2.entity_id as kept_contact_id, v.estimated_net_worth_144 as lost_net_worth, v2.estimated_net_worth_144, v.income_range as lost_income_range, v2.income_range, v.charitable_contributions_decile as lost_charitible_decile, v2.charitable_contributions_decile, v.disc_income_decile as lost_income_decile, v2.disc_income_decile, v.family_composition_173 as lost_family, v2.family_composition_173, v.voter_party as lost_party, v.voter_party, v.occupation_175 as lost_occupation, v.occupation_175 FROM civicrm_value_1_prospect_5 v INNER JOIN civicrm_contact c ON entity_id = c.id INNER JOIN civicrm_activity_contact ac ON ac.contact_id = c.id AND ac.record_type_id =3 INNER JOIN civicrm_activity a ON a.id = ac.activity_id AND activity_type_id = 87 INNER JOIN civicrm_activity_contact mergedto USE INDEX(index_record_type) ON mergedto.activity_id = a.id AND mergedto.record_type_id = 1 LEFT JOIN civicrm_value_1_prospect_5 v2 ON mergedto.contact_id = v2.entity_id WHERE c.is_deleted=1 AND v2.id IS NOT NULL AND ( (v.estimated_net_worth_144 IS NOT NULL AND v.estimated_net_worth_144 != '') AND (v2.estimated_net_worth_144 = '' OR v2.estimated_net_worth_144 IS NULL) OR ((v.income_range IS NOT NULL AND v.income_range != '') AND (v2.income_range IS NULL OR v2.income_range = '')) OR ((v.charitable_contributions_decile IS NOT NULL AND v.charitable_contributions_decile != '') AND (v2.charitable_contributions_decile IS NULL OR v2.charitable_contributions_decile = '')) OR ((v.disc_income_decile IS NOT NULL AND v.disc_income_decile != '') AND (v2.disc_income_decile IS NULL OR v2.disc_income_decile = '')) OR ((v.family_composition_173 IS NOT NULL AND v.family_composition_173 != '') AND (v2.family_composition_173 IS NULL OR v2.family_composition_173 = '')) OR ((v.voter_party IS NOT NULL AND v.voter_party != '') AND (v2.voter_party IS NULL OR v2.voter_party = '')) OR ((v.occupation_175 IS NOT NULL AND v.occupation_175 != '') AND (v2.occupation_175 IS NULL OR v2.occupation_175 = '')) ) ;
@NNichols @LeanneS I previously restored the lion's share of these but left it open as there were some trickier ones. On revisiting I see that there are
- 443 rows where the deleted contact has a prospect record and the remaining contact has no prospect record at all. This is the type of record we previously restored & we can fairly easily do so.
- 498 records which have prospect data but there is data on the deleted one that is not on the new one. It's hard with these to be sure whether it's appropriate to restore the data since in some cases a choice might have been made between a 'set' of data - e.g 493554 was merged to 334574 and the net worth went from L (high) to C (low) in the process so it probably doesn't make sense to recover their charitable decile.
I'm inclined to think that it doesn't make sense to record the 498 records in set 2 as there is a return on effort issue for what might not be great data. However, there are a small handful where the lost data says they were high-ish net worth individuals - maybe you want to manually check some or all of these. They are sorted by net_worth so by the time you get to K you are down to $1million
deleted_contact kept_contact_id lost_net_worth
147173 93272 N
248558 2291310 M
23214605 3440517 M
31740878 5621620 L
598171 210473 L
633155 1127867 L
5981874 2043342 L
9803561 15631782 L
10443509 1376192 L
2376682 27982838 L
29742092 3187832 L
14129499 2363058 L
30758007 13963971 L
16478090 1703054 L
17051388 3497599 L
18483 435975 L
29729759 14625804 L
9697452 rMW32809209463b L
776436 7977417 L
29741978 6058530 L
2841451 1004191 L
6638664 3192541 K
8838784 1640623 K
11042322 3192541 K
4473469 333347 K
7027529 31012245 K
23470010 28576891 K
796967 28883022 K
521850 26501897 K
18586976 13887660 K
23619321 28892656 K
30580954 2229229 K
@LeanneS @NNichols - I've restored what I think makes sense to restore - see https://phabricator.wikimedia.org/T232636#5818528 for what is not included in that