Page MenuHomePhabricator

Estimate amount of not-yet Deduped Civi Records
Closed, ResolvedPublic

Description

As we talked about last week, can you estimate the amount of Civi records to be Dedupe?
This would be helpful to know, for scheduling contractors going forward.

Event Timeline

MBeat33 claimed this task.
MBeat33 added a subscriber: Eileenmcnaughton.

"answer is 800k+"

Thank you so much, @Eileenmcnaughton this is great information!

email:

Eileen McNaughton

12:00 AM (10 hours ago)

to me, fr-tech-failmail
Hi Michael,

I was looking to see how many dupes we have in the context of the silver pop script - & I was aware you were after that ego (it a phab I can’t find right now)
I grouped the below query by the year of their last donation date - answer is 800k+

Eileen

SELECT c1.id, min(c2.id) as lowest_contact_id_in_range, max(c2.id) as highest_contact_id_in_range, count(*), YEAR(w2.last_donation_date)

->                  FROM civicrm_email e1
->                           INNER JOIN civicrm_email e2 ON e2.id > e1.id
->                      AND e1.email = e2.email
->                      AND e1.is_primary = 1
->                      AND e2.is_primary = 1
->                           INNER JOIN civicrm_contact c1 ON c1.id = e1.contact_id AND c1.is_deleted = 0
->                           INNER JOIN civicrm_contact c2 ON c2.id = e2.contact_id AND c2.is_deleted = 0
->                           LEFT JOIN wmf_donor w2 ON c1.id = w2.entity_id
->                           LEFT JOIN civicrm_dedupe_exception de
->                                     ON de.contact_id1 = e2.contact_id and de.contact_id2 = e1.contact_id
->                           LEFT JOIN civicrm_dedupe_exception de2
->                                     ON de2.contact_id2 = e2.contact_id and de2.contact_id1 = e1.contact_id
->                  WHERE de.id IS NULL
->                    AND de2.id IS NULL
-> 
->                  GROUP BY YEAR(w2.last_donation_date);

+--------+----------------------------+-----------------------------+----------+-----------------------------+

idlowest_contact_id_in_rangehighest_contact_id_in_rangecount(*)YEAR(w2.last_donation_date)

+--------+----------------------------+-----------------------------+----------+-----------------------------+

98168148254247672112529NULL
571359692208969220812004
239242956341832638172006
1913826464341384637462007
1954431887824187952821952008
308106530264187856578062009
4456881709541881605180592010
1465111483942476357445222011
230565522442476087396532012
1408311621642476827749502013
18340213417424766081233552014
12924010586424766741266852015
22321310558424767961563682016
7086611076424767831238132017
15278112344424767811044972018
692283834942476819386192019
152921786234247682427652020

+--------+----------------------------+-----------------------------+----------+-----------------------------+