Page MenuHomePhabricator

Delete contacts deleted before 2017
Closed, ResolvedPublic2 Estimated Story Points

Description

Our merged contacts still have rows in the DB with 'is_deleted' and in other related tables - agree to & start deleting

There are 17 million deleted contacts in our database and 23 million not deleted records so deleted records likely make up half or our contact table, and wmf_donor table and make ‘some sort of dent’ in our activity_contact table. Note deleting these contacts will add rows to the log tables although the lesser indexing in the log tables means it’s not one for one. We would need to do some tests. The biggest chunk of deleted contacts is from 2016 (6 million or more than ⅓) so we could start by removing them and see how it affects other table sizes before deciding how far to go down this path. I believe this will improve our query speed as I suspect the poor indexing choices may be improved.

Event Timeline

Change 573424 had a related patch set uploaded (by Eileen; owner: Eileen):
[wikimedia/fundraising/crm@master] Add drush command for deleting deleted contacts

https://gerrit.wikimedia.org/r/573424

Only other thing to keep in mind would be issues with deleting the old emails from the log table which we send to the unsubscribe list.

I'm pretty sure that's NOT going to cause us problems, as @CCogdill_WMF has described the unsubscribe list as 'sticky', i.e. they'll stay on it once we put them there.

@Ejegg - yes and the emails should have been copied over to the merged-to contact unless they duplicated

Change 573424 merged by jenkins-bot:
[wikimedia/fundraising/crm@master] Add drush command for deleting deleted contacts

https://gerrit.wikimedia.org/r/573424

OK - I just started it - here are before stats

select is_deleted, count(*) FROM civicrm_contact GROUP BY is_deleted WITH ROLLUP;
+------------+----------+

is_deletedcount(*)

+------------+----------+

023762895
117320266
NULL41083161

+------------+----------+
3 rows in set (41.24 sec)

MariaDB [civicrm]> select count(*) FROM log_civicrm_contact;
+-----------+

count(*)

+-----------+

112635677

+-----------+
1 row in set (37.14 sec)

Eileenmcnaughton renamed this task from Delete contacts deleted more than x months / years ago to Delete contacts deleted before 2017.Mar 3 2020, 9:09 PM
Eileenmcnaughton reopened this task as Open.

I turned this off at 17:36 UTC as it was creating deadlocks after the Italy banners were turned on.

Turning job off now

Deleted contacts now:

1NULL23009743199
1201728413303260665
12018330514621614440
12019359866919621411
1202011004161233260
1NULL108478611233260
NULLNULL345126541233260

+------------+---------------------+----------+----------+

Next task - decide how many more to delete....

Eileenmcnaughton set the point value for this task to 2.
Eileenmcnaughton set Final Story Points to 2.