Page MenuHomePhabricator

Investigation: Scope Segmentation in Civi & Acoustic
Closed, ResolvedPublic

Description

Fundraising would like to implement donor segmentation in Civi & Acoustic. This task is to scope this work and start tasking out.

"All donor records in Civi/Acoustic have segmentation applied to them for Fundraising teams to use. We'd like this work completed before the beginning of next fiscal."

https://docs.google.com/document/d/1zuIXhhqXOQRga0NEOYHpxJH3vP8venaMSFWWe3VAhrI/edit

Related Objects

Event Timeline

XenoRyet triaged this task as High priority.May 1 2023, 4:41 PM

Joseph Mando will have some queries to base this off of and can help with data validation.

Broadly the tasks fall into

  1. create new custom field/s - we could add to wmf_donor or a new table. Only downside to WMF donor is that we are up against index limits - probably it would be better to move some of the older WMF_donor fields to a different table - but that might be a pain. OTOH we are due to create new fields & segments soon. There is a question as to whether we can just create one field (because the sub segment field implicitly determines the segment field) or whether for searching we should have both fields (we don't need all 4 as the others are labels
  1. create triggers to populate these - hopefully we can do this as we do the wmf_donor triggers - if they aren't too insanely expensive. Failing that it might need to run as a job.
  1. backfill
  1. update Acoustic export update

However, there are some things that are confusing to me about the definitions - ie

  • Major Gifts probably == Major Donors defined as 'cumulative giving in calendar year of 10k+' - I've commented on the sheet for clarification because I don't understand how you can have given that much in the current calendar year and be lapsed so perhaps ANY calendar year. In which case maybe we want to have a field of 'highest donor level achieved' as a field - and if so how long does that last?
  • what happens at the end of the calendar year - does everyone get rolled back to 'just a donor'
  • how do 'control' and 'test' get set?
  • the definition around 'mid tier' looks kinda expensive performance-wise & might cause database locks

Another question is whether any of the existing wmf_donor fields can go or be de-indexed to support this (ie we have a limit of indexes on the table & also reducing size would help performance)

We currently have fields from 2006 with indexes from total_2015_2016 (total_2015_2016).

We might be able to split it - eg. pre 2021 and post 2021 with pre going into a different table - but I don't know if that would be more or less 'locky'

@JMando shared a related superset dashboard in today's office hours. Offered to collab/support/meet when we're ready.

Major Gifts probably == Major Donors defined as 'cumulative giving in calendar year of 10k+' - I've commented on the sheet for clarification because I don't understand how you can have given that much in the current calendar year and be lapsed so perhaps ANY calendar year. In which case maybe we want to have a field of 'highest donor level achieved' as a field - and if so how long does that last?

Hi @Eileenmcnaughton, please refer to the selection criteria tab in the excel doc. Column E has criteria for each segment, these are slightly different to what you see in the definitions, I'll update today so both align. We'd like to base each segment on the time from when we're running the script, so active donors would be anyone who has given 12 months prior to the date the script runs. Happy to chat through any of these as some might be a little confusing.

how do 'control' and 'test' get set?

I've actually removed the test groups in the selection criteria tab. I think this is going to complicate things and the team can pull out test segments when building campaigns.

the definition around 'mid tier' looks kinda expensive performance-wise & might cause database locks

I've removed the segments related to third party data for now, I think this will make things more complicated than necessary for now.

Let me know if you want to jump on a call to talk through anything else with Sheetal and I.

I'll take another look at the spreadsheet but I just want to articulate that I'm hoping we can use triggers to update one record at a time based on existing fields as donations come in rather than running a script over the whole database each 'night' - it's quite possible the script would take several hours to run if we went that way, possibly requiring downtime.

Using "We'd like to base each segment on the time from when we're running the script, so active donors would be anyone who has given 12 months prior to the date the script runs." as opposed to the exisiting calendar or fiscal totals would mean we have to go the script route.

I just did a timing to see how long it would take to update a value for every contact in the database on staging - this is with no calculation - just updating them all to be the same value & it took 30 minutes.

UPDATE wmf_donor SET donor_segment_id = 2;
Query OK, 37297618 rows affected (28 min 26.875 sec)
Rows matched: 37297618 Changed: 37297618 Warnings: 0

Tech notes - I added donor_segment_id on staging - I had to de-index another field to do so as we are at the limit of indexed fields -- 55m to index the field

ALTER TABLE wmf_donor ADD COLUMN donor_segment_id int, DROP INDEX(INDEX_total_2016), ADD INDEX INDEX_donor_segment_id ( donor_segment_id );
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near '(INDEX_total_2016), ADD INDEX INDEX_donor_segment_id ( donor_segment_id )' at line 1
MariaDB [dev_civicrm]> ALTER TABLE wmf_donor ADD COLUMN donor_segment_id int, DROP INDEX INDEX_total_2016, ADD INDEX INDEX_donor_segment_id ( donor_segment_id );

Query OK, 0 rows affected (55 min 20.262 sec)

Moving this to done since I think we are moving into sub-tasks for the 'doing'

XenoRyet set Final Story Points to 4.