Page MenuHomePhabricator

Wmf-donor - new year fields
Closed, ResolvedPublic

Description

Last time we updated wmf_donor fields we removed everything before 2006 & created up to 2020-2021 but we are about to shoot past those fields (and I don't think we have calendar year 2021)

  • we should remove some older years when we do this - how many do we need to keep

https://civicrm.wikimedia.org/civicrm/admin/custom/group/field?reset=1&action=browse&gid=18

Related Objects

Event Timeline

Hi @Eileenmcnaughton - I was just asking about this today! I'll offer to write a commit for wmf_donor for this and submit it to you as a reviewer, if @DStrine is okay with that, as we use these fields all the time in Analytics. I would also like to propose adding a few additional fields that combine Endowment and Annual Fund totals, as this would cut down on the post-processing I do in our data cubes.

If you all are okay with me writing up a suggestion in line with how the current wmf_donor script is written and submitting it for review, I'd be happy to. I think it it would work with both team needs.

I thought this required a civi outage, right? It might be hard to schedule in the near term if this is so.

Just guessing - probably for the Civi UI update to ensure these fields are visible in the Civi UI, but for use in analytics and reporting, the update to the civicrm.wmf_donor table will get us where we need to be for most use cases, where we can work with this for the Japan campaign and ongoing for 99% of reporting

@EYener - there are a few steps involved - we have to load the triggers, create the fields, add to accoustic upload etc - so I won't be able to just +2 & deploy. But it definitely is a good time to propose additional fields if there are some you want & if that is easier to show us as a patch then go for it

Makes perfect sense, there are a lot of systems that these would have to go live in. I would just be interested in involvement up to the point of seeing new fields visible in civicrm.wmf_donor - we would use these as soon as they're available (my fault I didn't double check that FY2122 was available already).

I could propose a patch to the wmf_donor trigger file; is there more needed to surface the fields within the table itself?

@EYener that is a generated file - but in terms of communicating the extra fields you want it might be a good way to do it - since I'm sure that you want to speak sql to me

@Eileenmcnaughton and I talked about this today. In addition to the FY2122 rollup field, I would also be looking for:

  • foundation_total_fy1819
  • foundation_total_fy1920
  • foundation_total_fy2021
  • foundation_total_fy2122
  • foundation_largest_donation
  • foundation_number_donations
  • foundation_first_donation_date
  • foundation_first_donation_amount
  • foundation_last_donated
  • foundation_lifetime_usd_total
  • foundation_date_of_largest_donation
  • change_fy2021_fy2122 (or the equivalent change between FY2021 and FY2122 metric - AF only)
  • endowment_change_fy2021_fy2122 (or the equivalent change between FY2021 and FY2122 metric - Endowment only)
  • foundation_change_fy2021_fy2122 (or the equivalent change between FY2021 and FY2122 metric - combined funds)
  • foundation_change_fy1920_fy2021 (or the equivalent change between FY1920 and FY2021 metric - combined funds)
  • foundation_change_fy1819_fy1920 (or the equivalent change between FY1819 and FY1920 metric - combined funds)

I think that's it, fingers crossed!

@Dwisehaupt @Jgreen @Ejegg just flaggin that the desired scope of this is likely to have some impact on disk size (but hopefully not on performance )

Change 705777 had a related patch set uploaded (by Eileen; author: Eileen):

[wikimedia/fundraising/crm@master] Remove function to remove old fields

https://gerrit.wikimedia.org/r/705777

Change 705787 had a related patch set uploaded (by Eileen; author: Eileen):

[wikimedia/fundraising/crm@master] Remove code to install extensions

https://gerrit.wikimedia.org/r/705787

Change 705777 merged by jenkins-bot:

[wikimedia/fundraising/crm@master] Remove function to remove old fields

https://gerrit.wikimedia.org/r/705777

Change 705787 merged by jenkins-bot:

[wikimedia/fundraising/crm@master] Remove code to install extensions

https://gerrit.wikimedia.org/r/705787

Change 709582 had a related patch set uploaded (by Eileen; author: Eileen):

[wikimedia/fundraising/crm@master] Move trigger info to own class

https://gerrit.wikimedia.org/r/709582

Change 709583 had a related patch set uploaded (by Eileen; author: Eileen):

[wikimedia/fundraising/crm@master] Move wmf_donor fields to calculated class

https://gerrit.wikimedia.org/r/709583

Change 709627 had a related patch set uploaded (by Eileen; author: Eileen):

[wikimedia/fundraising/crm@master] Move constants for roll up years over

https://gerrit.wikimedia.org/r/709627

Change 709628 had a related patch set uploaded (by Eileen; author: Eileen):

[wikimedia/fundraising/crm@master] Add additional fields for totals

https://gerrit.wikimedia.org/r/709628

@EYener I want to query the importance of a couple of fields

foundation_first_donation_amount
foundation_date_of_largest_donation

  • You will note these fields don't have endowment only equivilents and the reason for that is they are expensive - ie you can do any number of MAX() pretty cheaply - but to get another value from the table when amount = MAX requires an extra join - not a very nice one at that. It's these extra joins that mean we can't calculate totals as we go for our most prolific donors.

The upshot is that those 2 specific fields are likely to have a performance impact which I don't expect from the others so I want to be sure you REALLY want them.

OK - it seems that we have exceeded the number of fields that can be indexed - here is the full list of fields & indexes excluding the in the above comment. We'll have to cut some fields and / or indexes

ALTER TABLE wmf_donor 
        ADD COLUMN `last_donation_date` datetime,
        ADD INDEX INDEX_last_donation_date ( last_donation_date ), 
        ADD COLUMN `endowment_last_donation_date` datetime,
        ADD INDEX INDEX_endowment_last_donation_date ( endowment_last_donation_date ), 
        ADD COLUMN `all_funds_last_donation_date` datetime,
        ADD INDEX INDEX_all_funds_last_donation_date ( all_funds_last_donation_date ), 
        ADD COLUMN `first_donation_date` datetime,
        ADD INDEX INDEX_first_donation_date ( first_donation_date ), 
        ADD COLUMN `endowment_first_donation_date` datetime,
        ADD INDEX INDEX_endowment_first_donation_date ( endowment_first_donation_date ), 
        ADD COLUMN `all_funds_first_donation_date` datetime,
        ADD INDEX INDEX_all_funds_first_donation_date ( all_funds_first_donation_date ), 
        ADD COLUMN `last_donation_currency` varchar(255),
        ADD INDEX INDEX_last_donation_currency ( last_donation_currency ), 
        ADD COLUMN `last_donation_amount` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_last_donation_amount ( last_donation_amount ), 
        ADD COLUMN `first_donation_usd` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_first_donation_usd ( first_donation_usd ), 
        ADD COLUMN `last_donation_usd` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_last_donation_usd ( last_donation_usd ), 
        ADD COLUMN `lifetime_usd_total` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_lifetime_usd_total ( lifetime_usd_total ), 
        ADD COLUMN `lifetime_including_endowment` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_lifetime_including_endowment ( lifetime_including_endowment ), 
        ADD COLUMN `endowment_lifetime_usd_total` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_endowment_lifetime_usd_total ( endowment_lifetime_usd_total ), 
        ADD COLUMN `number_donations` int DEFAULT '0',
        ADD INDEX INDEX_number_donations ( number_donations ), 
        ADD COLUMN `endowment_number_donations` int DEFAULT '0',
        ADD INDEX INDEX_endowment_number_donations ( endowment_number_donations ), 
        ADD COLUMN `all_funds_number_donations` int DEFAULT '0',
        ADD INDEX INDEX_all_funds_number_donations ( all_funds_number_donations ), 
        ADD COLUMN `largest_donation` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_largest_donation ( largest_donation ), 
        ADD COLUMN `endowment_largest_donation` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_endowment_largest_donation ( endowment_largest_donation ), 
        ADD COLUMN `all_funds_largest_donation` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_all_funds_largest_donation ( all_funds_largest_donation ), 
        ADD COLUMN `date_of_largest_donation` datetime,
        ADD INDEX INDEX_date_of_largest_donation ( date_of_largest_donation ), 
        ADD COLUMN `total_2006_2007` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_total_2006_2007 ( total_2006_2007 ), 
        ADD COLUMN `total_2006` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_total_2006 ( total_2006 ), 
        ADD COLUMN `total_2007_2008` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_total_2007_2008 ( total_2007_2008 ), 
        ADD COLUMN `total_2007` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_total_2007 ( total_2007 ), 
        ADD COLUMN `total_2008_2009` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_total_2008_2009 ( total_2008_2009 ), 
        ADD COLUMN `total_2008` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_total_2008 ( total_2008 ), 
        ADD COLUMN `total_2009_2010` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_total_2009_2010 ( total_2009_2010 ), 
        ADD COLUMN `total_2009` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_total_2009 ( total_2009 ), 
        ADD COLUMN `total_2010_2011` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_total_2010_2011 ( total_2010_2011 ), 
        ADD COLUMN `total_2010` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_total_2010 ( total_2010 ), 
        ADD COLUMN `total_2011_2012` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_total_2011_2012 ( total_2011_2012 ), 
        ADD COLUMN `total_2011` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_total_2011 ( total_2011 ), 
        ADD COLUMN `total_2012_2013` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_total_2012_2013 ( total_2012_2013 ), 
        ADD COLUMN `total_2012` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_total_2012 ( total_2012 ), 
        ADD COLUMN `total_2013_2014` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_total_2013_2014 ( total_2013_2014 ), 
        ADD COLUMN `total_2013` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_total_2013 ( total_2013 ), 
        ADD COLUMN `total_2014_2015` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_total_2014_2015 ( total_2014_2015 ), 
        ADD COLUMN `total_2014` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_total_2014 ( total_2014 ), 
        ADD COLUMN `total_2015_2016` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_total_2015_2016 ( total_2015_2016 ), 
        ADD COLUMN `total_2015` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_total_2015 ( total_2015 ), 
        ADD COLUMN `total_2016_2017` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_total_2016_2017 ( total_2016_2017 ), 
        ADD COLUMN `total_2016` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_total_2016 ( total_2016 ), 
        ADD COLUMN `total_2017_2018` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_total_2017_2018 ( total_2017_2018 ), 
        ADD COLUMN `total_2017` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_total_2017 ( total_2017 ), 
        ADD COLUMN `change_2017_2018` double DEFAULT '0',
        ADD INDEX INDEX_change_2017_2018 ( change_2017_2018 ), 
        ADD COLUMN `total_2018_2019` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_total_2018_2019 ( total_2018_2019 ), 
        ADD COLUMN `total_2018` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_total_2018 ( total_2018 ), 
        ADD COLUMN `endowment_total_2018` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_endowment_total_2018 ( endowment_total_2018 ), 
        ADD COLUMN `endowment_total_2018_2019` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_endowment_total_2018_2019 ( endowment_total_2018_2019 ), 
        ADD COLUMN `all_funds_total_2018_2019` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_all_funds_total_2018_2019 ( all_funds_total_2018_2019 ), 
        ADD COLUMN `all_funds_change_2018_2019` double DEFAULT '0',
        ADD INDEX INDEX_all_funds_change_2018_2019 ( all_funds_change_2018_2019 ), 
        ADD COLUMN `change_2018_2019` double DEFAULT '0',
        ADD INDEX INDEX_change_2018_2019 ( change_2018_2019 ), 
        ADD COLUMN `total_2019_2020` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_total_2019_2020 ( total_2019_2020 ), 
        ADD COLUMN `total_2019` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_total_2019 ( total_2019 ), 
        ADD COLUMN `endowment_total_2019` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_endowment_total_2019 ( endowment_total_2019 ), 
        ADD COLUMN `endowment_total_2019_2020` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_endowment_total_2019_2020 ( endowment_total_2019_2020 ), 
        ADD COLUMN `all_funds_total_2019_2020` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_all_funds_total_2019_2020 ( all_funds_total_2019_2020 ), 
        ADD COLUMN `all_funds_change_2019_2020` double DEFAULT '0',
        ADD INDEX INDEX_all_funds_change_2019_2020 ( all_funds_change_2019_2020 ), 
        ADD COLUMN `change_2019_2020` double DEFAULT '0',
        ADD INDEX INDEX_change_2019_2020 ( change_2019_2020 ), 
        ADD COLUMN `total_2020_2021` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_total_2020_2021 ( total_2020_2021 ), 
        ADD COLUMN `total_2020` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_total_2020 ( total_2020 ), 
        ADD COLUMN `endowment_total_2020` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_endowment_total_2020 ( endowment_total_2020 ), 
        ADD COLUMN `endowment_total_2020_2021` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_endowment_total_2020_2021 ( endowment_total_2020_2021 ), 
        ADD COLUMN `all_funds_total_2020_2021` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_all_funds_total_2020_2021 ( all_funds_total_2020_2021 ), 
        ADD COLUMN `all_funds_change_2020_2021` double DEFAULT '0',
        ADD INDEX INDEX_all_funds_change_2020_2021 ( all_funds_change_2020_2021 ), 
        ADD COLUMN `endowment_funds_change_2020_2021` double DEFAULT '0',
        ADD INDEX INDEX_endowment_funds_change_2020_2021 ( endowment_funds_change_2020_2021 ), 
        ADD COLUMN `change_2020_2021` double DEFAULT '0',
        ADD INDEX INDEX_change_2020_2021 ( change_2020_2021 ), 
        ADD COLUMN `total_2021_2022` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_total_2021_2022 ( total_2021_2022 ), 
        ADD COLUMN `total_2021` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_total_2021 ( total_2021 ), 
        ADD COLUMN `endowment_total_2021` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_endowment_total_2021 ( endowment_total_2021 ), 
        ADD COLUMN `endowment_total_2021_2022` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_endowment_total_2021_2022 ( endowment_total_2021_2022 ), 
        ADD COLUMN `all_funds_total_2021_2022` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_all_funds_total_2021_2022 ( all_funds_total_2021_2022 ), 
        ADD COLUMN `all_funds_change_2021_2022` double DEFAULT '0',
        ADD INDEX INDEX_all_funds_change_2021_2022 ( all_funds_change_2021_2022 ), 
        ADD COLUMN `endowment_funds_change_2021_2022` double DEFAULT '0',
        ADD INDEX INDEX_endowment_funds_change_2021_2022 ( endowment_funds_change_2021_2022 ), 
        ADD COLUMN `change_2021_2022` double DEFAULT '0',
        ADD INDEX INDEX_change_2021_2022 ( change_2021_2022 ), 
        ADD COLUMN `total_2022_2023` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_total_2022_2023 ( total_2022_2023 ), 
        ADD COLUMN `total_2022` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_total_2022 ( total_2022 ), 
        ADD COLUMN `endowment_total_2022` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_endowment_total_2022 ( endowment_total_2022 ), 
        ADD COLUMN `endowment_total_2022_2023` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_endowment_total_2022_2023 ( endowment_total_2022_2023 ), 
        ADD COLUMN `all_funds_total_2022_2023` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_all_funds_total_2022_2023 ( all_funds_total_2022_2023 ), 
        ADD COLUMN `all_funds_change_2022_2023` double DEFAULT '0',
        ADD INDEX INDEX_all_funds_change_2022_2023 ( all_funds_change_2022_2023 ), 
        ADD COLUMN `endowment_funds_change_2022_2023` double DEFAULT '0',
        ADD INDEX INDEX_endowment_funds_change_2022_2023 ( endowment_funds_change_2022_2023 ), 
        ADD COLUMN `change_2022_2023` double DEFAULT '0',
        ADD INDEX INDEX_change_2022_2023 ( change_2022_2023 ), 
        ADD COLUMN `total_2023_2024` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_total_2023_2024 ( total_2023_2024 ), 
        ADD COLUMN `total_2023` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_total_2023 ( total_2023 ), 
        ADD COLUMN `endowment_total_2023` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_endowment_total_2023 ( endowment_total_2023 ), 
        ADD COLUMN `endowment_total_2023_2024` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_endowment_total_2023_2024 ( endowment_total_2023_2024 ), 
        ADD COLUMN `all_funds_total_2023_2024` decimal(20,2) DEFAULT '0',
        ADD INDEX INDEX_all_funds_total_2023_2024 ( all_funds_total_2023_2024 ), 
        ADD COLUMN `all_funds_change_2023_2024` double DEFAULT '0',
        ADD INDEX INDEX_all_funds_change_2023_2024 ( all_funds_change_2023_2024 ), 
        ADD COLUMN `endowment_funds_change_2023_2024` double DEFAULT '0',
        ADD INDEX INDEX_endowment_funds_change_2023_2024 ( endowment_funds_change_2023_2024 ), 
        ADD COLUMN `change_2023_2024` double DEFAULT '0',
        ADD INDEX INDEX_change_2023_2024 ( change_2023_2024 ) [nativecode=1069 ** Too many keys specified; max 64 keys allowed]

Feels like year total and y-o-y change fields should go back a maximum of 10 years.

Change 710135 had a related patch set uploaded (by Eileen; author: Eileen):

[wikimedia/fundraising/crm@master] Trigger update

https://gerrit.wikimedia.org/r/710135

A bunch of us talked in the analytics meeting today. These totals are useful in superset and in civi. I know this is starting to hit upper technical limits in various ways. A bunch of us were wondering if this all needs to be split into more tables and if that solves things. But in general all of this takes work and our windows for downtime are going to start diminishing. So we kind of need to make a plan.

@Eileenmcnaughton I'll be working Monday. Let's chat. We might need more meetings Tuesday or later to make a plan soon.

Change 710135 merged by jenkins-bot:

[wikimedia/fundraising/crm@master] Trigger update

https://gerrit.wikimedia.org/r/710135

Change 711020 had a related patch set uploaded (by Eileen; author: Eileen):

[wikimedia/fundraising/crm@master] Extract code to get the sql to do an update

https://gerrit.wikimedia.org/r/711020

Change 711024 had a related patch set uploaded (by Eileen; author: Eileen):

[wikimedia/fundraising/crm@master] Add api to fill wmf_donor table for 2021 donations

https://gerrit.wikimedia.org/r/711024

Change 709582 merged by jenkins-bot:

[wikimedia/fundraising/crm@master] Move trigger info to own class

https://gerrit.wikimedia.org/r/709582

Change 709583 merged by jenkins-bot:

[wikimedia/fundraising/crm@master] Move wmf_donor fields to calculated class

https://gerrit.wikimedia.org/r/709583

Change 709627 merged by jenkins-bot:

[wikimedia/fundraising/crm@master] Move constants for roll up years over

https://gerrit.wikimedia.org/r/709627

Change 711020 merged by jenkins-bot:

[wikimedia/fundraising/crm@master] Extract code to get the sql to do an update

https://gerrit.wikimedia.org/r/711020

Change 711246 had a related patch set uploaded (by Eileen; author: Eileen):

[wikimedia/fundraising/tools@master] Add annual total fields to everywhere except the final view

https://gerrit.wikimedia.org/r/711246

Change 711024 merged by jenkins-bot:

[wikimedia/fundraising/crm@master] Add api to fill wmf_donor table for 2021 donations

https://gerrit.wikimedia.org/r/711024

Change 711246 merged by jenkins-bot:

[wikimedia/fundraising/tools@master] Add annual total fields to everywhere except the final view

https://gerrit.wikimedia.org/r/711246

I've put this in done as the original task is complete now - however, additional requests came up while working through it and I have added a new task to cover those

Change 709628 abandoned by Eileen:

[wikimedia/fundraising/crm@master] Add additional fields for totals

Reason:

replaced

https://gerrit.wikimedia.org/r/709628