Page MenuHomePhabricator

One-time import of target smart data into IBM -- how to do this?
Closed, ResolvedPublic

Description

Once the Target Smart data is mapped to contact records in Civi, we need to get the data into IBM so we can use it for email segmentation. Since this data isn't going to change over time, it seems like we just need a one-time import. What's the best way for us to do this?

Please let me know if we should consult Trilogy on this question.

Event Timeline

CCogdill_WMF created this task.

Change 534250 had a related patch set uploaded (by Eileen; owner: Eileen):
[wikimedia/fundraising/tools@master] Add new Silverpop fields to export.

https://gerrit.wikimedia.org/r/534250

@CCogdill_WMF we took a look at this today

We are going to try pushing up the targetsmart data as is linked to the contact id we send to silverpop.

This means we will won't be 'deduping' this data within the silverpop script. I did a few spot checks and in most, but not all, cases where a target smart contact has a dupe we have data for both dupes. We can revisit this but our thinking is also to see how long this 'simple variant' takes because if it's not too long then we can keep it running & it will fill out gaps as deduping proceeds. I'm guessing there will be deduper prioritisation of certain categories of donors.

We have 2 questions

  1. if we just add a bunch of fields like 'z_gender' to the end can we just add them & then you can pick up the mapping after without breakage / tight co-ordination
  2. Do you want the targetsmart values for fields or the labels they map to - e.g targetsmart - this is basically a choice between the 'value' column & the 'label' column in this screenshot

Screen Shot 2019-09-04 at 12.09.57 PM.png (816×1 px, 149 KB)

Awesome, thank you, @Eileenmcnaughton! FYI @KHaggard, please follow this task and we'll talk about field mapping once it's done.

We are going to try pushing up the targetsmart data as is linked to the contact id we send to silverpop.

Sounds good! Just to be sure, can you include email address in the export as well? That's our unique ID for silverpop and is required for imports.

  1. if we just add a bunch of fields like 'z_gender' to the end can we just add them & then you can pick up the mapping after without breakage / tight co-ordination

Yep!

Do you want the targetsmart values for fields or the labels they map to - e.g targetsmart - this is basically a choice between the 'value' column & the 'label' column in this screenshot

Labels, please :)

Generally, how will this work? Will it just be added to our nightly export for one night, or is it going to be a separate file we should grab from FTP?

We are going to add it to the nightly export code. It will just be extra fields appended in that csv for however long we leave it going - so it would have email by default as such

Okay! Would be great to do this on the Friday morning UTC export since we
aren't sending emails on Friday, just in case something goes amiss with the
import.

Since it will be the regular export file, we will need to coordinate the
day before this goes live to put our automated import on pause. Otherwise,
it will grab the file from FTP overnight and import the rows it's looking
for and discard the file before we can map the new fields. Just give us 24
hours notice.

@CCogdill_WMF so what time does your import pick up the Thurs one - in UTC?

OK so when I start work on Friday the Thurs one will have been uploaded & I'll deploy & see how it goes during the day ready for 'overnight pickup'

Change 534548 had a related patch set uploaded (by Eileen; owner: Eileen):
[wikimedia/fundraising/crm@master] Cleanup field options in field code

https://gerrit.wikimedia.org/r/534548

Change 534250 merged by jenkins-bot:
[wikimedia/fundraising/tools@master] Add new Silverpop fields to export.

https://gerrit.wikimedia.org/r/534250

I've deployed this and am currently running the export files (without the upload) to make sure it runs.

@CCogdill_WMF @KHaggard - starting from the next upload (until we turn it off) these extra fields will be present.

I did some checks on how duplicates might affect it. Checking for contacts where there was a net worth in civi for that email but which did not make it to the final view returned only 604 contacts and I got 1104 when checking for the same on net worth. In spot checks of the 604 I found that two of the 4 I checked were duplicates and two were not (husband & wife sharing same email). Based on this I don't think we should do more in the script to handle this.

It makes sense to me to get DS or MG to do a deduping run on some of the contacts with new target smart data - I tried this dedupe

Screen Shot 2019-09-06 at 11.26.58 AM.png (346×2 px, 92 KB)

And it seemed worth dedupe attention
@Ejegg @MBeat33 @LeanneS

select count(*) FROM silverpop_export_view s LEFT JOIN civicrm.civicrm_email e ON e.email = s.email     LEFT JOIN civicrm_value_1_prospect_5 v ON v.entity_id = e.contact_id WHERE v.estimated_net_worth_144 != '' AND s.z_estimated_net_worth = '' OR s.z_estimated_net_worth IS NULL ;


+----------+
| count(*) |
+----------+
|      604 |
+----------+
1 row in set (5 min 4.43 sec)
 select count(*) FROM silverpop_export_view s LEFT JOIN civicrm.civicrm_email e ON e.email = s.email     LEFT JOIN civicrm_value_1_prospect_5 v ON v.entity_id = e.contact_id WHERE v.income_range != '' AND s.z_income_range = '' OR s.z_income_range IS NULL ;


+----------+
| count(*) |
+----------+
|     1144 |
+----------+
1 row in set (4 min 14.49 sec)

Great, thanks @Eileenmcnaughton! I just told IBM to skip today's import so
I think the file should be around for Katie and me to import later. We'll
update this task with how it goes!

Change 534548 merged by jenkins-bot:
[wikimedia/fundraising/crm@master] Cleanup field options in field code

https://gerrit.wikimedia.org/r/534548

Hi @Eileenmcnaughton! Yes, it worked :) Thanks for the work you put into this!

@KHaggard ok - we are leaving those extra fields in for now as we will be importing some new gender data so don't remove them at your end just yet

Thanks, Eileen! Let us know when that new data gets added. We didn't set up
those new fields to import on a recurring basis because we didn't want to
slow down import runtime, so we'll need to do another manual import to add
them.

Le mar. 10 sept. 2019 à 10:13 PM, Eileenmcnaughton <
no-reply@phabricator.wikimedia.org> a écrit :

Eileenmcnaughton added a comment. View Task
https://phabricator.wikimedia.org/T231538

@KHaggard https://phabricator.wikimedia.org/p/KHaggard/ ok - we are
leaving those extra fields in for now as we will be importing some new
gender data so don't remove them at your end just yet

*TASK DETAIL*
https://phabricator.wikimedia.org/T231538

*EMAIL PREFERENCES*
https://phabricator.wikimedia.org/settings/panel/emailpreferences/

*To: *Eileenmcnaughton
*Cc: *Ejegg, MBeat33, KHaggard, Ppena, Aklapper, LeanneS, CCogdill_WMF,
EBjune, DStrine

@CCogdill_WMF sure -I was pretty curious to see what the speed difference was at run time (at both ends) because there is also some deduping that would improve that data - although as I noted above it's not a huge number of affected contacts

@CCogdill_WMF we determined that it only takes 3 minutes longer at our end with these fields vs without so we are going to leave them in our nightly exports

@CCogdill_WMF the extra gender data is now in so will be available for you from tomorrow. As I mentioned we are going to leave the extra fields in on our end

Thanks, Eileen! We're going to set up a new recurring import tomorrow. I
think you can close this on your end.