
Investigate: Add more silverpop data to Civi (query names, query criteria, segment data)
Closed, Resolved · Public · 0 Story Points


We currently pull down a bunch of mailing data from IBM, but not the details about what list we sent to or what criteria was used to build that list. We're going to do a lot more segmentation work this FY and IBM does not make this stuff easy to track. If we were to take our IBM query data that is associated with each mailing and import it into Civi, we could run mysql queries in our own databases and more easily track segments over time.

You can view sample queries in the IBM UI by going to Data > Queries and clicking into one of the queries you see there. Example criteria for this query:

Query name:
20190614_GreatBritain(GB)_English(en)_optinNULL_2017donor-2018opener_Email2-PhotoPageTest - Segment 4

Query criteria:
( (country is equal to GB
AND IsoLang is equal to en
AND latest_donation_date year is equal to 2017
AND latest_optin_response is not equal to one of the following (YES | NO))
AND LastOpenDate is between Jan 1, 2018 and Jun 30, 2019
AND highest_usd_amount is less than 500)
AND Segment is equal to 18

Ideally, we would save:

  1. Contact Source name (usually a Query or Contact List)
  2. All the query criteria, maybe just as a single text field, and just use a bunch of LIKE '%X%' statements when we need to query it?
  3. Save the Segment value as its own field. This is a nice-to-have.
  4. Maintain this at a per-mailing level, to go along with data in the civicrm_mailing_provider_data table
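
As a rough sketch of point 2, the "single text field plus LIKE statements" idea could look like this. The table and column names below are invented for illustration (the real schema would live alongside civicrm_mailing_provider_data in Civi's MySQL database); sqlite3 is used here only so the sketch is self-contained:

```python
import sqlite3

# Hypothetical schema: one row per mailing, with the raw query criteria
# stored as a single text column and the Segment value broken out.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE mailing_query (
        mailing_id     INTEGER PRIMARY KEY,
        query_name     TEXT,
        query_criteria TEXT,    -- raw criteria string, stored verbatim
        segment        INTEGER  -- nice-to-have: Segment as its own field
    )
""")
conn.execute(
    "INSERT INTO mailing_query VALUES (?, ?, ?, ?)",
    (
        101,
        "20190614_GreatBritain(GB)_English(en)_optinNULL - Segment 4",
        "country is equal to GB AND IsoLang is equal to en "
        "AND latest_donation_date year is equal to 2017",
        18,
    ),
)

# "A bunch of LIKE %X% statements": find every mailing whose criteria
# mention GB recipients from the 2017 donor cohort.
rows = conn.execute(
    "SELECT mailing_id, segment FROM mailing_query "
    "WHERE query_criteria LIKE '%country is equal to GB%' "
    "AND query_criteria LIKE '%2017%'"
).fetchall()
print(rows)  # → [(101, 18)]
```

The trade-off is the one the description hints at: LIKE scans are crude and can't be indexed well, but they require no parsing of IBM's criteria syntax up front.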

Because this segmentation work is a priority this FY, I'd love to see if this is something we can work on in Q1-Q2. It only dawned on me recently that this was a possibility, but it would make our work so much cleaner. Let me know if there's any clarification I can provide, and thanks!

Event Timeline

CCogdill_WMF triaged this task as High priority. Jul 5 2019, 9:27 PM
CCogdill_WMF created this task.
Restricted Application added a subscriber: Aklapper. Jul 5 2019, 9:27 PM
CCogdill_WMF renamed this task from Add more silverpop data to Civi (query names, segment data) to Add more silverpop data to Civi (query names, query criteria, segment data). Jul 5 2019, 9:28 PM
CCogdill_WMF updated the task description.
DStrine renamed this task from Add more silverpop data to Civi (query names, query criteria, segment data) to Investigate: Add more silverpop data to Civi (query names, query criteria, segment data). Jul 8 2019, 7:45 PM

@CCogdill_WMF I'm digging into this, but as I look through it's not terribly clear to me which of the APIs will retrieve the data you want. I can try pulling the results from the various API calls and looking, but I think it would be quicker to ask Brian - is he still the right person to check with?

@CCogdill_WMF @DStrine

I've looked through the documentation for the REST API and the XML API.

Unfortunately, the only additional data I could see how to retrieve is the list name.

Perhaps Brian can suggest something I couldn't find in their documentation? Failing that, I can see how to CREATE a query, so we could go down the path of creating queries in Civi and pushing them to Silverpop if we have to.

I would also note that the Silverpop (Watson, or whatever they call it now) API seems to have different 'generations'. The XML API seems to be the older generation and the REST API the newer one, and I'm unclear whether they are still developing the REST API or have moved on again.
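
The "create queries in Civi and push them to Silverpop" path could be sketched roughly as below. The Envelope/Body wrapper is the general shape of Silverpop's XML API requests, but the CreateQuery operation name and its child tags are assumptions for illustration only, not verified calls from their documentation:

```python
import xml.etree.ElementTree as ET

def build_envelope(action: str, params: dict) -> str:
    """Wrap an action in the Envelope/Body structure the Silverpop
    XML API expects. Action and parameter tag names passed in here
    are the caller's responsibility to verify against the docs."""
    envelope = ET.Element("Envelope")
    body = ET.SubElement(envelope, "Body")
    op = ET.SubElement(body, action)
    for tag, value in params.items():
        ET.SubElement(op, tag).text = value
    return ET.tostring(envelope, encoding="unicode")

# Hypothetical "push a Civi-built query to Silverpop" request; the
# operation name and tags below are illustrative, not real API calls.
request = build_envelope("CreateQuery", {
    "QUERY_NAME": "20190614_GB_en_segment4",
    "CRITERIA": "country is equal to GB AND IsoLang is equal to en",
})
print(request)
```

The upside of this direction is that Civi would hold the canonical criteria from the start, so nothing needs to be scraped back out of IBM's UI afterwards.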

Hi @Eileenmcnaughton, thanks for investigating! I asked Brian and Eric at Trilogy and they confirmed the API can't do this for us :( Their response:

*this will need to be scraped from the UI. I don’t see anything in the API docs that stands out as a way to do this.*

*For example, on this Query page you could look in the #newQueryEnglish element on the page for the description of the … You will know the list ID from the info you get back about the sent mailing.*

Would it help to get on a call with them?

Yeah "scraping the UI" is not a scalable option. If we could get on a call, that might help.

Okay, starting a thread with David+Eileen+Trilogy.

@CCogdill_WMF what do you mean by "Contact Source name (usually a Query or Contact List)"?

IBM just likes to give its list types confusing names (IMO). The two list types we send to are Queries and Contact Lists. Not sure if it would help if I walked someone through the UI to see what I mean, but happy to do that.

DStrine added a subscriber: Ejegg. Tue, Jul 23, 3:53 PM

@CCogdill_WMF a bunch of us looked at this and there are two options, each with drawbacks.

1. Scraping the data could be less complicated to set up, but could break easily and often.

2. @Eileenmcnaughton and @Ejegg can describe this better: we set something up where you create queries in Civi and push them to IBM. We need workflow feedback from you on this. It's more complicated to set up in the beginning, but will probably be more stable and reliable in the long run.

@Eileenmcnaughton and @Ejegg can describe #2 in more detail here.

from Brian

We log in with a POST to the login URL, full code attached. I think on pod4 you might log in to … instead.

Then we get mailing summary data from "action=displayHtmlBody&mailingId=XXXXX" and "action=mailingSummary&mailingId=XXXXX"

You can get the Query criteria from … and pull what’s in the newQueryEnglish span, and format it however you need.

// Requires: System, System.Collections.Generic, System.Net,
// System.Net.Http, System.Text.RegularExpressions, System.Threading.Tasks
private async Task<HttpClient> GetAuthenticatedWebClientAsync()
{
    // Share a cookie container so the session cookie from login
    // is sent on subsequent scraping requests.
    var httpClientHandler = new HttpClientHandler { UseCookies = true };
    var cookieContainer = new CookieContainer();
    httpClientHandler.CookieContainer = cookieContainer;
    var httpClient = new HttpClient(httpClientHandler);
    httpClient.DefaultRequestHeaders.Add("User-Agent", "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.110 Safari/537.36");

    // Load the login page and extract a few key values
    var loginGetResponse = await httpClient.GetAsync(this.WebLoginUri);
    string loginHtml = await loginGetResponse.Content.ReadAsStringAsync();
    var ltRegex = new Regex(@"name=""lt"" value=""(?<ltValue>.*?)""", RegexOptions.IgnoreCase);
    var executionRegex = new Regex(@"name=""execution""\s+value=""(?<executionValue>.*?)""", RegexOptions.IgnoreCase);
    string lt = ltRegex.Match(loginHtml).Groups["ltValue"].Value;
    string execution = executionRegex.Match(loginHtml).Groups["executionValue"].Value;

    // Post the username and password to the action url to log in the user
    var loginRequest = new HttpRequestMessage(HttpMethod.Post, this.WebLoginUri);
    var formValues = new Dictionary<string, string>();
    formValues.Add("username", this.tenant.Username);
    formValues.Add("password", this.tenant.Password);
    formValues.Add("lt", lt);
    formValues.Add("execution", execution);
    formValues.Add("_eventId", "submit");
    loginRequest.Content = new FormUrlEncodedContent(formValues);

    var loginPostResponse = await httpClient.SendAsync(loginRequest);
    var loginPostHtml = await loginPostResponse.Content.ReadAsStringAsync();

    // If the response still shows the login form, authentication failed
    if (loginPostHtml.Contains("loginForm"))
        throw new Exception("Could not log in via web interface");

    return httpClient;
}
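
The step after login, pulling the criteria out of the newQueryEnglish span Brian mentions, could be sketched like this in Python. The sample HTML is invented for illustration; the real page layout may differ:

```python
import re

# Stand-in for the HTML of a Query page fetched with the
# authenticated session; only the span id comes from Brian's notes.
sample_html = """
<html><body>
<span id="newQueryEnglish">( (country is equal to GB
AND IsoLang is equal to en) AND Segment is equal to 18</span>
</body></html>
"""

# Non-greedy match on the span body; DOTALL because the criteria
# text wraps across multiple lines.
match = re.search(
    r'<span id="newQueryEnglish">(?P<criteria>.*?)</span>',
    sample_html,
    re.DOTALL,
)
criteria = match.group("criteria").strip()
print(criteria)
```

A real implementation would likely use an HTML parser rather than a regex, since this is exactly the "breaks easily and often" risk flagged above: any change to IBM's markup silently breaks the extraction.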

We chose a direction. We will be scraping the html to start. The implementation task is here: T230509

DStrine closed this task as Resolved. Wed, Aug 14, 7:27 PM