That would be great @Dwisehaupt. I think MAX_STATEMENT_TIME is a good fit for the research user (maybe 65 seconds, just to be safe?). I just want to be sure: is there a way to check that this user is isolated to Superset queries? I'm almost sure it is, but want to double-check. If yes, this would be a great solution.
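In case it helps, here is a sketch of how that limit could be set, assuming frdb runs MariaDB 10.1+ (stock MySQL uses max_execution_time in milliseconds instead, and lacks the per-account GRANT option). The 'research'@'%' account name below is a placeholder:

```sql
-- Sketch only: 'research'@'%' is a hypothetical account; verify the real one.
-- Per-account limit, applied to every statement the account runs:
GRANT USAGE ON *.* TO 'research'@'%' WITH MAX_STATEMENT_TIME 65;

-- Session-scoped equivalent, useful for testing the behavior first:
SET SESSION max_statement_time = 65;  -- seconds; 0 disables the limit
```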
Jan 19 2021
Jan 16 2021
Jan 14 2021
Jan 12 2021
Really cool @Eileenmcnaughton & @DStrine. Sorry for belaboring the point... is this 450-day limit related to the created date, or to the date of the action taken? I can see it going either way and just want to make sure I understand. With the "if a mailing happened 460 days ago, but the open action happened yesterday" example, it seems like Acoustic would not store that record because the storage limit is tied to the create date of 460 days ago - but I want to check.
Awesome, thank you @Eileenmcnaughton!
Jan 11 2021
Jan 7 2021
Hi all! CCing @AndyRussG - do we in Fundraising need IP data for either CentralNoticeImpressions or CentralNoticeBannerHistory? Having the geodata at the state/territory level is important for us for both reporting and diagnostic reasons, but the IP level is perhaps more micro than we need.
Jan 4 2021
@Jgreen would we be able to do this with a private repo?
Dec 21 2020
Dec 18 2020
Dec 11 2020
Sounds good, @KHaggard. What segment was it? We can put a note on any of our reports that this is still pending.
That would be a better question for @KHaggard and team to prioritize in terms of email tasks. I'm not sure how large this send is or when the Email team would want this data back for analysis.
Dec 9 2020
Hey FR-Tech team! I'm not seeing this in the database when I search for it, so I think it might be an import issue from Acoustic.
Dec 4 2020
@DStrine Here is most of what you'd want to know about these recurring endow donors:
select *
from analytics.all_donations_cube
where contribution_recur_id is not null
  and financial_type = 'Endowment Gift';
Dec 2 2020
@DStrine Any chance we can chat about this during Civi Fortnightly? I'd also like to learn more / hear some details.
Oct 28 2020
After speaking with Melanie and @LeanneS, I will be setting Melanie up for Superset access as well.
Thanks @MNoorWMF! So it sounds like we should be looking at the enUS segments only. Has this been for Email2 only, or should we go back to Email1?
Oct 27 2020
Tagging @RMurthy to assist
Sure thing, we can help with this. Thanks for the ping @Eileenmcnaughton.
Oct 15 2020
@RLewis I've added a Superset account for Sheyna as well.
Sep 9 2020
Sep 4 2020
@KHaggard and I discussed this last week (or so) and I wanted to post some highlights here.
Subscribing @spatton for awareness, as there was discussion of RML reporting on a call yesterday.
Sep 3 2020
To add context, the parent group I am looking at is group_id 701 from civicrm.civicrm_group_contact. It has child groups 685,704,723,724. All members of group 701 seem to be 'deleted'.
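For anyone reproducing this, here is a sketch of the check, assuming the standard CiviCRM schema (civicrm_contact.is_deleted and civicrm_group_contact.status); column names may differ in this install:

```sql
-- Count members of the parent and child groups, split by deleted status.
select gc.group_id,
       c.is_deleted,
       count(*) as contacts
from civicrm.civicrm_group_contact gc
join civicrm.civicrm_contact c
  on c.id = gc.contact_id
where gc.group_id in (701, 685, 704, 723, 724)
group by gc.group_id, c.is_deleted;
```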
Aug 26 2020
Cool! The suggested uniqueness check sounds good and beneficial to me. I would like to double-check that it captures the first event occurrence.
Aug 25 2020
Thank you @elukey, I can!
Aug 14 2020
Hi all! I found this task because I am experiencing the same problem for the Netherlands banners. As @Pcoombe indicated in the last comment (back in '17!), this seems to happen with long modifiers. For example, the B2021_0708_nlNL_dsk_p1_lg_txt_twin1_optIn1.no-LP.rtbt.rtbt_ideal utm_source does not join to drupal.contribution_tracking, while B2021_0708_enNL_m_p1_lg_txt_twin1_optIn1.no-LP.paypal does.
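A quick way to check whether the long variant ever lands in the tracking table at all (this assumes the standard drupal.contribution_tracking.utm_source column):

```sql
-- If the long nlNL .rtbt.rtbt_ideal variant returns no rows while the
-- enNL variant does, that is consistent with the long modifier being
-- truncated or rewritten before it reaches contribution_tracking.
select utm_source, count(*) as rows_seen
from drupal.contribution_tracking
where utm_source like 'B2021_0708_nlNL%'
   or utm_source like 'B2021_0708_enNL%'
group by utm_source;
```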
Aug 11 2020
On its last run, this query took 42.7 seconds to execute in Superset:
Aug 5 2020
Aug 3 2020
I might need more context here to give input. When does a failed-recur email get sent out? What information might we want to report on and analyze? I.e., would this affect multiple mediums/campaigns, and would we want that level of granularity?
Jul 24 2020
Hi @Dwisehaupt I just added the locking scripts to the _insert.py files, tested, and they ran just fine, so I merged your commit. I think we're good to close this for now and open new tasks as needed. Thanks again!
Jul 20 2020
Jul 13 2020
@Jgreen I'd like to set up more alerting on frdb1003 before we make this change, so that my data cube testing and runs don't cause the server to lag behind master. Any ideas you have on this - or on how I can leverage existing alerting - would be great!
Jul 9 2020
Added new cube (impressions_hourly) that runs once per week on Thursdays
Jul 8 2020
@Dwisehaupt logging has been added to:
Jun 30 2020
Hi @Eileenmcnaughton I just want to confirm that after the new silverpop job runs, the tables in the silverpop DB will be overwritten with the new fields, correct? Or will the new job write to a separate db / table structure?
Jun 24 2020
Jun 18 2020
Hi @Dwisehaupt how about 9PM UTC / 5PM Eastern / 2PM Pacific on Monday? Alternatively, the same time on Tuesday or Thursday would be preferable.
Good question - let me check and get back to you. Thanks @Dwisehaupt
Great, thanks! I will mark this as resolved (I think I can do that...)
The table above is in frdb1003: analytics_ad_hoc.targetsmart_export_june_2020. However, I ended up running a tunnel through R, creating a query, and writing that query output to a CSV on the file server via RStudio. Thanks to @Dwisehaupt for the help figuring out how to connect to a volume from a local application!
Hi @Eileenmcnaughton - I'm curious which of these fields you're referring to? What complication do you mean? Asking because I'm also working with similar metrics and would like to brainstorm!
Jun 16 2020
Thank you! It's working great now.
Hmm. @Jgreen , I'm not seeing it there. I added a table, so the DB is not empty, if that helps.
Jun 15 2020
@LeanneS is able to access the list through Superset. I did an update to make the export fields more human-readable (see below). I'll do another glance through tomorrow to make sure I'm not missing anything.
Awesome! Two in one. Thanks!
Jun 10 2020
Jun 5 2020
I chatted with @LeanneS about it and it seems that keeping all communication preferences is okay for now (please correct me if this changes!)
Jun 4 2020
Adding query for transparency and any future needs:
Jun 1 2020
@Jgreen could you also add the time module, please?
May 28 2020
Hi @Eileenmcnaughton - I ran the CIDs (both deleted and kept) above through activities yesterday as a test, and I'm not getting the expected match.
May 27 2020
Hi @Jgreen I actually found a workaround for current-day reporting (within the UI) by searching Stack Overflow. The time range supports selecting "today : tomorrow" for the current day, and I believe there are other workarounds within the UI for month-to-date and year-to-date. It's not intuitive from the selections, but I don't believe this requires a code-level change - just more self-education!
May 21 2020
To add context and background: it is often important in Fundraising to report on impression rate (impressions / pageviews) for a particular campaign, to ensure there are no technical issues during campaigns. Viewing this metric for a live campaign alongside other health metrics in a Fundraising dashboard would be of great value to our Creative team. @AndyRussG had a great example of monitoring impression rates by country during our last Big English - and we were just discussing this use case today - so I will let him discuss in more detail.
May 18 2020
May 13 2020
I've deleted my notebooks as well. Thank you!
May 8 2020
May 7 2020
May 6 2020
@Jgreen pointed out that we should be able to connect fr-superset to Hive without pulling data into Fundraising from the data lake itself - this might be a great option.
@Jgreen what do you think of adding the new ChangeOwnPassword role to Beta and Gamma as well? I can see all new users needing to change their own passwords - it might be good to do this ahead of needing it.
That cadence works for me - thanks!
Mar 23 2020
Hi @elukey! Apologies - yes, notebook1003. This seems to have fixed the problem!
Mar 18 2020
@Jgreen we can close this as resolved. At least one data cube is running both inserts and updates on a schedule in the new ecosystem now.
My mistake - removed from the list. Thank you!
Very similar, with a few additions, to T236750.
Mar 17 2020
Mar 13 2020
Thank you, @Nuria! It works seamlessly.
Hi @Nuria and all, we're ready to try a 'mock' data set as well. Can someone point me toward instructions on accessing and utilizing the staging environment so that I can get started with the upload? Thank you!
Mar 12 2020
Mar 5 2020
Removed the test cube from cron and set up a v1 production cube on a 10-minute schedule for inserts only. Next, I'll be working on updates across the whole cube, which will be more resource-intensive.
Feb 25 2020
It would be good to touch base, @Milimetric - I'll find time on your calendar for later next week.
Feb 24 2020
@Milimetric Likewise excited for collaboration! I agree that visualization is the final piece of this puzzle. In parallel to discussing a front end tool, I have been testing/working on creating and automating (via cron job) OLAP-style data cubes, based on the business units found within Fundraising, that would be able to interact with many visual front ends. I would be happy to discuss the full approach and hear more about best practices at any time!
@Jgreen do you have an estimated level of effort on installing Superset on the application server? We are trying to determine if this is something we should scope out and commit to being our visualization tool of choice before requesting. If the level of effort is relatively low, it would be great to have.
Feb 21 2020
I have a small test cube on dev_analytics pulling down new donation IDs, contact IDs, and utm_medium at a 10-minute interval via a python3 script in my home directory. I will change the schedule to a 1-hour interval to run overnight (through Saturday, possibly the weekend) for additional timing tests, and pick this up again Monday for a fuller-scale trial with larger data sets.
Feb 10 2020
Feb 6 2020
I'm curious, @Ottomata and @JAllemandou, if there is an elegant solution to dynamically filling partitions. I.e., once a table is created with the partition types declared and a main location established, is the best way to fill the table to declare individual ALTER statements per day, hour, etc.? Or is there a better way to accomplish this?
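For what it's worth, Hive does offer two standard alternatives to issuing per-partition ALTER statements; the table names below are illustrative, not real tables:

```sql
-- 1) If data already sits under the table location in key=value
--    directories (e.g. .../year=2019/month=12/day=01/), Hive can
--    discover all partitions in one pass:
MSCK REPAIR TABLE my_db.my_events;

-- 2) If the data is written through Hive, a dynamic-partition insert
--    creates partitions as rows arrive:
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;
INSERT INTO TABLE my_db.my_events PARTITION (year, month, day)
SELECT col1, col2, year, month, day  -- partition columns must come last
FROM my_db.my_events_staging;
```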
Thanks for the suggestion, @Ottomata - I'll take it back to the team and see what makes sense. Since we've been using the json_string format since 2016, it might make sense to have the 2019 data in the same format as well.
Thank you, @JAllemandou! I am new to Hive, and was not aware I could do this myself. It worked well, though, and I appreciate the resources. You can close this ticket; much appreciated.
Thank you, @JAllemandou! I've learned several new Hive features and commands.