Page MenuHomePhabricator

Delay on TY email sends
Closed, ResolvedPublic

Description

Significant & Increasing delay on thank you emails sent via process control, starting @ ~ Mon May 27 13:35:19 UTC 2024.

Write-up of incident progression here (WIP)

Event Timeline

AKanji-WMF triaged this task as Unbreak Now! priority.Tue, May 28, 9:17 PM
AKanji-WMF moved this task from Triage to DRI Backlog on the Fundraising-Backlog board.

Change #1036779 had a related patch set uploaded (by Ejegg; author: Ejegg):

[wikimedia/fundraising/crm@master] Avoid extra calls to getTemplateDir

https://gerrit.wikimedia.org/r/1036779

Change #1036779 merged by jenkins-bot:

[wikimedia/fundraising/crm@master] Avoid extra calls to Smarty::getTemplateDir

https://gerrit.wikimedia.org/r/1036779

I think all of us dug into this in some way today but I feel we got to the bottom of it. In the end I found 3 fixes & Elliott found 1 and I think any of them are OK for a short-term fix (we have merged Elliott's & can deploy it tomorrow).

The status is

  1. We have something we can deploy tomorrow that will allow us switch back to *any* Smarty version without significant slow down
  2. There are multiple options for a permanent fix. I'm actively discussing them with Tim at the moment and have logged an upstream issue
  3. in my testing one of the patches we are looking at upstream addresses a previously unidentified performance bottleneck which will have been impacting us 'forever' and offers the hope we might wind up with significantly faster thank you emails.
  4. We have merged a patch that allows us to specify the Smarty version at run time - so we can try running the thank yous with different Smarty versions than the site is configured for

More details

I'm treating the gitlab issue https://lab.civicrm.org/dev/core/-/issues/5243 as the primary source for technical details so I'm going to write up some technical detail on how this relates to WMF but leave the main technical discussion for that issue.

This issue specifically affects the code which switches between languages. Hence we might not see much slow down after an English language campaign but would see more if it was a mix of languages.

When the preferred_language changes CiviCRM fires change of locale which triggers, amount other things, the configuration hook to be called - this hook is where the extensions register their Smarty template directories.

In CiviCRM 5.74 which we deployed last week there was a change to the way templates register their directories that started calling a function combination that undermines Smarty's tracking of directories - causing it to constantly reload them. There are a couple of approaches to fixing this under discussion on gitlab / chat

However, it also turns out that the whole premise of calling the calling the configuration hook at this point is questionable, especially for a site that does not implement different languages on the drupal side. I got the best performance increased by by-passing that code entirely and I want to follow up whether we can do that.

Logging

We all tried different things - that was fun. The profiling didn't seem to work well because it seems to get the slow down you have to run it long enough that the profile output is too big to load

The silver lining

I think this might be a blessing in disguise - the end result *might* be that our thank yous run significantly faster - locally I run 5 times faster after fixing the 'forever bug' compared to just fixing this issue
I did some work on using different logging to output function timings. I think it is promising but I'm currently thinking it might be something for Jack & I to look into in Hamburg

Ah, just saw @Eileenmcnaughton already submitted a few upstream patches. I've closed the duplicate one.

greg subscribed.

Send delay is back to normal and holding:
https://frmon.wikimedia.org/d/R5m3iU1Wk/queue?orgId=1&from=1716788935000&to=now&viewPanel=21

Moving to the Done column for review during End of Sprint. Thank you to everyone who jumped on this!

XenoRyet set Final Story Points to 8.