
Unusable email addresses
Closed, Resolved · Public · 2 Story Points

Description

There is a belief that there is a current task addressing this already.
However, we've just had an instance that made us think of it again.

How can we correct obvious errors in email addresses? Elliot mentioned that there is already something in place on the donation form to help with 1 character errors.

Possibilities include a Civi extension that lets you do a bulk search-and-replace on email domain names, e.g. gmail.con->gmail.com
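To make the bulk search-and-replace idea concrete, here is a minimal sketch in Python. It is not CiviCRM code: the function name `replace_domain` and the sample addresses are made up for illustration, and it only swaps a whole domain for another (gmail.con -> gmail.com).

```python
def replace_domain(emails, wrong_domain, right_domain):
    """Return a new list with wrong_domain swapped for right_domain.

    Hypothetical helper for illustration only; matching is case-insensitive
    and strings without an "@" are passed through unchanged.
    """
    fixed = []
    for email in emails:
        # Split on the last "@" so unusual local parts are left alone.
        local, sep, domain = email.rpartition("@")
        if sep and domain.lower() == wrong_domain.lower():
            fixed.append(f"{local}@{right_domain}")
        else:
            fixed.append(email)
    return fixed


# Made-up sample data, not real donor addresses.
emails = ["alice@gmail.con", "bob@gmail.com", "carol@GMAIL.CON", "not-an-email"]
corrected = replace_domain(emails, "gmail.con", "gmail.com")
print(corrected)
```

Addresses that already use the right domain, or that are not addresses at all, come back unchanged, so the replace is safe to re-run.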

Details

Related Gerrit Patches:
wikimedia/fundraising/crm : master · Add email amender extension

Event Timeline

NNichols created this task. · Aug 27 2019, 2:51 PM
Restricted Application added a subscriber: Aklapper. · Aug 27 2019, 2:51 PM
Ejegg updated the task description.

I've been testing this extension today

https://civicrm.org/extensions/email-address-corrector

It hasn't been updated for a couple of years so I've put up a few pull requests for the author - https://github.com/JohnFF/Email-Amender/pulls

But it seems it might meet our needs - you configure a list of changes to be made, then you can do a search and use an action to correct those emails. It creates an activity, which is a nice idea, and it has some smart ideas about how to update the emails (top level domain, second level, combined).

It's possible to have the search action with or without automatic updating enabled. If automatic updating IS enabled it applies the fixes to any newly created emails (but does not touch emails being updated). I would need to check performance on this & we'd need to see how people feel about it. We can also turn it on & off easily if we want to experiment.

Still to do

  • do a security audit of the code
  • try getting the tests included in our test suite.
  • modernise the way the code deals with settings
  • see how it performs on staging

The extension author will probably come online overnight as he is in the UK, so I'll see whether he is encouraging PRs or whether I should go more down the forking route.

Change 537788 had a related patch set uploaded (by Eileen; owner: Eileen):
[wikimedia/fundraising/crm@master] Add email amender extension

https://gerrit.wikimedia.org/r/537788

I've been testing the extension on staging with 'live tweaking' enabled. There is a lot of noise in the numbers and each time I do a bunch of tests it seems a bit slower. I wonder why? Maybe we should reboot staging & see if it makes any difference?

extension enabled but making no change on processing · 80 second(s) · Average performance is 375 per minute
extension enabled but making no change on processing · 79 second(s) · Average performance is 380 per minute
extension enabled but making no change on processing · 70 second(s) · Average performance is 429 per minute
extension enabled but making no change on processing · 69 second(s) · Average performance is 435 per minute
extension enabled and changing on processing · 70 second(s) · Average performance is 429 per minute
extension enabled and changing on processing · 69 second(s) · Average performance is 435 per minute
extension enabled and changing on processing · 67 second(s) · Average performance is 448 per minute
extension fully disabled · 66 second(s) · Average performance is 455 per minute
extension fully disabled · 67 second(s) · Average performance is 448 per minute
extension fully disabled · 67 second(s) · Average performance is 448 per minute
extension re-enabled & updating · 67 second(s) · Average performance is 448 per minute
extension re-enabled & updating · 68 second(s) · Average performance is 441 per minute
extension re-enabled & updating · 68 second(s) · Average performance is 441 per minute
extension re-enabled & updating · 67 second(s) · Average performance is 448 per minute
extension re-enabled & updating · 67 second(s) · Average performance is 448 per minute
extension re-enabled & updating with static var · 67 second(s) · Average performance is 448 per minute
extension re-enabled & updating with static var · 66 second(s) · Average performance is 455 per minute
extension re-enabled & updating with static var · 71 second(s) · Average performance is 423 per minute
extension re-enabled & updating with static var · 73 second(s) · Average performance is 411 per minute
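One way to get a feel for the noise in these runs is to group the timings by configuration and compare the mean with the spread (slowest run minus fastest run). The sketch below only re-enters the seconds reported above; the short labels and the `summary` structure are my own shorthand, not output from the test tooling.

```python
from statistics import mean

# Seconds per run, copied from the results above and grouped by configuration.
timings = {
    "enabled, no change": [80, 79, 70, 69],
    "enabled and changing": [70, 69, 67],
    "fully disabled": [66, 67, 67],
    "re-enabled & updating": [67, 68, 68, 67, 67],
    "re-enabled & updating with static var": [67, 66, 71, 73],
}

# For each configuration: (mean seconds, slowest run minus fastest run).
summary = {
    label: (mean(runs), max(runs) - min(runs))
    for label, runs in timings.items()
}

for label, (avg, spread) in summary.items():
    print(f"{label}: mean {avg:.1f}s over {len(timings[label])} runs, spread {spread}s")
```

Within a single configuration the runs differ by as much as 11 seconds, which is about as large as the gaps between configurations, so these numbers alone do not show a clear cost from enabling the extension.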

I've put up a patch for review & we have some options (@DStrine @Ejegg )

Basically I started from an extension written by John from Future First & did some work to clean it up / modernise it & add some apis. The extension stores a list of top level domains that should be 'fixed' (by default con, cpm, couk and orguk) and a list of second level domains (a longer list of defaults, but things like 'gmai' & 'gmal'). These can be tweaked and I'm imagining that could be managed by donor services, depending on what people think. It is configured here: civicrm/emailamendersettings - that works on staging right now.

The extension permits any of these configured alternatives to be 'fixed' and when they are fixed it creates an activity specifying the change that was made.
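As a rough illustration of how the top level / second level / combined fixes could fit together, here is a sketch in Python. It is not the extension's code (that is a CiviCRM extension): the function name `amend_email`, the two tables, the corrected values on the right-hand side of each mapping, and the wording of the activity note are all assumptions. Only the misspelled keys come from the defaults listed above.

```python
# Assumed correction tables: the keys are defaults mentioned in this task,
# the replacement values are my guesses at the intended corrections.
TLD_FIXES = {"con": "com", "cpm": "com", "couk": "co.uk", "orguk": "org.uk"}
SLD_FIXES = {"gmai": "gmail", "gmal": "gmail"}


def amend_email(email):
    """Return (corrected_email, activity_note); the note is None if unchanged.

    Hypothetical helper: fixes the last label of the domain (top level) and
    the label before it (second level), so both can change in one pass.
    """
    local, sep, domain = email.rpartition("@")
    labels = domain.lower().split(".")
    if not sep or len(labels) < 2:
        return email, None

    fixed = list(labels)
    fixed[-1] = TLD_FIXES.get(labels[-1], labels[-1])  # top level domain
    fixed[-2] = SLD_FIXES.get(labels[-2], labels[-2])  # second level domain
    if fixed == labels:
        return email, None

    corrected = f"{local}@{'.'.join(fixed)}"
    # Stand-in for the activity the extension records for each change.
    return corrected, f"Amended email from {email} to {corrected}"


for address in ["donor@gmai.con", "donor@example.couk", "donor@gmail.com"]:
    print(amend_email(address))
```

Returning the note alongside the corrected address mirrors the idea of recording an activity per change, so every fix stays traceable.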

There are 3 ways the extension can be triggered to make a change

  1. Whenever a new email is created in the system. I did some performance testing and I don't think there is a performance reason not to do this, although the data had a lot of noise in it (above). Note the email would be created first and then updated (with an activity created), leaving a log trace. I think this should happen quickly enough that it beats the deduper script.
  2. By an action from the search menu. I tried this and it worked: I updated 20k contacts this way, but I did get a white screen while it completed in the background. I wasn't in love with the experience, so I added
  3. an api action:

drush cvapi EmailAmender.find_candidates rowCount=100

will find the first 100 emails that 'need fixing', and

drush cvapi EmailAmender.batch_update rowCount=100

will fix the first 100.

Note that if we deploy the patch & enable the extension we only get one of the above. The other 2 need to be enabled (which is just a flick of the switch / a job to be scheduled).

I also note that this would be more useful if we could dedupe by modified_date - the blocker is that we don't have an index on it. I tested it & it took only a few minutes to add.

Change 537788 merged by jenkins-bot:
[wikimedia/fundraising/crm@master] Add email amender extension

https://gerrit.wikimedia.org/r/537788

Change 537788 abandoned by Eileen:
Add email amender extension

https://gerrit.wikimedia.org/r/537788

MBeat33 closed this task as Resolved. · Tue, Oct 29, 9:23 PM

Sure thing, and many thanks, this will improve a lot of different workflows!

DannyS712 added a subscriber: DannyS712.

[batch] remove patch for review tag from resolved tasks