We've begun to see more Zendesk tickets that are bouncebacks from fundraising emails where the email address has an obvious typo. We fix these manually when possible, but it's labor-intensive. Can we batch find-and-replace the most obvious of these errors with the correct, most common email domains, so that exact-match deduping becomes even more effective?
For example, in Civi:
309 donor email addresses end in @yaho.com
541 donor email addresses end in @gmial.com
If it's not technically a challenge, are there any reasons not to fix the really obvious typos? This would also have the benefit of increasing the size of the email list and saving @CCogdill_WMF and the DS team some time.
Update: after chatting with @DStrine, here are the top 20 varmints:
%@gmai.com = 1916 records
%@gamil.com = 999
%@gmal.com = 554
%@gmial.com = 541
%@gmil.com = 435
%@gmail.co = 427
%@gmail.om = 344
%@yaho.com = 309
%@gmail.cm = 293
%@homail.com = 190 * domain actually goes somewhere, to Microsoft
%@hotmal.com = 160 * domain actually goes somewhere, to Microsoft
%@yahoo.co = 123
%@hotmil.com = 118
%@hotmail.co = 115
%@yhaoo.com = 109
%@yahoo.om = 107
%@yahoo.cm = 107
%@yhoo.com = 100
%@gmail.vom = 97
%@hotmail.cm and also %@yahooo.com = 89
For AOL: @aol.cm = 35, @aol.om = 35, @aol.co = 48. Beyond these, there's a lower level of typos for lots of other domains.
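To make the request concrete, here is a minimal sketch of what the batch correction could look like. It is not tied to the Civi schema: the `TYPO_DOMAINS` table and the `fix_email_domain` helper are hypothetical names, and the "correct" domain for each typo is an assumption inferred from the list above. The lookup matches the whole domain after the last `@`, so legitimate addresses such as `@yahoo.co.uk` are left alone.

```python
# Hypothetical typo -> intended-domain table built from the counts above.
# The target domains are assumptions, not confirmed against donor data.
TYPO_DOMAINS = {
    "gmai.com": "gmail.com",
    "gamil.com": "gmail.com",
    "gmal.com": "gmail.com",
    "gmial.com": "gmail.com",
    "gmil.com": "gmail.com",
    "gmail.co": "gmail.com",
    "gmail.om": "gmail.com",
    "gmail.cm": "gmail.com",
    "gmail.vom": "gmail.com",
    "yaho.com": "yahoo.com",
    "yahoo.co": "yahoo.com",
    "yhaoo.com": "yahoo.com",
    "yahoo.om": "yahoo.com",
    "yahoo.cm": "yahoo.com",
    "yhoo.com": "yahoo.com",
    "yahooo.com": "yahoo.com",
    "homail.com": "hotmail.com",
    "hotmal.com": "hotmail.com",
    "hotmil.com": "hotmail.com",
    "hotmail.co": "hotmail.com",
    "hotmail.cm": "hotmail.com",
    "aol.cm": "aol.com",
    "aol.om": "aol.com",
    "aol.co": "aol.com",
}


def fix_email_domain(email):
    """Return the address with a known typo domain corrected.

    Addresses without an '@', without a local part, or with a domain
    that is not in TYPO_DOMAINS are returned unchanged.
    """
    local, sep, domain = email.strip().rpartition("@")
    if not sep or not local:
        return email
    fixed = TYPO_DOMAINS.get(domain.lower())
    if fixed is None:
        return email
    return f"{local}@{fixed}"


if __name__ == "__main__":
    for address in ["donor@gmial.com", "donor@yahoo.co.uk", "donor@aol.co"]:
        print(address, "->", fix_email_domain(address))
```

Because a corrected address may now exactly match an existing contact, a real run would presumably need to feed the results into the normal dedupe process rather than overwrite addresses blindly.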