
Most recent DatabaseUpdate Import is missing data
Closed, Resolved, Public

Description

This morning, the nightly import jobs ran successfully but the DatabaseUpdate summary is showing over a million rows of bad data:

Job ID: 198963192
Job Description: Adding and updating contacts in existing contact source, _all_Wikimedia
Job Type: Recurring Contact Source Import @ Tuesday, March 29, 2022 at 9:00:53 AM GMT
Job Status: Complete
Total Rows: 1037437
Total Valid Rows: 0
Total Invalid Addresses: 279
Total Duplicate Addresses: 0
Total Disallowed Addresses: 0
Total Bad Data: 1037158

Normally, Total Bad Data is 0, as in the March 28 job:

Job ID: 198914969
Job Description: Adding and updating contacts in existing contact source, _all_Wikimedia
Job Type: Recurring Contact Source Import @ Monday, March 28, 2022 at 9:00:42 AM GMT
Job Status: Complete
Total Rows: 115905
Total Valid Rows: 115875
Total Invalid Addresses: 3
Total Duplicate Addresses: 0
Total Disallowed Addresses: 27
Total Bad Data: 0

The error code is: "Email Address Not Found in Row", so it seems like the file that was sent to FTP had over a million blank rows. The other files look fine, so this is only affecting the DatabaseUpdate-* file.
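A check like the following could flag this kind of import before a send goes out. It is a minimal sketch, not part of any existing tooling: it assumes the job summary is available as plain "Key: value" text in the shape pasted above, and the function names and the 1% threshold are hypothetical. The sum check relies on an observation from the summaries in this task (the five counts add up to Total Rows), not on documented Acoustic behaviour.

```python
# Hypothetical helper: parse an import job summary and flag suspicious counts.
COUNT_FIELDS = [
    "Total Valid Rows",
    "Total Invalid Addresses",
    "Total Duplicate Addresses",
    "Total Disallowed Addresses",
    "Total Bad Data",
]


def parse_summary(text):
    """Turn 'Key: value' lines into a dict; purely numeric values become ints."""
    summary = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip blank lines and lines without a colon
        value = value.strip()
        summary[key.strip()] = int(value) if value.isdigit() else value
    return summary


def check_summary(summary, max_bad_fraction=0.01):
    """Return a list of problems; an empty list means the job looks normal."""
    problems = []
    total = summary["Total Rows"]
    accounted = sum(summary[field] for field in COUNT_FIELDS)
    if accounted != total:
        # Observed invariant in this task's summaries, assumed here.
        problems.append(f"counts add up to {accounted}, expected {total}")
    if total > 0 and summary["Total Valid Rows"] == 0:
        problems.append("no valid rows were imported")
    if total > 0 and summary["Total Bad Data"] / total > max_bad_fraction:
        problems.append(f"bad data is {summary['Total Bad Data']} of {total} rows")
    return problems


MARCH_29 = """
Job ID: 198963192
Job Status: Complete
Total Rows: 1037437
Total Valid Rows: 0
Total Invalid Addresses: 279
Total Duplicate Addresses: 0
Total Disallowed Addresses: 0
Total Bad Data: 1037158
"""

if __name__ == "__main__":
    for problem in check_summary(parse_summary(MARCH_29)):
        print("WARNING:", problem)
```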

We have an email send going out tomorrow, so I'm setting this task at "unbreak now" priority. Thanks!

Event Timeline

KHaggard triaged this task as Unbreak Now! priority. Mar 29 2022, 3:54 PM

@KHaggard we just removed the voter_party column - did you remap it? Sorry if we didn't communicate - I guess I thought @Eileenmcnaughton had coordinated this with you.

We also started populating the 2022 stats columns with real figures instead of placeholder zeros that were still there until last night, but that shouldn't have done anything to the mapping.

Yeah, I remapped last Friday, @Ejegg. But since the field was still there when I remapped, I had to click a checkbox to ignore the voter_party field. It was working well over the weekend.

I wonder if taking out the field from the Civi export caused Acoustic to act up because there was no longer a field present to ignore? That might be the case if that's the only thing that changed...
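One way to catch this kind of drift before upload is to compare the export's header row with the column list the import mapping was built against. The sketch below assumes the export is a CSV file with a header row, which this thread does not confirm. Apart from voter_party, every column name in it is made up for illustration, and header_drift is a hypothetical helper, not an existing function in the export code or in Acoustic.

```python
import csv
import io


def header_drift(csv_text, mapped_columns):
    """Compare a CSV header row with the columns the import mapping expects.

    Returns which mapped columns are missing from the file, which file columns
    the mapping does not know about, and which columns changed position.
    """
    header = next(csv.reader(io.StringIO(csv_text)), [])
    missing = [c for c in mapped_columns if c not in header]
    unexpected = [c for c in header if c not in mapped_columns]
    shifted = [
        c
        for i, c in enumerate(mapped_columns)
        if c in header and header.index(c) != i
    ]
    return {"missing": missing, "unexpected": unexpected, "shifted": shifted}


if __name__ == "__main__":
    # Hypothetical mapping that still includes the (ignored) voter_party field.
    mapped = ["email", "first_name", "voter_party", "country"]
    # Hypothetical export produced after voter_party was removed.
    export = "email,first_name,country\na@example.org,Ann,US\n"
    print(header_drift(export, mapped))
```

If any of the three lists is non-empty, the mapping and the export have diverged and a remap is probably needed before the next scheduled import.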

Would you be willing to resend today's files to the FTP again? I'll try remapping with the new file without voter_party in it and see if that works.

Thanks! If this works, I'll add it to my notes that we have to remap at the same time a field is taken out of the export file. Let me know when the files are ready to go @Ejegg :)

OK @KHaggard, the new files are up. Note that our uploader sends all 4 at once (update, optout, matching gifts, and unsubscribe), so you may have to re-run imports for the others to clear them out on the Acoustic side.

Got it, thanks for confirming that! By the way, I don't think we process the MatchingGifts-* file anymore, since we migrated the relevant MG fields to be in the DatabaseUpdate file, right?

I'll go ahead and run the optout and unsubscribe files too so they're cleared out of the FTP site.

@Ejegg I just finished remapping and re-ran the DatabaseUpdate-* file, and it looks back to normal :) Guess that was the culprit! Sorry about that, and thanks for helping me figure it out.

Job ID: 198980170
Job Description: Adding and updating contacts in existing contact source, _all_Wikimedia
Job Type: Recurring Contact Source Import @ Tuesday, March 29, 2022 at 5:36:04 PM GMT
Job Status: Complete
Total Rows: 1037437
Total Valid Rows: 1037033
Total Invalid Addresses: 52
Total Duplicate Addresses: 0
Total Disallowed Addresses: 352
Total Bad Data: 0

KHaggard claimed this task.

Resolving this now, thanks @Ejegg !