
Orphan rectifier is silent about communication failures
Open, Medium, Public

Description

Is this the correct behavior? One or two failures per job may be tolerable, but past some threshold we should notify with failmail.

Our decision to mail on any stderr might already be the right balance; let's review.
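One way to picture the threshold idea is the sketch below. It is not the rectifier's actual code: the function names, the plain substring match on the `[error]` level tag, and the threshold of 5 are all assumptions made for illustration.

```python
# Hypothetical sketch: decide whether one job's log warrants failmail.
# The "[error]" tag match and the default threshold are assumptions,
# not values taken from the real job configuration.

def count_error_lines(log_lines):
    """Count log lines carrying the [error] level tag."""
    return sum(1 for line in log_lines if "[error]" in line)


def should_send_failmail(log_lines, threshold=5):
    """Return True once the error count reaches the threshold."""
    return count_error_lines(log_lines) >= threshold
```

With a threshold like this, the occasional 2-4 error lines would stay quiet, while a job producing hundreds of errors would trigger a notification.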

Event Timeline

awight created this task. Mar 22 2017, 9:31 PM
Restricted Application added a subscriber: Aklapper. Mar 22 2017, 9:31 PM
21:31:46  X Rectifying orphan: X
21:31:46  WD DonationInterface: orphans:globalcollect_gateway_trxn:                 [info]
21:31:46  X Clock at Confirm_CreditCard: 0 (X.1941)
21:31:46  WD DonationInterface: orphans:globalcollect_gateway_trxn:                 [info]
21:31:46  X Failed Validation. Aborting GET_ORDERSTATUS Array
21:31:46  (
21:31:46      [issuer_id] => donate_interface-error-msg
21:31:46  )
21:31:46  
21:31:46  WD DonationInterface: orphans:globalcollect_gateway_trxn:                 [info]
21:31:46  X CVV Result: , AVS Result: 
21:31:46  WD DonationInterface: orphans:globalcollect_gateway_trxn:                [error]
21:31:46  X Can't communicate or internal error: Failed data
21:31:46  validation
21:31:46  WD DonationInterface: orphans:globalcollect_gateway_trxn:                 [info]
21:31:46  X Result message: Can't communicate or internal
21:31:46  error: Failed data validation
21:31:46  WD DonationInterface: orphans:globalcollect_gateway_trxn:                 [info]
21:31:46  X : UNKNOWN INCOMPLETE: Can't communicate
21:31:46  or internal error: Failed data validation
21:31:46  WD DonationInterface: orphans:globalcollect_gateway_trxn:                 [info]
21:31:46  X Elapsed Time: 53
21:31:48  WD DonationInterface: orphans:globalcollect_gateway_trxn:                 [info]

I just noticed that the error is likely related to my recent upheaval of the DI code: "Failed data validation".

There was a slow trickle of these failures: usually zero per job, with 2-4 log lines about them every so often, until this job:

Ingenico_Orphan_Rectifier-2017-03-21_18-30-53

At which point we saw hundreds of errors per job.

I'll check which deployment this corresponds to.

Ejegg added a subscriber: Ejegg. Mar 23 2017, 9:33 PM

Some of the failures happen because we're trying to rectify iDEAL donors. We shouldn't do that!
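A rough sketch of the "drop non-cc records" idea follows. It is not the actual patch: the record shape, the `payment_method` field name, and the `cc` / `rtbt` values are assumptions for illustration only.

```python
# Hypothetical sketch: keep only credit card orphans before rectifying.
# Field name and method codes are assumed, not taken from the patch.

def drop_non_cc(orphans):
    """Return only the orphan records paid by credit card."""
    return [o for o in orphans if o.get("payment_method") == "cc"]


orphans = [
    {"order_id": "1001", "payment_method": "cc"},
    {"order_id": "1002", "payment_method": "rtbt"},  # e.g. an iDEAL donor
    {"order_id": "1003"},                            # method missing: dropped
]
rectifiable = drop_non_cc(orphans)
```

Filtering before any status lookup means records that can never be rectified this way do not generate error lines in the first place.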

Change 344542 had a related patch set uploaded (by Awight):
[mediawiki/extensions/DonationInterface@deployment] Patch orphan rectifier to drop non-cc records

https://gerrit.wikimedia.org/r/344542

Change 344545 had a related patch set uploaded (by Awight):
[mediawiki/extensions/DonationInterface@master] Patch orphan rectifier to drop non-cc records

https://gerrit.wikimedia.org/r/344545

Change 344552 had a related patch set uploaded (by Awight):
[wikimedia/fundraising/crm/vendor@master] [HOTFIX] Patch orphan rectifier to drop non-cc records

https://gerrit.wikimedia.org/r/344552

Change 344552 merged by Ejegg:
[wikimedia/fundraising/crm/vendor@master] [HOTFIX] Patch orphan rectifier to drop non-cc records

https://gerrit.wikimedia.org/r/344552

Change 344545 merged by jenkins-bot:
[mediawiki/extensions/DonationInterface@master] Patch orphan rectifier to drop non-cc records

https://gerrit.wikimedia.org/r/344545

ggellerman triaged this task as Medium priority. Mar 28 2017, 9:36 PM
ggellerman moved this task from Triage to Q2 (Oct-Dec) 2020-2021 on the Fundraising-Backlog board.
mmodell removed a subscriber: awight. Jun 22 2017, 9:32 PM

Change 344542 abandoned by Awight:
Patch orphan rectifier to drop non-cc records

https://gerrit.wikimedia.org/r/344542