
Contribution tracking clean up from small failstorm
Closed, Resolved (Public)

Description

We had the queues off tonight to add some columns to wmf_contribution_extra and smashpig.pending, which caused significant replication lag while the queries propagated.

When the queues were turned back on, we started getting failmails from the contribution tracking queue consumer and payments-antifraud.

@Dwisehaupt:

my thought is that some process checks the origin server for an id to process.
10:52 PM 
then it will check the read only replica to do more processing, but it may not be there yet.
we could push through a change to dns to point the read handle back at frdb1005 and remove any possible replication lag.

@Dwisehaupt pushed out a DNS change to point the fundraising read-only handle back at the origin server.
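For anyone reading this later, here is a minimal sketch of the suspected race: a process gets an id from the origin server, then looks it up on a replica that has not caught up yet. This is not the actual queue consumer code. The function name, the dict-backed "servers", and the `note` field are all hypothetical stand-ins. It also shows an application-level fallback to the origin, which is a different technique from the DNS change described above.

```python
from typing import Callable, Optional

Row = dict
Lookup = Callable[[int], Optional[Row]]


def fetch_with_origin_fallback(
    row_id: int, read_replica: Lookup, read_origin: Lookup
) -> Optional[Row]:
    """Try the read-only replica first; if the row is missing, ask the origin."""
    row = read_replica(row_id)
    if row is None:
        # The row may exist on the origin but not have replicated yet.
        row = read_origin(row_id)
    return row


# Hypothetical stand-ins for the two servers: the origin already has id 2,
# while the lagging replica only has id 1.
origin = {1: {"id": 1, "note": "first"}, 2: {"id": 2, "note": "second"}}
replica = {1: {"id": 1, "note": "first"}}

# A replica-only lookup misses the new row, which is the failure mode suspected here.
replica_only = replica.get(2)

# With the fallback, the same lookup is served by the origin.
with_fallback = fetch_with_origin_fallback(2, replica.get, origin.get)
```

Pointing the read handle at the origin, as was done here, has the same effect as always taking the fallback branch: every read sees the latest writes, at the cost of extra load on the origin.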

This fix made the queue consumers happy again, but there are 10 damaged messages with possibly missing contribution tracking info. I will dig into these tomorrow and see what we can get out of the logs.

Searchkit of timeframe: https://civicrm.wikimedia.org/civicrm/admin/search#/edit/8550