Test CiviCRM on new box
Closed, Resolved (Public)

Description

Done means

  • UI looks good (all set)
  • Queue consumers work well
  • any other process-control jobs that we can easily run also work well

Event Timeline

Testing process-control jobs.
Dedupe job ran fine
Can't run any queue stuff - can't connect to frqueue1001
Can't connect to Silverpop - probably just need to put new IP in allow list on Silverpop side.

The email address given in the email about the new server was already in Silverpop. I altered the entry to tick the API access checkbox but have not re-tested.

@Jgreen or @Dwisehaupt were either of you able to sort out the connection between this box and frqueue1001? Once that's working we can test all the queue consumers.

Yes, sorry, I guess I only pinged you in IRC about it. Should work now.

@Jgreen and @Dwisehaupt we can't test audit processing - the /var/spool/audit folder and its subfolders (adyen amazon astropay globalcollect paypal) don't exist on the new box, and once they do we want to somehow synchronize the contents with those on the old box.

Each of those processor-named folders has 'completed' and 'incoming' subfolders. When we download new files, we grab anything that the processor has that we don't have in either folder. Those come in to incoming, and we process them from there. Once we've found all the donations in a file, we move it to the 'completed' folder.

So starting with empty dirs on the new box will mean we download gigs and gigs of logs on the first run and then try to process lots of things we've already processed.
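To make the concern concrete, here is a minimal sketch of the selection logic described above. It is not the actual audit code: the function names and file names are hypothetical, and it assumes files are matched purely by name.

```python
from pathlib import Path


def files_to_download(remote_names, processor_dir):
    """Return remote file names we hold in neither incoming/ nor completed/.

    Hypothetical helper: assumes files are matched by name only.
    """
    processor_dir = Path(processor_dir)
    have = set()
    for sub in ("incoming", "completed"):
        folder = processor_dir / sub
        if folder.is_dir():
            have.update(p.name for p in folder.iterdir() if p.is_file())
    return sorted(name for name in remote_names if name not in have)


def mark_completed(processor_dir, name):
    """Move a fully processed file from incoming/ to completed/."""
    processor_dir = Path(processor_dir)
    (processor_dir / "completed").mkdir(parents=True, exist_ok=True)
    src = processor_dir / "incoming" / name
    dst = processor_dir / "completed" / name
    src.rename(dst)
    return dst
```

With empty (or missing) incoming and completed folders, `files_to_download` returns every name the processor offers, which is the "download everything and reprocess it" case.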

I have pulled the audit directory structure of civi1001 from frlog2001 and put it into /srv/archive/civi2001/audit on civi2001. The permissions should all be correct and ready for further testing. This data footprint is a bit larger than what civi1001 currently has on disk. I'm not sure if there is any data processing or culling that would cause the difference but just wanted you to be aware of that.

@Ejegg Looking a little more at the size differences, it appears there is a significant amount of data in the globalcollect/incoming dir that was captured to the frlogger when the backup was done. I'm not sure how that will affect your testing but wanted you to be aware of it.

IIRC audit processing (or maybe just orphan-slaying?) also uses logs that we push over from the central logger, which is done by archive_sync on frlog*, but we hadn't set this up yet for civi2001. This is fixed.

@Ejegg how significant is it that civi1001:/var/spool/audit and civi2001:/var/spool/audit are not in sync? Right now synchronizing those before/after audit job runs is totally manual and there's no mechanism or procedure to deal with it.

@Jgreen it would be nicer if they shared that dir, but it might not be too bad. We'll just have to remember to do a manual sync if we have to fail over. The audit jobs run once a day, so we will have some time to do that.

@Ejegg I reconfigured the archive sync script to pull this dir from civi1001 and push it to civi2001. This will keep civi2001 in sync but we'll need to stop and/or reverse the archive sync to promote civi2001 to the active audit processor. Please coordinate with us when you're ready to test audit processing on civi2001 so we can adjust this stuff.
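Before promoting civi2001, a check along these lines could confirm the two audit trees match. This is only a sketch, not part of the archive sync script. It assumes copies of both trees are reachable as local paths, and the function name is hypothetical. Note that `filecmp.dircmp` compares files shallowly (by type, size and modification time) unless those differ.

```python
import filecmp
from pathlib import PurePosixPath


def audit_drift(primary, standby):
    """List relative paths that differ between two audit directory trees.

    Returns three sorted lists: entries only under primary, entries only
    under standby, and files present in both that compare as different.
    """
    only_primary, only_standby, differing = [], [], []

    def walk(cmp, rel):
        only_primary.extend(str(rel / n) for n in cmp.left_only)
        only_standby.extend(str(rel / n) for n in cmp.right_only)
        differing.extend(str(rel / n) for n in cmp.diff_files)
        # Recurse into directories that exist on both sides.
        for name, sub in cmp.subdirs.items():
            walk(sub, rel / name)

    walk(filecmp.dircmp(primary, standby), PurePosixPath("."))
    return sorted(only_primary), sorted(only_standby), sorted(differing)
```

Three empty lists would mean no drift was detected at that level of comparison.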

Jgreen changed the task status from Open to Stalled. Jul 7 2020, 5:04 PM
Jgreen moved this task from In Progress to Stalled on the fundraising-tech-ops board.