- We need to test turning on logging on Staging to see how it performs.
- I need to provide a mysql file to Jeff to create the triggers and log tables.
- Jeff to run the sql, determining how much extra disk is used.
- General click-around testing by me.
Description
Details
Subject | Repo | Branch | Lines +/-
---|---|---|---
Enable setting to specify logging is handled offline | wikimedia/fundraising/crm | master | +19 -1
Status | Assigned | Task
---|---|---
Resolved | Eileenmcnaughton | T77938 [epic] Dedupe CiviCRM
Resolved | None | T111702 [epic] Dedupe exact matches
Duplicate | None | T124695 Civi: can the Change Log be used to recover contribs that have been inadvertently deleted?
Resolved | Eileenmcnaughton | T117630 Dedupe merge failed to relink a contribution?
Resolved | None | T117466 Q3 GOALS! (January-March) Keep at top of Q3 column
Resolved | None | T111704 [epic] Make deduping reversible
Resolved | Eileenmcnaughton | T127133 Make dedupe reversible step 1 of 5: test turning on logging on Staging to see how it performs.
Event Timeline
@Jgreen I have uploaded the sql that would turn on logging on staging to my sandbox branch. This is not where I expect the sql to stay, but since I expect to tweak it over time, getting it into version control seemed as good a way as any to share the file.
When you are available I'd like to try running that on staging. I am aware it will create a lot of extra data and that we need to understand how manageable that is. I don't know whether it will complete without problems, but the section I ran locally was OK.
Could you check disk before & after running it?
Change 273179 had a related patch set uploaded (by Eileen):
Add rudimentary performance timing drush command
My expectation is that the default table type (ARCHIVE) won't be acceptable. However, I'd like to turn it on with ARCHIVE first (using this sql: http://git.wikimedia.org/tree/wikimedia%2Ffundraising%2Fcrm%2Fcivicrm.git/refs%2Fheads%2Fsandbox%2Feileen%2Fstaging )
and see:
- how much disk space is taken
- how long the performance script (https://gerrit.wikimedia.org/r/273179) takes to run
(Note: I could record disk usage by running df before and after and taking the difference, or there might be a better way?)
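A sketch of that df-before-and-after approach (the data-directory path is an assumption, and the mysql load step is shown commented out since it depends on the staging host):

```shell
# Record disk usage before and after loading the logging triggers/tables.
# DATADIR is an assumption; point it at the MySQL data directory on staging.
DATADIR="${DATADIR:-/var/lib/mysql}"

dir_kb() {
  # du -sk prints "<kilobytes>\t<path>"; keep just the number
  du -sk "$1" | cut -f1
}

# before=$(dir_kb "$DATADIR")
# mysql civicrm < trigger.mysql       # create log tables + triggers
# after=$(dir_kb "$DATADIR")
# echo "extra disk used: $(( after - before )) KB"
```

Subtracting two du snapshots of the data directory avoids counting unrelated disk activity that a whole-filesystem df would pick up.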
Having done that, I would try converting the tables (selectively) to INNODB, adding indexes on id, contact_id/entity_id, connection_id, user_id (possibly) and action_date (possibly also action), and repeating the same checks.
Other people have looked at TokuDB & INNODB compressed row format : https://issues.civicrm.org/jira/browse/CRM-18104
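As a sketch of what that selective conversion could look like for one table (the log-table and column names follow the trigger convention shown later in this task; which tables to convert and the exact index set per table are still open questions, so this writes the statements to a file for review rather than running them):

```shell
# Draft the ARCHIVE -> InnoDB conversion plus candidate indexes for one
# log table. Column names (log_conn_id, log_user_id, log_date) are
# assumptions based on CiviCRM's log-table convention; verify against the
# actual schema before running.
cat <<'SQL' > convert_log_civicrm_contribution.sql
ALTER TABLE log_civicrm_contribution ENGINE = InnoDB;
ALTER TABLE log_civicrm_contribution
  ADD INDEX index_id (id),
  ADD INDEX index_contact_id (contact_id),
  ADD INDEX index_log_conn_id (log_conn_id),
  ADD INDEX index_log_user_id (log_user_id),
  ADD INDEX index_log_date (log_date);
SQL
# mysql log_civicrm < convert_log_civicrm_contribution.sql
```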
OK, I loaded up the trigger.mysql query batch; it seems to have run smoothly. After that the new log_civicrm database consumes 4.3GB on disk.
Hi @Jgreen, I can't see the triggers! The trigger data didn't seem to be there, so I ran SHOW TRIGGERS and didn't see anything in dev_civicrm.
Yeah, you're right. Looking at trigger.mysql, the create trigger syntax is bad, which I guess caused it to silently fail on load. See:
https://dev.mysql.com/doc/refman/5.5/en/trigger-syntax.html
jg
OK, the delimiters aren't in the sql; it should be more like:
delimiter //
CREATE TRIGGER civicrm_contribution_after_insert after insert ON civicrm_contribution
FOR EACH ROW
BEGIN
IF ( @civicrm_disable_logging IS NULL OR @civicrm_disable_logging = 0 )
THEN
INSERT INTO log_civicrm_contribution (id, contact_id, financial_type_id, contribution_page_id, payment_instrument_id, receive_date, non_deductible_amount, total_amount, fee_amount, net_amount, trxn_id, invoice_id, currency, cancel_date, cancel_reason, receipt_date, thankyou_date, source, amount_level, contribution_recur_id, is_test, is_pay_later, contribution_status_id, note, address_id, check_number, campaign_id, tax_amount, creditnote_id, log_conn_id, log_user_id, log_action)
VALUES ( NEW.id, NEW.contact_id, NEW.financial_type_id, NEW.contribution_page_id, NEW.payment_instrument_id, NEW.receive_date, NEW.non_deductible_amount, NEW.total_amount, NEW.fee_amount, NEW.net_amount, NEW.trxn_id, NEW.invoice_id, NEW.currency, NEW.cancel_date, NEW.cancel_reason, NEW.receipt_date, NEW.thankyou_date, NEW.source, NEW.amount_level, NEW.contribution_recur_id, NEW.is_test, NEW.is_pay_later, NEW.contribution_status_id, NEW.note, NEW.address_id, NEW.check_number, NEW.campaign_id, NEW.tax_amount, NEW.creditnote_id,
CONNECTION_ID(), @civicrm_user_id, 'insert'
);
END IF;
END;//
delimiter ;
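With the delimiters in place, a quick way to confirm the load actually took (the dev_civicrm database name is from the comment above; an empty SHOW TRIGGERS result was exactly the silent-failure mode seen here):

```shell
# Reload the corrected trigger file, then list triggers to confirm they exist.
cat <<'SQL' > show_triggers.sql
SHOW TRIGGERS LIKE 'civicrm_contribution';
SQL
# mysql dev_civicrm < trigger.mysql
# mysql dev_civicrm < show_triggers.sql  # expect civicrm_contribution_after_insert etc.
```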
Progress status: this is enabled on staging and seems OK, but my reason for not moving it to done is that I believe the wmf_ tables were skipped (they don't begin with civicrm_).
Change 280157 had a related patch set uploaded (by Eileen):
CRM-18212 allow triggers to be managed offline.
I've brought this into review as there are some patches that need review, i.e. https://gerrit.wikimedia.org/r/#/c/280156/2
I'm trying to juggle the order a bit as there are patches in both the drupal and civi repos.
However, I still have some things to do on it too.
Change 280336 had a related patch set uploaded (by Eileen):
Enable setting to specify logging is handled offline
I'm moving this to done and setting it to resolved.
Queue performance after turning on logging was not substantially worse:
Before tests:
2016-02-24 22:02:10 Starting donation queue performance profile for 20 donations.
Context is : Staging:
2016-02-24 22:02:14 Donation queue performance profile took 4 second(s) for 20 donations.
Context is : Staging:
Average performance is 300 per minute
2016-02-24 22:02:29 Starting donation queue performance profile for 200 donations.
Context is : Staging:
2016-02-24 22:03:05 Donation queue performance profile took 36 second(s) for 200 donations.
Context is : Staging:
Average performance is 333 per minute
2016-02-24 22:03:28 Starting donation queue performance profile for 2000 donations.
Context is : Staging:
2016-02-24 22:09:46 Donation queue performance profile took 378 second(s) for 2000 donations.
Context is : Staging:
Average performance is 317 per minute
After tests:
Context is : Staging:INNODB
2016-03-16 20:24:29 Donation queue performance profile took 397 second(s) for 2000 donations.
Context is : Staging:INNODB
Average performance is 302 per minute.
2016-03-16 20:26:45 Starting donation queue performance profile for 2000 donations.
Context is : Staging:INNODB
2016-03-16 20:33:15 Donation queue performance profile took 390 second(s) for 2000 donations.
Context is : Staging:INNODB
Average performance is 308 per minute.
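The per-minute averages quoted in these logs are just donations divided by elapsed seconds, scaled to a minute and rounded; a minimal helper to reproduce them:

```shell
# Reproduce the "Average performance is N per minute" figures above.
rate_per_minute() {
  # round(count * 60 / seconds) using integer arithmetic
  count=$1; seconds=$2
  echo $(( (count * 60 + seconds / 2) / seconds ))
}

rate_per_minute 2000 390   # matches the 308/min reported above
```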
Change 280336 merged by jenkins-bot:
Enable setting to specify logging is handled offline
I did some more tests using this on staging on 4.7.31. I got rates between 345 and 411 per minute over 1000 records, with the main difference being that the first run is a bit slower, presumably because mysql caching helps on later runs.
Recording results after mysql is 'warmed up'
With this patch https://github.com/civicrm/civicrm-core/pull/11615
WD wmf_civicrm: Successfully updated contribution_tracking for 52086106 [300.56 sec, 102.68 MB] [info]
Processed 2000 messages in 298 second(s)
Average performance is 403 per minute.
Command dispatch complete [300.57 sec, 95.11 MB] [notice]
Timer Cum (sec) Count Avg (msec)
page 300.523 1 300522.9
Without the patch
WD wmf_civicrm: Successfully updated contribution_tracking for 52088106 [297.6 sec, 118 MB] [info]
Processed 2000 messages in 295 second(s)
Average performance is 407 per minute.
Command dispatch complete [297.61 sec, 110.42 MB] [notice]
Timer Cum (sec) Count Avg (msec)
page 297.56 1 297560.32
The speed difference is well within the normal variation between runs but the memory usage difference was consistent & replicable.
Summary: no performance degradation with this patch, and lower memory use.