Digging into the merge hook I found that loading the merge screen with just 2 pre-selected contacts was slow.
It turns out the slow query (15-17 seconds) was new in 4.5 & it is
SELECT
table_name, column_name
FROM information_schema.key_column_usage
WHERE
referenced_table_schema = database() AND referenced_table_name = 'civicrm_contact' AND referenced_column_name = 'id';
It was introduced in https://issues.civicrm.org/jira/browse/CRM-14500 in order to make the selection of which tables to merge dynamic.
I did some digging into it and found that
- the query will theoretically run faster with this mysql setting innodb_stats_on_metadata = 0; per https://www.percona.com/blog/2011/12/23/solving-information_schema-slowness/
I don't have permission to try that on staging but tested on other servers & locally - I got a semi-consistent 10% speed saving on this query so it would be worth trying on staging but it's not the main bottleneck.
- I did further testing on my local and on other servers I had access to & found the most significant factor in the speed is the total number of tables the mysql user has permission to see. Currently on staging the mysql user has permission to see 2 flavours of dev db - ie. probably double the number of tables the production user would have permission to - so depending how exponential the effect is this may not turn out to be an issue on production.
The query does not only access tables in the current database even if that is provided as a parameter to the query unless both schema & table are defined in the query (which rather defeats the purpose)
I will log this on civicrm.org too - for visibility - but we need to confirm how the query performs on prod before prioritising it.
In most situations the values derived from the query would only change on upgrade, on adding a custom table and on on module installation so it would make sense to cache this information in some way & only update on cache-flush