Working on T216185, I chose to use the PrefUpdate EventLogging tables in the log database as they provide data going back the couple of years that we're looking for. In doing so, I ran into the problem that the timestamp column in PrefUpdate_5563398 has collation defined, whereas the one in PrefUpdate_5563398_15423246 does not, as shown in the SHOW CREATE TABLE output below (condensed for brevity):
CREATE TABLE `PrefUpdate_5563398` ( ... `timestamp` varchar(14) COLLATE utf8_unicode_ci NOT NULL, ... ) ENGINE=TokuDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci `compression`='tokudb_zlib'
CREATE TABLE `PrefUpdate_5563398_15423246` ( ... `timestamp` varchar(14) DEFAULT NULL, ... ) ENGINE=TokuDB AUTO_INCREMENT=68969415 DEFAULT CHARSET=utf8 `compression`='tokudb_zlib'
This results in the following unexpected result when trying to naively combine data from both tables:
SELECT * FROM ( (SELECT timestamp FROM PrefUpdate_5563398_15423246 WHERE timestamp >= '20160101000000' AND timestamp < '20160201000000') UNION ALL (SELECT timestamp FROM PrefUpdate_5563398 WHERE timestamp >= '20180101000000' AND timestamp < '20180201000000') ) AS pu LIMIT 5; ERROR 1271 (HY000): Illegal mix of collations for operation 'UNION'
One solution to the problem is to force collation for the non-collated column by changing the first sub-select so it starts with: SELECT timestamp COLLATE utf8_unicode_ci.
I have not investigated whether other EventLogging schema tables in the log database is also affected by this.