
Tendril activity column: trx_adaptive_hash_latched column removed on 10.2 (and onwards) from information_schema.innodb_trx causes the `*_activity` event to fail
Closed, Resolved (Public)

Description

Looks like the problem with db1114 not reporting activity on the "Act" column on tendril is due to this commit on 10.2: https://github.com/MariaDB/server/commit/99e017d099
With the trx_adaptive_hash_latched column removed from information_schema.innodb_trx in 10.2 and later (db1114 runs 10.3), the event called $servername_3306_activity fails to execute.

[ERROR] Event Scheduler: [root@10.64.32.25][tendril.db1114_eqiad_wmnet_3306_activity] The foreign data source you are trying to reference does not exist. Data source error:  error: 1054  'Unknown column 'trx_adaptive_hash_latched' in 'fie
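One way to confirm the missing column on a given host is to ask the server for its own definition of information_schema.innodb_trx. This is only a sketch of such a check, not something taken from the tendril code; it should return one row on a host that still has the column (e.g. 10.1) and an empty set where the column has been removed.

```sql
-- Run on the monitored host itself (e.g. db1114), not on tendril.
-- information_schema.columns also describes information_schema tables.
SELECT column_name
FROM information_schema.columns
WHERE table_schema = 'information_schema'
  AND table_name   = 'INNODB_TRX'
  AND column_name  = 'trx_adaptive_hash_latched';
```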

With that failure, the column event_activity of the servers table didn't get updated, and hence db1114 was showing no activity there.
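To see whether the activity event for a host is running at all, the event scheduler metadata on the tendril server can be inspected. The query below is a minimal sketch, assuming the schema and event names shown in the error message above (tendril and db1114_eqiad_wmnet_3306_activity); it reports scheduling state only, so a failing event body still has to be confirmed in the error log.

```sql
-- Run on the tendril database server.
-- last_executed shows when the scheduler last started the event.
SELECT event_schema, event_name, status, last_executed
FROM information_schema.events
WHERE event_schema = 'tendril'
  AND event_name   = 'db1114_eqiad_wmnet_3306_activity';
```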
The root cause seems to be this part of the event:

insert into innodb_trx_log   select * from innodb_trx where server_id = @server_id;
insert into innodb_trx   select @server_id, t.* from ${server}_innodb_trx t;

I have dropped db1114 (10.3) and db2112 (10.1) from tendril and re-created them without those two inserts in their events, and they both worked fine.

Those two inserts copy rows from the $servername_3306_innodb_trx table into tendril's innodb_trx and innodb_trx_log tables, which, as far as I know, we don't really use for anything.
I was not sure whether those tables were used by the https://tendril.wikimedia.org/activity?research=0&labsusers=0 page to show long running queries, so I generated two long running queries on db1114 and db2112, and both were reported there with no issues.

So I think we should just comment out those two inserts, as they are not currently being used for anything: https://gerrit.wikimedia.org/r/532296

Event Timeline

Change 532296 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/software/tendril@master] tendril: Disable insert on innodb_trx and innodb_trx_log

https://gerrit.wikimedia.org/r/532296

Marostegui triaged this task as Medium priority. Aug 26 2019, 6:40 AM
Marostegui updated the task description.
Marostegui moved this task from Triage to In progress on the DBA board.

Also, I am even more in favor of continuing to disable things we don't really use anymore, to see if we can reduce the weird memory growth that makes tendril crash after X months. See T231165: db1115 (tendril) paged twice in 24h due to OOM.

Change 532296 merged by Marostegui:
[operations/software/tendril@master] tendril: Disable insert on innodb_trx and innodb_trx_log

https://gerrit.wikimedia.org/r/532296

DannyS712 added a subscriber: DannyS712.

[batch] remove patch for review tag from resolved tasks