Page MenuHomePhabricator

Run Quibble Mariadb server with innodb_print_all_deadlocks
Closed, ResolvedPublic

Description

When running Selenium tests concurrently they sometime choke with a deadlock on the user table.

The issue is INSERT statements share a lock on user_name index and that was found on T199393#4420174 by running a local MariaDB with innodb_print_all_deadlocks which gives a useful trace:

2018-07-12 11:17:06 7f60b4c98700InnoDB: transactions deadlock detected, dumping detailed information.
2018-07-12 11:17:06 7f60b4c98700
*** (1) TRANSACTION:
TRANSACTION 36589276, ACTIVE 21 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 4 lock struct(s), heap size 1184, 2 row lock(s), undo log entries 1
MySQL thread id 205, OS thread handle 0x7f60b79d0700, query id 2571 localhost root update
INSERT IGNORE INTO `user` (user_name,user_password,user_newpassword,user_email,user_email_authenticated,user_real_name,user_token,user_registration,user_editcount,user_touched) VALUES ('T199393.153140698912290','','','',NULL,'','b0ad88565a69596f779b0eb41607cc12','20180712144949','0','20180712144954')
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1189014 page no 4 n bits 144 index `user_name` of table `mywiki`.`user` trx table locks 2 total table locks 4  trx id 36589276 lock_mode X locks gap before rec insert intention waiting lock hold time 4 wait time before grant 0
*** (2) TRANSACTION:
TRANSACTION 36589277, ACTIVE 10 sec inserting
mysql tables in use 1, locked 1
4 lock struct(s), heap size 1184, 2 row lock(s), undo log entries 1
MySQL thread id 206, OS thread handle 0x7f60b4c98700, query id 2572 localhost root update
INSERT IGNORE INTO `user` (user_name,user_password,user_newpassword,user_email,user_email_authenticated,user_real_name,user_token,user_registration,user_editcount,user_touched) VALUES ('T199393.153140698912291','','','',NULL,'','b0ad88565a69596f779b0eb41607cc12','20180712144949','0','20180712144954')
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 1189014 page no 4 n bits 144 index `user_name` of table `mywiki`.`user` trx table locks 2 total table locks 4  trx id 36589277 lock mode S locks gap before rec lock hold time 10 wait time before grant 0
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1189014 page no 4 n bits 144 index `user_name` of table `mywiki`.`user` trx table locks 2 total table locks 4  trx id 36589277 lock_mode X locks gap before rec insert intention waiting lock hold time 0 wait time before grant 0
*** WE ROLL BACK TRANSACTION (2)

It is unclear to whether it ends up in the MediaWiki $wgDBerrorLog or whether we should configure a MariaDB logfile pointing to a file under LOG_DIR.

MariaDB is spawned by Quibble which I think is the best place to add the option. The setting can be enabled using:

/usr/sbin/mysqld --innodb-print-all-deadlocks

There is another option:

--innodb-deadlock-report=name
How to report deadlocks (if innodb_deadlock_detect=ON)
One of: off, basic, full

Mariadb on Debian seems to default to log errors to stdout. They can be redirected to a file using:

--log-error[=name]
Log errors to file (instead of stdout). If file name is not specified then datadir/log-basename.err or the pid-file path with extension .err is used.

Related Objects

StatusSubtypeAssignedTask
OpenNone
ResolvedFeaturekostajh
OpenNone
OpenNone
ResolvedNone
ResolvedNone
Resolvedawight
OpenNone
Resolvedcscott
Resolvedkostajh
OpenNone
Resolvedkostajh
Resolvedhashar
OpenNone
ResolvedPRODUCTION ERRORhoo
ResolvedDreamy_Jazz
OpenNone
OpenNone
Resolvedhashar

Event Timeline

@hashar I am not really sure what is required from us here. I don't think we own this server (to be honest I don't even know on which host it is running). Can you clarify what you'd need from us?

Sorry I probably tagged you too early. For MediaWiki tests we spawn a MariaDB server on the fly and my intent is to have it print deadlocks details so they can be attached to the Jenkins build. I will test it out using mysqld --innodb-print-all-deadlocks --log-error="$LOG_DIR/mariadb-error.txt" and see whether that captures what I expect. Maybe MediaWiki can later be taught to retrieve the info if it is somehow exposed.

I will add you team back if I need some expertise :]

Change 939316 had a related patch set uploaded (by Hashar; author: Hashar):

[integration/quibble@master] Set MySQL error log to $LOG_DIR

https://gerrit.wikimedia.org/r/939316

Change 939634 had a related patch set uploaded (by Hashar; author: Hashar):

[integration/quibble@master] (DO NOT SUBMIT) CI Mariadb error log with checkuser

https://gerrit.wikimedia.org/r/939634

Change 939650 had a related patch set uploaded (by Hashar; author: Hashar):

[integration/quibble@master] Print all deadlocks to MariaDB error log

https://gerrit.wikimedia.org/r/939650

Change 939316 merged by jenkins-bot:

[integration/quibble@master] Set MySQL error log to $LOG_DIR

https://gerrit.wikimedia.org/r/939316

Change 939650 merged by jenkins-bot:

[integration/quibble@master] Print all deadlocks to MariaDB error log

https://gerrit.wikimedia.org/r/939650

Change 939634 abandoned by Hashar:

[integration/quibble@master] (DO NOT SUBMIT) CI Mariadb error log with checkuser

Reason:

True, thank you ! :)

https://gerrit.wikimedia.org/r/939634

This will be in the next Quibble version after 1.5.5. The deadlock will be emitted in the MySQL error log which will be attached to the builds as log/mysql-error.log once CI jobs are switched to the new Quibble version.

Change 958952 had a related patch set uploaded (by Hashar; author: Hashar):

[integration/quibble@master] release: Quibble 1.5.6

https://gerrit.wikimedia.org/r/958952

Change 958952 merged by jenkins-bot:

[integration/quibble@master] release: Quibble 1.5.6

https://gerrit.wikimedia.org/r/958952

Change 959804 had a related patch set uploaded (by Hashar; author: Hashar):

[integration/config@master] dockerfiles: update Quibble to 1.5.6

https://gerrit.wikimedia.org/r/959804

Change 959804 merged by jenkins-bot:

[integration/config@master] dockerfiles: update Quibble to 1.5.6

https://gerrit.wikimedia.org/r/959804

Change 963021 had a related patch set uploaded (by Hashar; author: Hashar):

[integration/config@master] jjb: update Quibble jobs to 1.5.6

https://gerrit.wikimedia.org/r/963021

Change 963021 merged by jenkins-bot:

[integration/config@master] jjb: update Quibble jobs to 1.5.6

https://gerrit.wikimedia.org/r/963021

I have updated all the Jenkins jobs to use Quibble 1.5.6 which now spawns MariaDB with --innodb-print-all-deadlocks. The log should be attached to the builds as log/mysql-error.log.