Page MenuHomePhabricator

Wikibase\Repo\Store\Sql\SqlIdGenerator::generateNewId deadlock on wb_id_counters when running selenium tests in parallel
Closed, ResolvedPublic

Description

The webdriver.io Selenium tests are run serially. As part of T226869 I have made a change to run up to 4 tests in parallel with https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Wikibase/+/545650 which has failures.

repo/tests/selenium/specs/item.js
Error in "item.can add a statement using the keyboard"
Error: internal_api_error_DBQueryError: [3b6a7dfcb49300fd1f95a0e6] Exception caught: A database query error has occurred. This may indicate a bug in the software.

There is db error attached to the CI build:

Wed Jan 5 20:01:59 UTC 2022	030379ee3acc	wikidb	Error 1213 from Wikibase\Repo\Store\Sql\SqlIdGenerator::generateNewId, Deadlock found when trying to get lock; try restarting transaction (localhost:/workspace/db/quibble-mysql-fkbo30y1/socket) INSERT INTO `wb_id_counters` (id_value,id_type) VALUES (1,'wikibase-item') localhost:/workspace/db/quibble-mysql-fkbo30y1/socket
#0 /workspace/src/includes/libs/rdbms/database/Database.php(1758): Wikimedia\Rdbms\Database->getQueryExceptionAndLog(string, integer, string, string)
#1 /workspace/src/includes/libs/rdbms/database/Database.php(1302): Wikimedia\Rdbms\Database->reportQueryError(string, integer, string, string, boolean)
#2 /workspace/src/includes/libs/rdbms/database/Database.php(2530): Wikimedia\Rdbms\Database->query(string, string, integer)
#3 /workspace/src/includes/libs/rdbms/database/Database.php(2510): Wikimedia\Rdbms\Database->doInsert(string, array, string)
#4 /workspace/src/extensions/Wikibase/repo/includes/Store/Sql/SqlIdGenerator.php(103): Wikimedia\Rdbms\Database->insert(string, array, string)
#5 /workspace/src/extensions/Wikibase/repo/includes/Store/Sql/SqlIdGenerator.php(59): Wikibase\Repo\Store\Sql\SqlIdGenerator->generateNewId(Wikimedia\Rdbms\DatabaseMysqli, string)
#6 /workspace/src/extensions/Wikibase/repo/includes/Store/RateLimitingIdGenerator.php(37): Wikibase\Repo\Store\Sql\SqlIdGenerator->getNewId(string)
#7 /workspace/src/extensions/Wikibase/repo/includes/Store/Sql/WikiPageEntityStore.php(166): Wikibase\Repo\Store\RateLimitingIdGenerator->getNewId(string)
#8 /workspace/src/extensions/Wikibase/lib/includes/Store/TypeDispatchingEntityStore.php(59): Wikibase\Repo\Store\Sql\WikiPageEntityStore->assignFreshId(Wikibase\DataModel\Entity\Item)
#9 /workspace/src/extensions/Wikibase/repo/includes/Api/EntitySavingHelper.php(298): Wikibase\Lib\Store\TypeDispatchingEntityStore->assignFreshId(Wikibase\DataModel\Entity\Item)
#10 /workspace/src/extensions/Wikibase/repo/includes/Api/EntitySavingHelper.php(228): Wikibase\Repo\Api\EntitySavingHelper->createEntity(string, NULL, string)
#11 /workspace/src/extensions/Wikibase/repo/includes/Api/ModifyEntity.php(381): Wikibase\Repo\Api\EntitySavingHelper->loadEntity(array, NULL, string)
#12 /workspace/src/extensions/Wikibase/repo/includes/Api/ModifyEntity.php(305): Wikibase\Repo\Api\ModifyEntity->loadEntityFromSavingHelper(NULL)
#13 /workspace/src/includes/api/ApiMain.php(1889): Wikibase\Repo\Api\ModifyEntity->execute()
#14 /workspace/src/includes/api/ApiMain.php(868): ApiMain->executeAction()
#15 /workspace/src/includes/api/ApiMain.php(839): ApiMain->executeActionWithErrorHandling()
#16 /workspace/src/api.php(90): ApiMain->execute()
#17 /workspace/src/api.php(45): wfApiMain()
#18 {main}

This can be reproduced by setting wdio maxInstances to 4:

https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Wikibase/+/545650

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript

Maybe related is T194299, there are mentions of $wgWBRepoSettings['idGenerator'] = 'mysql-upsert', I don't know what the default is or what is set on the CI job.

The comment T194299#4840260 has the same deadlock on INSERT then there are tasks to use a separate database connection for the id generator: T213817 and T215147.

I am guessing the CI config would require a similar adjustment?

\Wikibase\Repo\Store\Sql\SqlIdGenerator definitely looks prone to deadlocks. It should probably work more like TableNameStore (named locks + auto-commit trx).

I am guessing the CI config would require a similar adjustment?

Yes, we might want to set

$wgWBRepoSettings['idGenerator'] = 'mysql-upsert';

However, that’s MySQL/MariaDB-specific (as the value implies), and I don’t know if we support / plan to support running Wikibase CI on other RDBMSes, so I’m not sure if this belongs into Wikibase’s repo/config/Wikibase.ci.php or should be somewhere else.

Or maybe the default config should be that Wikibase autodetects whether MySQL is used or not? (In production I think we’d still want to explicitly configure this particular implementation.) The UpsertSqlIdGenerator is well tested in production by this point, and I’m not aware of any reason not to use it if you’re on MySQL.

It looks like this is now sometimes happening even with non-parallel builds (unless you changed something so they’re parallel by default?), e.g. in #96773 for change 731277.

Change 754495 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/Wikibase@master] Support idGenerator=auto and use it in CI

https://gerrit.wikimedia.org/r/754495

I went with a compromise between the two suggestions, introducing an auto type but only configuring it in CI and not yet making it the default.

Change 754500 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/core@master] Add ConnectionManager::getLazyWriteConnectionRef()

https://gerrit.wikimedia.org/r/754500

Change 754500 merged by jenkins-bot:

[mediawiki/core@master] Add ConnectionManager::getLazyWriteConnectionRef()

https://gerrit.wikimedia.org/r/754500

Change 754495 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@master] Support idGenerator=auto and use it in CI

https://gerrit.wikimedia.org/r/754495

Alright, let’s see if these errors happen less often now.

Wikibase CI is looking much better as far as I can tell; https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Wikibase/+/545650 still has errors after rebasing, but they’re different ones now. I think we can close this particular task as done.