Page MenuHomePhabricator

CI failure from non-fatal SqlModuleDependencyStore "database is locked" warning
Closed, ResolvedPublic

Description

https://gerrit.wikimedia.org/r/c/mediawiki/extensions/AbuseFilter/+/768705/

@Krinkle wrote:

I assume this is a consequence of T225730, which started running some steps in parallel and more recently also enabled concurrent requests during QUnit tests (by using Apache) which increases likelihood of these queries happening at the "wrong" time.

Having said that, the issue is genuine and should be fixed. Note that this is not a fatal error. The reason it resulted in a failed build is that after all tests were done (and passing), we check the mw-dberror.log file and we logged this recoverable error there. As such, I imagine the fix to be to make this more tolerant to races and make it not produce an error-level log message.

mw-dberror.log
Error 5 from Wikimedia\DependencyStore\SqlModuleDependencyStore::storeMulti, database is locked UPDATE  module_deps SET md_deps = '["extensions/VisualEditor/lib/ve/src/ce/styles/../../ui/styles/images/unchecked.svg"]' WHERE (md_module = 'ext.visualEditor.checkList' AND md_skin = 'fallback|qqx') [Null]

Wikimedia\Rdbms\Database->upsert(string, array, array, array, string)
SqlModuleDependencyStore->storeMulti(string, array, integer)
ResourceLoader->{closure}()
DeferredUpdates::run(MWCallableUpdate, ..)
..
DeferredUpdates::doUpdates()
MediaWiki->restInPeace()
MediaWiki->doPostOutputShutdown()
/workspace/src/load.php(38): wfLoadMain()
mw-error.log
/load.php?modules=ext.visualEditor...%7Cext.visualEditor.mwimage.core&version=1g2nn
Wikimedia\DependencyStore\DependencyStoreException: Error 5: database is locked

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript

The error initially had me confused as it seems impossible to cause a lock on this row given we have an external lock on this in the calling code already before we get to the db query.

Then I realised its a "database lock", meaning the entire database is locked. I'm not aware of a way to trigger this under normal conditions in MySQL.

Then I realised this failure was from the sqlite build, not the mysql one.

https://integration.wikimedia.org/ci/job/quibble-vendor-sqlite-php72-noselenium-docker/1396/console

So the issue then is that the db file is locked. That makes slightly more sense given we're doing concurrent write operations from multiple requests, including with multiple connections from the same request given SqlModuleDepStore uses its own independent autocommit primary-db connection handle.

I think use of sqlite and apache should be considered mutually exclusive. It was probably turned on for Selenium without the realization that this has an sqlite option. And more generally it seems like the way we manage sqlite connections currently, these are not meant to handle concurrency anyway so it'd likely not be notably faster without also being more unstable.

SqliteInstaller should probably set $wgMainStash to CACHE_DB since it already generates a separate DB with optimized settings (this error should be unlikely with the default PDO lock timeout)

We're encountering this with SqlModuleDependencyStore which is given the main LoadBalancer from the LBFactory service, not SqlBagOStuff/MainStash.

Change 769138 had a related patch set uploaded (by Aaron Schulz; author: Aaron Schulz):

[mediawiki/core@master] Use CACHE_DB for $wgMainCache in SqliteInstaller

https://gerrit.wikimedia.org/r/769138

Krinkle triaged this task as Medium priority.

Change 769138 merged by jenkins-bot:

[mediawiki/core@master] installer: make 'db-replicated' an alias for CACHE_DB for sqlite

https://gerrit.wikimedia.org/r/769138