
MW DB query error handling change causes Cargo to break
Closed, Resolved, Public

Description

I'm running MediaWiki 1.31.0-rc.0 (23b8347) and Cargo at 5f40367 (REL1_31 branch).
MariaDB 10.1.26 is used as the database system and Cargo is using a separate database.

When a page is saved, the following exception is thrown, which prevents Cargo from updating any data:

[exception] [6a0075334fa02481d7970820] /index.php?title=Admin:Farnsworth&action=submit   Wikimedia\Rdbms\DBUnexpectedError from line 916 of /var/www/wiki/includes/libs/rdbms/database/Database.php: Wikimedia\Rdbms\Database::close: mass commit/rollback of peer transaction required (DBO_TRX set).
#0 /var/www/wiki/extensions/Cargo/Cargo.hooks.php(186): Wikimedia\Rdbms\Database->close()
#1 /var/www/wiki/extensions/Cargo/Cargo.hooks.php(214): CargoHooks::deletePageFromSystem(integer)
#2 /var/www/wiki/includes/Hooks.php(177): CargoHooks::onPageContentSaveComplete(WikiPage, User, WikitextContent, string, integer, NULL, NULL, integer, Revision, Status, boolean, integer)
#3 /var/www/wiki/includes/Hooks.php(205): Hooks::callHook(string, array, array, NULL)
#4 /var/www/wiki/includes/page/WikiPage.php(1853): Hooks::run(string, array)
#5 [internal function]: WikiPage->{closure}(Wikimedia\Rdbms\DatabaseMysqli, string)
#6 /var/www/wiki/includes/libs/rdbms/database/Database.php(3664): call_user_func_array(Closure, array)
#7 /var/www/wiki/includes/deferred/AtomicSectionUpdate.php(35): Wikimedia\Rdbms\Database->doAtomicSection(string, Closure)
#8 /var/www/wiki/includes/deferred/DeferredUpdates.php(259): AtomicSectionUpdate->doUpdate()
#9 /var/www/wiki/includes/deferred/DeferredUpdates.php(210): DeferredUpdates::runUpdate(AtomicSectionUpdate, Wikimedia\Rdbms\LBFactorySimple, string, integer)
#10 /var/www/wiki/includes/deferred/DeferredUpdates.php(127): DeferredUpdates::execute(array, string, integer)
#11 /var/www/wiki/includes/MediaWiki.php(606): DeferredUpdates::doUpdates(string, integer)
#12 /var/www/wiki/includes/MediaWiki.php(575): MediaWiki::preOutputCommit(RequestContext, Closure)
#13 /var/www/wiki/includes/MediaWiki.php(877): MediaWiki->doPreOutputCommit(Closure)
#14 /var/www/wiki/includes/MediaWiki.php(524): MediaWiki->main()
#15 /var/www/wiki/index.php(42): MediaWiki->run()
#16 {main}

Event Timeline

Here is the SQL Dump from right before the exception is thrown:

[DBQuery] wiki BEGIN /* WikiPage::doModify  */
[DBQuery] wiki SAVEPOINT /* WikiPage::doModify  */ `wikimedia_rdbms_atomic1`
WikiPage::doEditUpdates: Using prepared edit...
[caches] parser: EmptyBagOStuff
[DBQuery] wiki SELECT /* LinkCache::fetchPageRow  */  page_id,page_len,page_is_redirect,page_latest,page_content_model  FROM `page`    WHERE page_namespace = '3002' AND page_title = 'Farnsworth'  LIMIT 1
[DBQuery] wiki SELECT /* Wikimedia\Rdbms\Database::select  */  table_name  FROM `cargo_pages`    WHERE page_id = '1962'
[DBQuery] wiki SELECT /* Wikimedia\Rdbms\Database::select  */  field_tables  FROM `cargo_tables`    WHERE main_table = 'Serverinfo'
[DBQuery] wiki SELECT /* Wikimedia\Rdbms\Database::select  */  field_tables  FROM `cargo_tables`    WHERE main_table = '_pageData'
[DBQuery] wiki DELETE /* Wikimedia\Rdbms\Database::delete  */ FROM `cargo_pages` WHERE page_id = '1962'
[DBQuery] wiki ROLLBACK /* WikiPage::doModify  */ TO SAVEPOINT `wikimedia_rdbms_atomic1`
[DBQuery] wiki ROLLBACK /* MWExceptionHandler::rollbackMasterChangesAndLog  */
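For illustration, the transaction sequence in the log above can be replayed in isolation. This is only a minimal sketch: it uses Python's built-in sqlite3 module as a stand-in for MariaDB (an assumption made purely so the example is self-contained), and the row inserted into cargo_pages is made up. Only the order BEGIN, SAVEPOINT, DELETE, ROLLBACK TO SAVEPOINT, ROLLBACK is taken from the log. It shows why the DELETE on cargo_pages does not persist once the exception handler rolls everything back.

```python
import sqlite3

# isolation_level=None puts the sqlite3 module in autocommit mode, so the
# BEGIN / SAVEPOINT / ROLLBACK statements below are sent exactly as written.
conn = sqlite3.connect(":memory:", isolation_level=None)

# Made-up stand-in for the cargo_pages table; committed before the transaction.
conn.execute("CREATE TABLE cargo_pages (page_id INTEGER, table_name TEXT)")
conn.execute("INSERT INTO cargo_pages VALUES (1962, 'Serverinfo')")

# Same statement order as in the debug log.
conn.execute("BEGIN")
conn.execute("SAVEPOINT wikimedia_rdbms_atomic1")
conn.execute("DELETE FROM cargo_pages WHERE page_id = 1962")
rows_inside = conn.execute("SELECT COUNT(*) FROM cargo_pages").fetchone()[0]

# The exception triggers both rollbacks, undoing the DELETE.
conn.execute("ROLLBACK TO SAVEPOINT wikimedia_rdbms_atomic1")
conn.execute("ROLLBACK")
rows_after = conn.execute("SELECT COUNT(*) FROM cargo_pages").fetchone()[0]

print(rows_inside, rows_after)
```

Inside the transaction the row is gone, but after the two rollbacks it is back, which matches the observation that Cargo's data is not updated on save.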

The problem is most likely caused by a change in MediaWiki, not in Cargo. Since there were some DB changes lately, I tried different Cargo versions going back to d23c1e8d (version 1.6), which I am pretty sure still worked at the time.

These commits seem related: https://gerrit.wikimedia.org/r/#/c/424375/ https://gerrit.wikimedia.org/r/#/c/421496/

Yaron_Koren renamed this task from "Exception on page save: mass commit/rollback of peer transaction required (DBO_TRX set)" to "MW DB query error handling change causes Cargo to break". Apr 23 2018, 1:57 PM

@Julien.Schmidt - thanks for the great detective work in diagnosing the issue only a few weeks after it became a problem. I'm not currently running the latest MediaWiki code, although I hope to do so soon so that I can try to fix this problem as soon as possible. It looks like MediaWiki thinks there are still uncompleted transactions when Cargo starts its own transaction (by calling startAtomic()), but I don't know more than that.
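
To make the failure mode in the stack trace concrete (close() being called while a transaction is still considered open), here is a toy model in Python. Every name in it (ToyConnection, PendingTransactionError, try_close) is hypothetical, and it does not reproduce MediaWiki's actual checks in Database::close(). It only sketches the general idea of a handle that refuses to close while an atomic section is pending.

```python
class PendingTransactionError(RuntimeError):
    """Raised when close() is called while a transaction is still open."""


class ToyConnection:
    """Hypothetical stand-in for a DB handle; not MediaWiki's Database class."""

    def __init__(self):
        self.atomic_depth = 0  # number of currently open atomic sections
        self.closed = False

    def start_atomic(self):
        self.atomic_depth += 1

    def end_atomic(self):
        if self.atomic_depth == 0:
            raise RuntimeError("no atomic section to end")
        self.atomic_depth -= 1

    def close(self):
        # Refuse to close while work is pending, instead of silently losing it.
        if self.atomic_depth > 0:
            raise PendingTransactionError(
                "close() called with a pending transaction"
            )
        self.closed = True


def try_close(conn):
    """Attempt to close the connection and report what happened."""
    try:
        conn.close()
        return "closed"
    except PendingTransactionError as exc:
        return f"refused: {exc}"


conn = ToyConnection()
conn.start_atomic()
early = try_close(conn)  # atomic section still open
conn.end_atomic()
late = try_close(conn)   # nothing pending any more

print(early)
print(late)
```

In this sketch the first close attempt is refused and the second succeeds, which suggests the general shape of a fix: finish or avoid the pending transaction before closing the handle.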