Page MenuHomePhabricator

CargoBackLinks::setBackLinks causes deadlock on InnoDB engine
Open, Needs TriagePublicBUG REPORT

Description

List of steps to reproduce (step by step, including full links if applicable):

  • We use the MW JavaScript API to lazy-load expensive template calls that contain {{#cargo_query}} calls (see below).
  • I don't know how to reproduce the issue with a minimal code example, because I don't know how Cargo backlinks work.
  • The JS example below works fine on its own and does not cause the error; when I debug CargoBackLinks::setBackLinks, the resultsPageIds parameter is an empty array.
  • The JS example was tested on a WikiPage in the main namespace containing only the wikitext "Blabla".

What happens?:
The fastest XHR request succeeds; all remaining requests fail with deadlock errors.

What should have happened instead?:
All XHR requests should succeed.

Stack trace

2022-05-12 10:13:21 webmo webmo_webmo: [385926f1d9b1cfdff571c080] /api.php   Wikimedia\Rdbms\DBQueryError: Error 1213: Deadlock found when trying to get lock; try restarting transaction (db:3306)
Function: Wikimedia\Rdbms\Database::insert                                                                                            
Query: INSERT INTO `cargo_backlinks` (cbl_query_page_id,cbl_result_page_id) VALUES (305,'318')                                
                                                                                                                                                                                                                
#0 /var/www/mediawiki/includes/libs/rdbms/database/Database.php(1590): Wikimedia\Rdbms\Database->getQueryException()                                                                                            
#1 /var/www/mediawiki/includes/libs/rdbms/database/Database.php(1564): Wikimedia\Rdbms\Database->getQueryExceptionAndLog()                      
#2 /var/www/mediawiki/includes/libs/rdbms/database/Database.php(1173): Wikimedia\Rdbms\Database->reportQueryError()                             
#3 /var/www/mediawiki/includes/libs/rdbms/database/Database.php(2352): Wikimedia\Rdbms\Database->query()                                                                                                        
#4 /var/www/mediawiki/includes/libs/rdbms/database/Database.php(2332): Wikimedia\Rdbms\Database->doInsert()                   
#5 /var/www/mediawiki/includes/libs/rdbms/database/DBConnRef.php(69): Wikimedia\Rdbms\Database->insert()                                                                                                        
#6 /var/www/mediawiki/includes/libs/rdbms/database/DBConnRef.php(375): Wikimedia\Rdbms\DBConnRef->__call()          
#7 /var/www/mediawiki/extensions/Cargo/includes/CargoBackLinks.php(50): Wikimedia\Rdbms\DBConnRef->insert()                                     
#8 /var/www/mediawiki/extensions/Cargo/includes/parserfunctions/CargoQuery.php(176): CargoBackLinks::setBackLinks()                     
#9 /var/www/mediawiki/includes/parser/Parser.php(3413): CargoQuery::run()                                                                                                                                       
#10 /var/www/mediawiki/includes/parser/Parser.php(3096): Parser->callParserFunction()   
...

JS example

function parse(parseString) {
    var api = new mw.Api();
    api.parse(parseString, { title: mw.config.get('wgTitle') })
        .done(function (data) {
            console.log(data);
        })
        .fail(function (err) {
            console.error(err);
        });
}

var parseString = "{{#cargo_query:tables=my_table}}";

parse(parseString);
parse(parseString);
parse(parseString);
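Since the error only appears when the three requests run concurrently, one possible client-side workaround (a sketch only, not a fix for the underlying Cargo bug) is to chain the requests so each one starts only after the previous one has finished:

```javascript
// Sketch of a client-side workaround (not a fix for the underlying
// Cargo bug): chain the parse requests so each one starts only after
// the previous one has finished, instead of firing them in parallel.
function parseSequentially(parseStrings) {
    return parseStrings.reduce(function (chain, parseString) {
        return chain.then(function () {
            return new mw.Api().parse(parseString, {
                title: mw.config.get('wgTitle')
            });
        });
    }, Promise.resolve());
}
```

Called as `parseSequentially([parseString, parseString, parseString])`, this issues the three parse requests one at a time, so the backlink writes in CargoBackLinks::setBackLinks never run in overlapping transactions.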

Event Timeline

Schtom updated the task description.

This isn't a true fix, but please note that you can delete the cargo_backlinks table, and Cargo will still work (and presumably, you won't get these deadlock errors).

That is not really an option for us, as we have dozens of MediaWiki instances that we provide with deployments and also run update.php for. I think the deleted cargo_backlinks table would be re-created after update.php is run?

Do you think a sort of retry wrapper would make sense? The MySQL docs list that as an option: https://dev.mysql.com/doc/refman/8.0/en/innodb-deadlocks-handling.html
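The retry idea from the MySQL docs could look roughly like the sketch below. This is a JavaScript illustration under the assumption that the failed operation surfaces InnoDB's deadlock error code 1213 (ER_LOCK_DEADLOCK) on the thrown error; an actual fix would live in the extension's PHP, around the insert in CargoBackLinks::setBackLinks.

```javascript
// Sketch of the retry idea from the MySQL docs (not Cargo's actual
// code): rerun an operation when it fails with InnoDB's deadlock
// error, errno 1213 (ER_LOCK_DEADLOCK), up to a fixed attempt limit.
async function runWithDeadlockRetry(operation, maxAttempts) {
    maxAttempts = maxAttempts || 3;
    for (let attempt = 1; ; attempt++) {
        try {
            return await operation();
        } catch (err) {
            // Give up on non-deadlock errors, or when out of attempts.
            if (err.errno !== 1213 || attempt >= maxAttempts) {
                throw err;
            }
            // Brief backoff before restarting the transaction.
            await new Promise(function (resolve) {
                setTimeout(resolve, 50 * attempt);
            });
        }
    }
}
```

The backoff before each restart matters: if both deadlocked parties retry immediately, they can simply deadlock again.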

We are also observing this issue; someone else observed and reported it here:
https://www.mediawiki.org/wiki/Extension_talk:Cargo#Database_query_error_-_Error_1213:_Deadlock_found_when_trying_to_get_lock

In our case, the deadlock somehow involves two wikis that use/share Cargo templates.

When viewing one wiki, wiki_zh's table is somehow deadlocking with it:

mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 1128, 1 row lock(s)
MySQL thread id 14622994, OS thread handle 70370005110880, query id 639459627 10.0.0.245 wiki_admin updating
DELETE /* CargoBackLinks::removeBackLinks  */ FROM `cargo_backlinks` WHERE cbl_query_page_id = 25924

*** (1) HOLDS THE LOCK(S):
RECORD LOCKS space id 7371 page no 9 n bits 640 index PRIMARY of table `wiki_zh`.`cargo_backlinks` trx id 74040591 lock_mode X waiting
Record lock, heap no 527 PHYSICAL RECORD: n_fields 4; compact format; info bits 32
 0: len 4; hex 00006544; asc   eD;;
 1: len 4; hex 00002ab1; asc   * ;;
 2: len 6; hex 00000469c50e; asc    i  ;;
 3: len 7; hex 02000003670306; asc     g  ;;


*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 7371 page no 9 n bits 640 index PRIMARY of table `wiki_zh`.`cargo_backlinks` trx id 74040591 lock_mode X waiting
Record lock, heap no 527 PHYSICAL RECORD: n_fields 4; compact format; info bits 32
 0: len 4; hex 00006544; asc   eD;;
 1: len 4; hex 00002ab1; asc   * ;;
 2: len 6; hex 00000469c50e; asc    i  ;;
 3: len 7; hex 02000003670306; asc     g  ;;


*** (2) TRANSACTION:
TRANSACTION 74040590, ACTIVE 3 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 5 lock struct(s), heap size 1128, 5 row lock(s), undo log entries 3
MySQL thread id 14622996, OS thread handle 70373334831200, query id 639459671 10.0.0.245 wiki_admin update
INSERT /* Wikimedia\Rdbms\Database::insert  */ INTO `cargo_backlinks` (cbl_query_page_id,cbl_result_page_id) VALUES (25924,'1295')

*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 7371 page no 9 n bits 640 index PRIMARY of table `wiki_zh`.`cargo_backlinks` trx id 74040590 lock_mode X
Record lock, heap no 527 PHYSICAL RECORD: n_fields 4; compact format; info bits 32
 0: len 4; hex 00006544; asc   eD;;
 1: len 4; hex 00002ab1; asc   * ;;
 2: len 6; hex 00000469c50e; asc    i  ;;
 3: len 7; hex 02000003670306; asc     g  ;;


*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 7371 page no 9 n bits 640 index PRIMARY of table `wiki_zh`.`cargo_backlinks` trx id 74040590 lock_mode X locks gap before rec insert intention waiting
Record lock, heap no 527 PHYSICAL RECORD: n_fields 4; compact format; info bits 32
 0: len 4; hex 00006544; asc   eD;;
 1: len 4; hex 00002ab1; asc   * ;;
 2: len 6; hex 00000469c50e; asc    i  ;;
 3: len 7; hex 02000003670306; asc     g  ;;

In my case it happened with CargoBackLinks::removeBackLinks(), and the only way I could bypass it was by setting $wgJobRunRate = 0 and then re-running the jobs through runJobs.php.

I've been getting this issue too.

Would it be worth adding a config flag to disable the use of Cargo backlinks? Its main use is for the SemanticDependencyUpdater extension, isn't it?

AFAIK, SemanticDependencyUpdater is only for SMW; cargo_backlinks is Cargo's own dependency updater, and without it no results will be refreshed on edits/deletes.

Right, it's unrelated to SemanticDependencyUpdater, although this feature essentially does for Cargo what SDU does for SMW. Anyway, I think it's a good idea to add a setting to disable the "backlinks" feature - which essentially would just mean not creating the table when the admin calls update.php. (The code already works fine when the cargo_backlinks DB table doesn't exist, so admins can just delete it - the issue is just that it gets re-created whenever update.php is called.) Ideally this would not be necessary, but clearly there are still some bugs left in the implementation.

Okay - I just added the setting $wgCargoIgnoreBacklinks to the latest Cargo code. If you add the following to LocalSettings.php (after the inclusion of Cargo), these problems should go away:

$wgCargoIgnoreBacklinks = true;

Hopefully this is just a temporary setting, until the remaining DB issues are resolved.

I forgot to link the change to this Phabricator task, but here it is:

https://phabricator.wikimedia.org/rECRG66952daf0329c3c16ee32892a905d2b861efa8c3

Is this still an issue? Change a8f0de6382f5, from May 2023, might have fixed it.


> Is this still an issue? Change a8f0de6382f5, from May 2023, might have fixed it.

Error: the string "SELECT" cannot be used within #cargo_query.

Still an issue for me. I updated to 3.4.3 and got the error message above. In addition to adding $wgCargoIgnoreBacklinks = true;, I also needed to delete the cargo_backlinks table completely from the database to make the error message go away.

My where clause is sort of like this:

|where=<!--Do not show vocabulary excluded for the grade cohort-->Vocabulary.nocohort HOLDS NOT '{{#var:cohort|}}' AND Vocabulary.book='{{#var:book}}' AND Vocabulary.unit='{{#var:unit}}' AND (Vocabulary.translation !='' OR Vocabulary.translation IS NOT NULL) AND Vocabulary.wordc HOLDS NOT 'phrase' AND Vocabulary.wordc HOLDS NOT 'collocation' AND Vocabulary.wordc HOLDS NOT 'abbr.' AND (Vocabulary.nospelling !='Yes' OR Vocabulary.nospelling IS NULL) AND (Vocabulary.wordtype='基本词汇'  <!--Only show extended vocabulary that matches the basic vocabulary-->{{#if:{{#arrayprint:xvocid}}|OR Vocabulary._ID IN ({{#arrayprint:xvocid}})}})

Still an issue for me on Cargo 3.5.1. Enabling the new flag remedied our endless page-save/timeout issue. The issue emerged for us on switching from PHP 8.0 to 8.1.