Page MenuHomePhabricator

Lock wait timeout exceeded in GlobalUsage::insertLinks
Closed, ResolvedPublicPRODUCTION ERROR

Description

1205 Lock wait timeout exceeded; try restarting transaction (10.0.6.41) INSERT INTO globalimagelinks (gil_wiki,gil_page,gil_page_namespace_id,gil_page_namespace,gil_page_title,gil_to) VALUES ('enwikisource','948233','0','','Abd_Al_Rahim_Abdul_Rassak_Janko','Abd_al-Rahim-1.ogg'),('enwikisource','948233','0','','Abd_Al_Rahim_Abdul_Rassak_Janko','FBI_Abd-al-Rahim.jpg')


Version: unspecified
Severity: normal

Details

Reference
bz28600

Event Timeline

bzimport raised the priority of this task from to Low.Nov 21 2014, 11:31 PM
bzimport added a project: GlobalUsage.
bzimport set Reference to bz28600.

Abigor wrote:

Currently when I try to import the content to ar.wikiversity.org I keep hitting this error:

Import is mislukt: A database error has occurred. Did you forget to run maintenance/update.php after upgrading? See: http://www.mediawiki.org/wiki/Manual:Upgrading#Run_the_update_script Query: INSERT INTO globalimagelinks (gil_wiki,gil_page,gil_page_namespace_id,gil_page_namespace,gil_page_title,gil_to) VALUES ('arwikiversity','1823','10','قالب','GFDL','Heckert_GNU_white.svg'),('arwikiversity','1823','10','قالب','GFDL','GFDL') Function: GlobalUsage::insertLinks Error: 1062 Duplicate entry 'Heckert_GNU_white.svg-arwikiversity-1823' for key 'PRIMARY' (10.0.6.41)

This is causing serieus problems for starting the project.

I had this error also 2 times for xmf.wikipedia.org

Abigor wrote:

The xml is here: http://evakrap.nl/wikimedia/

Since I cant upload it because its to big.

Bryan.TongMinh wrote:

I don't understand the cause of this. The current procedure is this:

  1. Fetch links already in globalimagelinks
  2. Determine which links are not yet in there
  3. Insert links not yet in globalimagelinks in the database.

Somehow between 1. and 3. an entry gets added to the database.

Doing INSERT IGNORE will probably solve the symptoms in both #1 and #2, but hide the actual issue. But I think that is not really bad for those kind of race conditions.

(In reply to comment #3)

I don't understand the cause of this. The current procedure is this:

  1. Fetch links already in globalimagelinks
  2. Determine which links are not yet in there
  3. Insert links not yet in globalimagelinks in the database.

Somehow between 1. and 3. an entry gets added to the database.

Doing INSERT IGNORE will probably solve the symptoms in both #1 and #2, but
hide the actual issue. But I think that is not really bad for those kind of
race conditions.

It's almost like we've got a stray caller that is pushing it doubley. Maybe.

Bryan.TongMinh wrote:

Should be fixed in r92241.

Using action=purge&forcelinkupdate= gives sometimes:

internal_api_error_DBQueryError=Database query error

#0 /usr/local/apache/common-local/php-1.19/includes/db/Database.php(887): DatabaseBase->reportQueryError('Lock wait timeo...', 1205, 'INSERT IGNORE I...', 'GlobalUsage::in...', false)
#1 /usr/local/apache/common-local/php-1.19/includes/db/Database.php(1688): DatabaseBase->query('INSERT IGNORE I...', 'GlobalUsage::in...')
#2 /usr/local/apache/common-local/php-1.19/extensions/GlobalUsage/GlobalUsage_body.php(40): DatabaseBase->insert('globalimagelink...', Array, 'GlobalUsage::in...', Array)
#3 /usr/local/apache/common-local/php-1.19/extensions/GlobalUsage/GlobalUsageHooks.php(51): GlobalUsage->insertLinks(Object(Title), Array)
#4 [internal function]: GlobalUsageHooks::onLinksUpdateComplete(Object(LinksUpdate))
#5 /usr/local/apache/common-local/php-1.19/includes/Hooks.php(216): call_user_func_array('GlobalUsageHook...', Array)
#6 /usr/local/apache/common-local/php-1.19/includes/GlobalFunctions.php(3794): Hooks::run('LinksUpdateComp...', Array)
#7 /usr/local/apache/common-local/php-1.19/includes/LinksUpdate.php(114): wfRunHooks('LinksUpdateComp...', Array)
#8 /usr/local/apache/common-local/php-1.19/includes/api/ApiPurge.php(97): LinksUpdate->doUpdate()
#9 /usr/local/apache/common-local/php-1.19/includes/api/ApiMain.php(706): ApiPurge->execute()
#10 /usr/local/apache/common-local/php-1.19/includes/api/ApiMain.php(360): ApiMain->executeAction()
#11 /usr/local/apache/common-local/php-1.19/includes/api/ApiMain.php(344): ApiMain->executeActionWithErrorHandling()
#12 /usr/local/apache/common-local/php-1.19/api.php(117): ApiMain->execute()
#13 /usr/local/apache/common-local/live-1.5/api.php(3): require('/usr/local/apac...')
#14 {main}

  • Bug 42021 has been marked as a duplicate of this bug. ***
  • Bug 45148 has been marked as a duplicate of this bug. ***

The only operations done on globalimagelinks are SELECTs with no lock modifier, INSERTs and DELETEs. AFAIK (which is not very far) InnoDB SELECTs never lock and INSERT/DELETE only lock the affected rows and the gap before them (or around, for DELETE) so the only way an INSERT (which affects the last gap) can be blocked is when the last record in the table gets deleted and the transaction doing that deletion runs very long (the current lock wait timeout is 15s).

That would mean the locking query has to be GlobalUsage::deleteLinksFromPage() or GlobalUsage::deleteLinksToFile but both of those have covering indexes so they should be fast. Maybe this happens when someone deletes an image that's used on a huge number of pages?

@demon, is this something that happens frequently? I looked at fluorine but the only lock wait timeout error I can see is something about translate_groupstats.

I don't think it happens frequently, no.

It is not happening frequently, but it is creating log spam here and there. Example it just happened:

{
  "_index": "logstash-2016.01.28",
  "_type": "mediawiki",
  "_id": "AVKIeOrsptxhN1Xa1dNc",
  "_score": null,
  "_source": {
    "message": "GlobalUsage::deleteLinksFromPage\t10.64.16.29\t1205\tLock wait timeout exceeded; try restarting transaction (10.64.16.29)\tDELETE FROM `globalimagelinks` WHERE gil_wiki = 'ruwikisource' AND gil_page = '186681' AND gil_to IN ('Brockhaus_Lexikon.jpg','Fictional_page.svg','Wikipedia-logo.png') ",
    "@version": 1,
    "@timestamp": "2016-01-28T13:42:51.000Z",
    "type": "mediawiki",
    "host": "mw1114",
    "level": "ERROR",
    "tags": [
      "syslog",
      "es",
      "es",
      "normalized_message_trimmed"
    ],
    "channel": "wfLogDBError",
    "url": "/w/api.php?action=purge&format=xml",
    "ip": "10.64.32.105",
    "http_method": "POST",
    "server": "ru.wikisource.org",
    "referrer": null,
    "uid": "b47f92a",
    "process_id": 16834,
    "wiki": "ruwikisource",
    "db_server": "10.64.16.29",
    "db_name": "commonswiki",
    "db_user": "wikiuser",
    "method": "DatabaseBase::reportQueryError",
    "errno": 1205,
    "error": "Lock wait timeout exceeded; try restarting transaction (10.64.16.29)",
    "sql1line": "DELETE FROM `globalimagelinks` WHERE gil_wiki = 'ruwikisource' AND gil_page = '186681' AND gil_to IN ('Brockhaus_Lexikon.jpg','Fictional_page.svg','Wikipedia-logo.png') ",
    "fname": "GlobalUsage::deleteLinksFromPage",
    "normalized_message": "GlobalUsage::deleteLinksFromPage\t10.64.16.29\t1205\tLock wait timeout exceeded; try restarting transaction (10.64.16.29)\tDELETE FROM `globalimagelinks` WHERE gil_wiki = 'ruwikisource' AND gil_page = '186681' AND gil_to IN ('Brockhaus_Lexikon.jpg','Fictional"
  },
  "sort": [
    1453988571000
  ]
}
Krinkle claimed this task.
Krinkle subscribed.

Not seen in the last 30 days.

mmodell changed the subtype of this task from "Task" to "Production Error".Aug 28 2019, 11:12 PM