
Wikimedia\Rdbms\DBTransactionSizeError trying to delete a file, exceeding the 3s limit
Open, Needs Triage | Public | Production Error

Description

I got

[0ada691f-519e-46a9-8b11-ff64ea27dc06] 2023-02-06 12:09:15: Fatal exception of type "Wikimedia\Rdbms\DBTransactionSizeError"

while trying to delete https://commons.wikimedia.org/wiki/File:John_And_Psalms.pdf

from /srv/mediawiki/php-1.40.0-wmf.21/includes/libs/rdbms/loadbalancer/LoadBalancer.php(1594)
#0 /srv/mediawiki/php-1.40.0-wmf.21/includes/libs/rdbms/lbfactory/LBFactory.php(387): Wikimedia\Rdbms\LoadBalancer->approvePrimaryChanges(integer, string)
#1 /srv/mediawiki/php-1.40.0-wmf.21/includes/MediaWiki.php(679): Wikimedia\Rdbms\LBFactory->commitPrimaryChanges(string, integer)
#2 /srv/mediawiki/php-1.40.0-wmf.21/includes/MediaWiki.php(649): MediaWiki::preOutputCommit(RequestContext)
#3 /srv/mediawiki/php-1.40.0-wmf.21/includes/MediaWiki.php(928): MediaWiki->doPreOutputCommit()
#4 /srv/mediawiki/php-1.40.0-wmf.21/includes/MediaWiki.php(571): MediaWiki->main()
#5 /srv/mediawiki/php-1.40.0-wmf.21/index.php(50): MediaWiki->run()
#6 /srv/mediawiki/php-1.40.0-wmf.21/index.php(46): wfIndexMain()
#7 /srv/mediawiki/w/index.php(3): require(string)
#8 {main}

/w/index.php?action=delete&title=File:JohnAndPsalms.pdf Wikimedia\Rdbms\DBTransactionSizeError: Transaction spent 3.416s in writes, exceeding the 3s limit

Details

MediaWiki Version
1.40.0-wmf.21

Event Timeline

Aklapper renamed this task from Fatal exception of type "Wikimedia\Rdbms\DBTransactionSizeError" while trying deleting a file to Wikimedia\Rdbms\DBTransactionSizeError trying to delete a file, exceeding the 3s limit. (Feb 6 2023, 8:44 PM)
Aklapper changed the subtype of this task from "Bug Report" to "Production Error".
Aklapper updated the task description.
Aklapper set Release Version to 1.40.0-wmf.21.

As far as I can tell, DBTransactionSizeError is thrown in only one situation: when the commit takes too long, i.e. when writing the change to the database exceeds the time limit.

@Ladsgroup, any idea what is happening here? Is the query too complex? Or were there DB problems last Monday?

Hi, sorry for the late response. I was sick.

This is a tough one. If it's happening consistently for this particular image, then it's an issue.
It can be one of many things:

  • The deleter removes the row from the image table and adds it to the filearchive table. If the row has very large blobs, such as djvu/pdf metadata, that could be the reason.
  • The more likely culprit is that the DB writes are not all happening at the same time. Let me explain a bit. When a write is triggered, MW starts a transaction (and the timer) and wraps everything in that transaction (it's called an implicit transaction); if one write fails, everything gets rolled back. My guess is that something writes to the DB, then some part of MW tries to do something complicated before doing the rest of the writes, extending the transaction time and making it time out.

I can figure out exactly what is happening here if you can confirm that this deletion fails consistently (so that I can reproduce the error).
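
The two hypotheses above can be compared with a small toy model. This is only a sketch, not MediaWiki code: `ToyTransaction`, `ToyTransactionSizeError`, the `count_other_work` switch and every duration are hypothetical, chosen just to mirror the 3s limit and the 3.416s figure in the error message. Whether the real timer counts only time spent in write queries or also the work done between them is not settled in this thread, so the model lets you try both.

```python
from dataclasses import dataclass


class ToyTransactionSizeError(Exception):
    """Toy stand-in for the DBTransactionSizeError seen in the report."""


@dataclass
class ToyTransaction:
    limit_seconds: float = 3.0   # the 3s limit quoted in the error message
    write_seconds: float = 0.0   # time spent inside write queries
    other_seconds: float = 0.0   # non-write work done while the transaction is open

    def write(self, seconds: float) -> None:
        self.write_seconds += seconds

    def other_work(self, seconds: float) -> None:
        self.other_seconds += seconds

    def commit(self, count_other_work: bool = False) -> float:
        # Assumption: the limit is checked once, at commit time.
        spent = self.write_seconds
        if count_other_work:
            spent += self.other_seconds
        if spent > self.limit_seconds:
            raise ToyTransactionSizeError(
                f"Transaction spent {spent:.3f}s, "
                f"exceeding the {self.limit_seconds:g}s limit"
            )
        return spent


def large_blob_delete() -> ToyTransaction:
    """Hypothesis 1: a single write is slow because the row carries big metadata."""
    tx = ToyTransaction()
    tx.write(0.2)    # made-up cost: delete the row from image
    tx.write(3.216)  # made-up cost: insert the large row into filearchive
    return tx


def interleaved_delete() -> ToyTransaction:
    """Hypothesis 2: fast writes, but slow work happens between them."""
    tx = ToyTransaction()
    tx.write(0.2)       # first write opens the transaction
    tx.other_work(5.0)  # something complicated before the remaining writes
    tx.write(0.3)
    return tx
```

In this model `large_blob_delete().commit()` raises either way, while `interleaved_delete()` only fails with `commit(count_other_work=True)`. That is why knowing what the timer actually measures, and whether the failure reproduces for this file, would help narrow down the cause.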