Page MenuHomePhabricator

DBTransactionSizeError: Transaction spent {time}s in writes, exceeding the 3s limit (via Action API globalpreferences module)
Closed, DeclinedPublicPRODUCTION ERROR

Description

List of steps to reproduce (step by step, including full links if applicable):
On betacommons:

action	globalpreferences
format	json
change	userjs-BawlTackOnEchoGlobal={"1656080881662_0":{"icon":"svgBawlBlackCopyIcon","user":"AJ","link":"","msg":"copy"}}
token	[redacted]

What happens?:

{"globalpreferences":"success"}

[YrXKTcDtZadeuZa146t7CgAAAA0] /w/api.php Wikimedia\Rdbms\DBTransactionSizeError: Transaction spent {time}s in writes, exceeding the 3s limit

Backtrace:

from /srv/mediawiki/php-master/includes/libs/rdbms/loadbalancer/LoadBalancer.php(1701)
#0 /srv/mediawiki/php-master/includes/libs/rdbms/lbfactory/LBFactory.php(320): Wikimedia\Rdbms\LoadBalancer->approvePrimaryChanges(array, string)
#1 /srv/mediawiki/php-master/includes/MediaWiki.php(679): Wikimedia\Rdbms\LBFactory->commitPrimaryChanges(string, array)
#2 /srv/mediawiki/php-master/includes/api/ApiMain.php(901): MediaWiki::preOutputCommit(DerivativeContext)
#3 /srv/mediawiki/php-master/includes/api/ApiMain.php(846): ApiMain->executeActionWithErrorHandling()
#4 /srv/mediawiki/php-master/api.php(90): ApiMain->execute()
#5 /srv/mediawiki/php-master/api.php(45): wfApiMain()
#6 /srv/mediawiki/w/api.php(3): require(string)
#7 {

What should have happened instead?:
Just success.

Event Timeline

Marostegui subscribed.

We (DBAs) cannot really do much about it, if the write takes longer than 3 seconds, MW aborts it. Maybe Platform Engineering can help.
All our masters look fine at the moment.

I should have checked if it really failed or not, but if it did, why did it say {"globalpreferences":"success"}?

I'm thinking maybe this is a kind of edit conflict, my script changes two different preferences at practically the same time. I wouldn't expect an issue since it's not the same preference, but maybe it's an issue anyway.

On top of that, beta cluster has its own issues and might be related to many of its problem. It can be still reproducible in production but I haven't seen that.

I'm thinking maybe this is a kind of edit conflict, my script changes two different preferences at practically the same time. I wouldn't expect an issue since it's not the same preference, but maybe it's an issue anyway.

Point 1:
The preferences are handled as one. To change one preference all preferences of the userid are read from the database, the change is applied to that data and the according changes on the database are done with DELETE + INSERT. It seems the global preferences is using REPLACE + needed DELETE, not sure if that makes different on locking.

Point 2:
According to the Etiquete https://www.mediawiki.org/wiki/API:Etiquette avoid parallel requests, just combine both in one query to improve performance

I'm thinking maybe this is a kind of edit conflict, my script changes two different preferences at practically the same time. I wouldn't expect an issue since it's not the same preference, but maybe it's an issue anyway.

Point 1:
The preferences are handled as one. To change one preference all preferences of the userid are read from the database, the change is applied to that data and the according changes on the database are done with DELETE + INSERT. It seems the global preferences is using REPLACE + needed DELETE, not sure if that makes different on locking.

Point 2:
According to the Etiquete https://www.mediawiki.org/wiki/API:Etiquette avoid parallel requests, just combine both in one query to improve performance

That's not always an option. (though I try to do that more now) If one preference needs to be updated and you know that MAYBE further down the code you'll change another, what would you do? You don't know yet if you'll be going down that path.

When the user marks individual notifications as read in quick succession within my script (and these notifications are stored as a preference), what am I supposed to do? Can't predict if the human is going to click another thing.

I'm now trying to queue these things which introduces its own problems. Edit: I think that's working now.

Besides, when the user enables/disables https://www.mediawiki.org/wiki/Extension:CodeEditor or https://www.mediawiki.org/wiki/Extension:CodeMirror a preference is updated beyond my control.

MusikAnimal subscribed.

Until we see an error in production, I think it's safe to assume this probably isn't an issue with GlobalPreferences, specifically. Please correct me and/or re-tag if anyone feels otherwise!

It seems I git this bug while editing my talk page on Commons:

[f44aead8-ae05-4f50-870d-6e9d8410df7e] 2024-05-06 19:14:19: Fatal exception of type "Wikimedia\Rdbms\DBTransactionSizeError"

And again

[73c6fdec-6349-48df-83f1-4f77fda80cc6] 2024-05-06 19:17:36: Fatal exception of type "Wikimedia\Rdbms\DBTransactionSizeError"

It seems I git this bug while editing my talk page on Commons:

[f44aead8-ae05-4f50-870d-6e9d8410df7e] 2024-05-06 19:14:19: Fatal exception of type "Wikimedia\Rdbms\DBTransactionSizeError"

I'm guessing that's the same error for a different problem.

Again

[60dc03b0-4dc4-44ad-9214-d133d974fe21] 2024-05-09 17:53:09: Fatal exception of type "Wikimedia\Rdbms\DBTransactionSizeError"
Umherirrender renamed this task from DBTransactionSizeError: Transaction spent {time}s in writes, exceeding the 3s limit to DBTransactionSizeError: Transaction spent {time}s in writes, exceeding the 3s limit (via Action API globalpreferences module).May 9 2024, 6:07 PM

Again

[60dc03b0-4dc4-44ad-9214-d133d974fe21] 2024-05-09 17:53:09: Fatal exception of type "Wikimedia\Rdbms\DBTransactionSizeError"

Please create a separate ticket, this is not related to the recently increased transaction time in s4.

Krinkle subscribed.

I'm unable to find cases of DBTransactionSizeError in the last 30 days on globalpreferences API. If you see it again, make sure to report the request ID (e.g. YrXKTcDtZadeuZa146t7CgAAAA0).

And to the person with production access that reads that report, make sure to search for it in Logstash and look for other messages from the same reqId:, in particular the rdbms warnings right before the DBTransactionSizeError which say what query this is about, including a useful stack trace. For example, something like:

Expectation (readQueryTime <= 5)
SELECT foo FROM Bar
…
from /srv/mediawiki/
#0 …
#1 …
Restricted Application changed the subtype of this task from "Bug Report" to "Production Error". · View Herald TranscriptMar 22 2025, 9:14 AM