
internal_api_error_DBQueryError on page edit; db1024
Closed, Resolved
Public

Description

My bot has been getting 'internal_api_error_DBQueryError' while trying to update a page on Finnish Wikipedia for several hours now. I tried dumping the text to a file and saving it myself, but I got the same error:

Function: WikiPage::updateRevisionOn
Error: 1205 Lock wait timeout exceeded; try restarting transaction (10.64.16.13)

This happens when trying to update [[Wikipedia:Viikon kilpailu/Viikon kilpailu 2014-35]]. Storing the same text to a sandbox page [[Käyttäjä:Danmichaelo/Sandbox]] works fine.
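
For anyone logging these failures from a bot, here is a minimal sketch of pulling the MySQL error number and the database host out of an error line shaped like the one above. The helper name `parse_db_error` and the `DbError` type are hypothetical, and the pattern assumes the exact "Error: <number> <message> (<host>)" layout quoted in this report; other layouts may not match.

```python
import re
from typing import NamedTuple, Optional


class DbError(NamedTuple):
    """Hypothetical container for the parts of a database error line."""
    code: int
    message: str
    host: Optional[str]


# Assumed layout: "Error: <number> <message> (<host>)"; the host part is optional.
_ERROR_LINE = re.compile(r"Error:\s*(\d+)\s+(.*?)(?:\s+\(([^()]+)\))?\s*$")


def parse_db_error(line: str) -> Optional[DbError]:
    """Return the error number, message and host, or None if the line does not match."""
    match = _ERROR_LINE.search(line)
    if match is None:
        return None
    return DbError(int(match.group(1)), match.group(2), match.group(3))


if __name__ == "__main__":
    line = ("Error: 1205 Lock wait timeout exceeded; "
            "try restarting transaction (10.64.16.13)")
    print(parse_db_error(line))
```

Recording the host alongside the error number makes it easier to tell later whether every failure came from the same database server.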


Version: wmf-deployment
Severity: normal
URL: https://fi.wikipedia.org/wiki/Wikipedia:Viikon_kilpailu/Viikon_kilpailu_2014-35

Details

Reference
bz70221

Event Timeline

bzimport raised the priority of this task to Medium. Nov 22 2014, 3:39 AM
bzimport set Reference to bz70221.
bzimport added a subscriber: Unknown Object (MLST).

It started working again at 03:13 UTC. I first got the error at 19:13 UTC, so the lock lasted for about 8 hours or more. Is it possible to configure such a lock, whatever caused it, to time out faster?

Do you know which specific servers were hit (like "mw1234")?

I only know the IP address I included in my first message, 10.64.16.13, which seems to be db1024.

Unsure whether to put this in API, DB, or Page editing territory, meh.

Looking at https://fi.wikipedia.org/w/index.php?title=Wikipedia:Viikon_kilpailu/Viikon_kilpailu_2014-35&action=history I assume this is about UKBot?

(In reply to Dan Michael Heggø from comment #1)

It started working again at 03:13 UTC.

Wondering if the diff size (+40499) comes into play here...

Also see bug 37519 comment 13.

It's about UKBot, yes. However, once the page was locked, no one else could edit it either. What puzzles me the most is that the lock lasted for so many hours.

(In reply to Andre Klapper from comment #4)

Unsure whether to put this in API, DB, or Page editing territory, meh.

Not API, the API just calls into the core page editing code. The error itself was coming from the database layer, although the base cause for the lock wait timeout might be elsewhere (e.g. something else holding a lock for far too long).
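
Since the error surfaces to API clients as the code `internal_api_error_DBQueryError`, a bot can at least back off and retry instead of hammering the page. The following is only a rough sketch: `save_with_backoff`, `EditFailed`, and the `save_page` callable are hypothetical names, and it assumes the callable returns the decoded JSON response with failures reported under an `error` key containing `code` and `info`. Retrying would not have helped with a lock held for hours, but a bounded retry gives a clear failure to report.

```python
import time
from typing import Any, Callable, Dict

# Error codes treated as transient in this sketch (an assumption, adjust as needed).
RETRYABLE_CODES = {"internal_api_error_DBQueryError"}


class EditFailed(Exception):
    """Raised when the edit fails permanently or the retries run out."""


def save_with_backoff(
    save_page: Callable[[], Dict[str, Any]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call save_page(), retrying with exponential backoff on transient DB errors."""
    for attempt in range(1, max_attempts + 1):
        response = save_page()
        error = response.get("error")
        if error is None:
            return response  # the edit went through
        code = error.get("code")
        if code not in RETRYABLE_CODES or attempt == max_attempts:
            raise EditFailed(f"{code}: {error.get('info', '')}")
        # Wait 1x, 2x, 4x, ... base_delay before the next attempt.
        sleep(base_delay * 2 ** (attempt - 1))
    raise EditFailed("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the function testable without real delays; a bot would leave it at the default.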

I removed "via bot" from the bug title since comment 0 implies that the same error was received when editing via the UI.

(In reply to Brad Jorsch from comment #6)

The error itself was coming from the database layer, although the base
cause for the lock wait timeout might be elsewhere (e.g. something else
holding a lock for far too long).

Thanks.

Might be hard to track down now that the problem is gone. :-/

Aklapper changed the task status from Open to Stalled. Dec 15 2014, 10:14 AM
Aklapper lowered the priority of this task from Medium to Low.
Aklapper subscribed.
jcrespo claimed this task.
jcrespo subscribed.

After one year stalled, I am closing this. The status says "Resolved", but I am using that only because the task is neither "Invalid" nor "Rejected"; it just stopped being a problem. There could be many reasons why this happened, and after one year it is impossible to debug, since code and servers may have changed several times. We can reopen it if it happens again.