
internal_api_error_DBQueryError when moving files
Closed, Resolved (Public)

Description

According to https://commons.wikimedia.org/wiki/MediaWiki_talk:Gadget-AjaxQuickDelete.js/auto-errors this happens from time to time. Details there. As I don't like how file moving is implemented at all (you move/rename?! - takes ages!), I don't think it's really important.


Version: unspecified
Severity: normal

Details

Reference
bz37519

Event Timeline

bzimport raised the priority of this task to Low. (Nov 22 2014, 12:29 AM)
bzimport set Reference to bz37519.
bzimport added a subscriber: Unknown Object (MLST).

I believe the API should return a human-readable details field (the "info" attribute/property) when that error occurs; however, the page you linked doesn't include said information. (Specifically, the API should say in that field what the DB query was that caused the error. [At least I'm 90% sure it does.]) Having that information makes it significantly easier to debug/fix the problem.

Any chance you could get what the "info" field was for the error reports in question?

https://commons.wikimedia.org/wiki/MediaWiki:Gadget-libAPI.js

return doErrCB("API request failed (" + result.error.code + "): " + result.error.info);

gives
API request failed (internal_api_error_DBQueryError): Database query error

You can give the timestamp the errors occurred on, and a sysadmin can check it against dberrors.log. It's probably a duplicate-key error.

(In reply to comment #3)
That's unfortunately not possible, yet. But it's a good idea. I will read the time from the response-header and add it to the error-report in future.

I can only say when the errors were reported (just picked some of them):

  • 20120613080117
  • 20120606173046
  • 20120520090120

(In reply to comment #3)
Thu, 14 Jun 2012 11:27:13 GMT
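
The idea discussed above, reading the time from the response header and adding it to the error report, could look roughly like the following. This is only a minimal sketch: `formatApiError` is a hypothetical helper name, not part of Gadget-libAPI.js, and it assumes the caller passes in the raw value of the HTTP `Date` response header (for example from jQuery's `jqXHR.getResponseHeader('Date')`).

```javascript
// Hypothetical helper (not part of Gadget-libAPI.js): builds the error text
// for an auto-report and appends the server time from the "Date" header.
function formatApiError(result, dateHeader) {
    var error = (result && result.error) || {};
    var message = 'API request failed (' + (error.code || 'unknown') + '): ' +
        (error.info || '');

    // The header may be missing or unparsable, so only append a valid date.
    var parsed = dateHeader ? new Date(dateHeader) : null;
    if (parsed && !isNaN(parsed.getTime())) {
        message += ' at ' + parsed.toISOString();
    }
    return message;
}

// Assumed usage with jQuery, where the third callback argument is the jqXHR:
//   $.ajax(options).done(function (result, textStatus, jqXHR) {
//       if (result.error) {
//           doErrCB(formatApiError(result, jqXHR.getResponseHeader('Date')));
//       }
//   });
```

With a timestamp in the report, a sysadmin has something concrete to match against dberrors.log.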

Database queries are hidden from all error messages, except if $wgShowSQLErrors is set to true. Are we requesting that on this bug? If so, the component should be filed against api, not file uploads.

Pardon, I "do not request" but I would be pleased if this could be investigated and eventually fixed, no matter what component this bug is about. If you know it's the wrong component, then just change it!

Not having a SWIFT media storage of a few TB and a database with >14.000.000 files at home, I am unable to troubleshoot that one myself.

The issue still persists:
https://commons.wikimedia.org/w/index.php?title=MediaWiki_talk:Gadget-AjaxQuickDelete.js/auto-errors#Autoreport_by_AjaxQuickDelete_803840706048

OK, if the issue is still happening, then it should be investigated, but the SQL details won't be made public through the API, on WMF wikis at least (that's the part that made me close this bug).

What needs to be done here is someone with access to WMF error logs to look at that particular error and give all the details. Then someone can investigate what's the source of the error. That's something most developers can't do, so I'm adding +ops for someone to look at it; otherwise, this bug can't be worked on.

Last reported error was on

  • Fri, 25 Apr 2014 09:41:57 GMT
  • served by mw1144
  • while moving [[:c:File:Фото0387.jpg]]

(In reply to Jesús Martínez Novo (Ciencia Al Poder) from comment #8)

What needs to be done here is someone with access to WMF error logs to look
at that particular error and give all the details. Then someone can
investigate what's the source of the error.

springle: Could you take a look at the dberrors log (and also if this is still happening after April 25)?

*** Bug 53768 has been marked as a duplicate of this bug. ***

See also bug 40927 comment 6. I'm unsure if this bug should be marked a dupe of that one.

The example error in comment 9 was:

Fri Apr 25 9:41:57 UTC 2014 mw1144 commonswiki SqlDataUpdate::invalidatePages 10.64.16.29 1213 Deadlock found when trying to get lock; try restarting transaction (10.64.16.29)

This does indeed still happen from time to time.

Deadlocks occur depending on a combination of transaction isolation level, the number and complexity of indexes affecting table access patterns, and engine-specific behavior such as InnoDB gap locking. Commonswiki is prone to deadlocks and to lock wait timeouts because of the long-running file-handling transactions that block each other.
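
Since the database message itself says "try restarting transaction", one conceivable client-side mitigation is for the gadget to retry the request after a short delay. The sketch below only illustrates that idea. `isDbQueryError`, `backoffDelay`, and `retryOnDbError` are hypothetical names, `doRequest` is assumed to be a function returning a Promise that resolves to the parsed API result, and whether repeating a file move is actually safe after this particular failure is not established in this thread.

```javascript
// Hypothetical: true if the API result carries the error code from this bug.
function isDbQueryError(result) {
    return Boolean(result && result.error &&
        result.error.code === 'internal_api_error_DBQueryError');
}

// Hypothetical: exponential backoff, e.g. 100, 200, 400 ms for attempts 1, 2, 3.
function backoffDelay(attempt, baseDelayMs) {
    return baseDelayMs * Math.pow(2, attempt - 1);
}

// Hypothetical: calls doRequest() up to maxAttempts times, waiting between
// attempts, and resolves with the last result (successful or not).
// Assumption: doRequest returns a Promise of the parsed API response.
function retryOnDbError(doRequest, maxAttempts, baseDelayMs) {
    var attempt = 0;

    function run() {
        attempt += 1;
        return doRequest().then(function (result) {
            if (isDbQueryError(result) && attempt < maxAttempts) {
                return new Promise(function (resolve) {
                    setTimeout(resolve, backoffDelay(attempt, baseDelayMs));
                }).then(run);
            }
            return result;
        });
    }

    return run();
}
```

A retry like this would only hide occasional deadlocks from the user; it does not reduce the lock contention on the server side.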

Change 135601 had a related patch set uploaded by Aaron Schulz:
Reduce Title::invalidateCache contention a bit

https://gerrit.wikimedia.org/r/135601

Change 135601 merged by Springle:
Reduce Title::invalidateCache contention a bit

https://gerrit.wikimedia.org/r/135601